Science.gov

Sample records for rapid virulence annotation

  1. Rapid Virulence Annotation (RVA): identification of virulence factors using a bacterial genome library and multiple invertebrate hosts.

    PubMed

    Waterfield, Nicholas R; Sanchez-Contreras, Maria; Eleftherianos, Ioannis; Dowling, Andrea; Yang, Guowei; Wilkinson, Paul; Parkhill, Julian; Thomson, Nicholas; Reynolds, Stuart E; Bode, Helge B; Dorus, Steven; Ffrench-Constant, Richard H

    2008-10-14

    Current sequence databases now contain numerous whole genome sequences of pathogenic bacteria. However, many of the predicted genes lack any functional annotation. We describe an assumption-free approach, Rapid Virulence Annotation (RVA), for the high-throughput parallel screening of genomic libraries against four different taxa: insects, nematodes, amoeba, and mammalian macrophages. These hosts represent different aspects of both the vertebrate and invertebrate immune system. Here, we apply RVA to the emerging human pathogen Photorhabdus asymbiotica using "gain of toxicity" assays of recombinant Escherichia coli clones. We describe a wealth of potential virulence loci and attribute biological function to several putative genomic islands, which may then be further characterized using conventional molecular techniques. The application of RVA to other pathogen genomes promises to ascribe biological function to otherwise uncharacterized virulence genes.

  2. RATT: Rapid Annotation Transfer Tool

    PubMed Central

    Otto, Thomas D.; Dillon, Gary P.; Degrave, Wim S.; Berriman, Matthew

    2011-01-01

    Second-generation sequencing technologies have made large-scale sequencing projects commonplace. However, making use of these datasets often requires gene function to be ascribed genome wide. Although tool development has kept pace with the changes in sequence production, for tasks such as mapping, de novo assembly or visualization, genome annotation remains a challenge. We have developed a method to rapidly provide accurate annotation for new genomes using previously annotated genomes as a reference. The method, implemented in a tool called RATT (Rapid Annotation Transfer Tool), transfers annotations from a high-quality reference to a new genome on the basis of conserved synteny. We demonstrate that a Mycobacterium tuberculosis genome or a single 2.5 Mb chromosome from a malaria parasite can be annotated in less than five minutes with only modest computational resources. RATT is available at http://ratt.sourceforge.net. PMID:21306991

  3. The RAST Server: Rapid Annotations using Subsystems Technology

    PubMed Central

    Aziz, Ramy K; Bartels, Daniela; Best, Aaron A; DeJongh, Matthew; Disz, Terrence; Edwards, Robert A; Formsma, Kevin; Gerdes, Svetlana; Glass, Elizabeth M; Kubal, Michael; Meyer, Folker; Olsen, Gary J; Olson, Robert; Osterman, Andrei L; Overbeek, Ross A; McNeil, Leslie K; Paarmann, Daniel; Paczian, Tobias; Parrello, Bruce; Pusch, Gordon D; Reich, Claudia; Stevens, Rick; Vassieva, Olga; Vonstein, Veronika; Wilke, Andreas; Zagnitko, Olga

    2008-01-01

    Background: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. Description: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12–24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. Conclusion: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes. PMID:18261238

  4. The RAST server: rapid annotations using subsystems technology.

    SciTech Connect

    Aziz, R. K.; Bartels, D.; Best, A. A.; DeJongh, M.; Disz, T.; Edwards, R. A.; Formsma, K.; Gerdes, S.; Glass, E. M.; Kubal, M.; Meyer, F.; Olsen, G. J.; Olson, R.; Osterman, A. L.; Overbeek, R. A.; McNeil, L. K.; Paarmann, D.; Paczian, T.; Parrello, B.; Pusch, G. D.; Reich, C.; Stevens, R.; Vassieva, O.; Vonstein, V.; Wilke, A.; Zagnitko, O.; Mathematics and Computer Science; Fellowship for Interpretation of Genomes; Univ. of Chicago; Univ. of Illinois; The Burnham Inst.; Hope Coll.; Univ. of Tenn.; Cairo Univ.

    2008-02-08

    The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12-24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.

  5. Tool for rapid annotation of microbial SNPs (TRAMS): a simple program for rapid annotation of genomic variation in prokaryotes.

    PubMed

    Reumerman, Richard A; Tucker, Nicholas P; Herron, Paul R; Hoskisson, Paul A; Sangal, Vartul

    2013-09-01

    Next generation sequencing (NGS) has been widely used to study genomic variation in a variety of prokaryotes. Single nucleotide polymorphisms (SNPs) resulting from genomic comparisons need to be annotated for their functional impact on the coding sequences. We have developed a program, TRAMS, for functional annotation of genomic SNPs, which is available to download as a single-file executable for Windows users with limited computational experience and as a Python script for Mac OS and Linux users. TRAMS needs a tab-delimited text file containing SNP locations, reference nucleotides and SNPs in variant strains, along with a reference genome sequence in GenBank or EMBL format. SNPs are annotated as synonymous, nonsynonymous or nonsense. Nonsynonymous SNPs in start and stop codons are separated as non-start and non-stop SNPs, respectively. SNPs in multiple overlapping features are annotated separately for each feature, and multiple nucleotide polymorphisms within a codon are combined before annotation. We have also developed a workflow for Galaxy, a widely used tool for analysing NGS data, to map short reads to a reference genome and extract and annotate the SNPs. TRAMS is a simple program for rapid and accurate annotation of SNPs that will be very useful for microbiologists in analysing genomic diversity in microbial populations.
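
    The SNP categories named in this abstract (synonymous, nonsynonymous, nonsense, non-start, non-stop) can be illustrated with a minimal sketch. This is not the TRAMS code: the function name `classify_snp`, the `is_start_codon` flag, and the use of the standard amino acid assignments are assumptions made for illustration only.

```python
from itertools import product

# Standard amino acid assignments, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, alt_codon, is_start_codon=False):
    """Classify a codon change (hypothetical helper, not the TRAMS API)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if is_start_codon:
        # Assumption: any amino acid change in the start codon is "non-start".
        return "non-start"
    if ref_aa == "*":
        # The reference stop codon now encodes an amino acid.
        return "non-stop"
    if alt_aa == "*":
        # A sense codon became a stop codon.
        return "nonsense"
    return "nonsynonymous"


if __name__ == "__main__":
    for ref, alt in [("GAA", "GAG"), ("GAA", "GAT"), ("GAA", "TAA"), ("TAA", "CAA")]:
        print(ref, "->", alt, classify_snp(ref, alt))
```

    Because the function takes whole codons, several nucleotide changes within one codon are classified together once they have been merged into a single alternate codon, which mirrors the combining step the abstract describes.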

  6. Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome.

    PubMed

    Tcherepanov, Vasily; Ehlers, Angelika; Upton, Chris

    2006-06-13

    Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. However, many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. While maintaining comprehensive genomic databases for a set of virus families at the Viral Bioinformatics Resource Center http://www.biovirus.org and Viral Bioinformatics - Canada http://www.virology.ca, we found that researchers were unnecessarily spending time annotating viral genomes that were close relatives of already annotated viruses. We have therefore designed and implemented a novel tool, Genome Annotation Transfer Utility (GATU), to transfer annotations from a previously annotated reference genome to a new target genome, thereby greatly reducing this laborious task. GATU transfers annotations from a reference genome to a closely related target genome, while still giving the user final control over which annotations should be included. GATU also detects open reading frames present in the target but not the reference genome and provides the user with a variety of bioinformatics tools to quickly determine if these ORFs should also be included in the annotation. After this process is complete, GATU saves the newly annotated genome as a GenBank, EMBL or XML-format file. The software is coded in Java and runs on a variety of computer platforms. Its user-friendly Graphical User Interface is specifically designed for users trained in the biological sciences. GATU greatly simplifies the initial stages of genome annotation by using a closely related genome as a reference. It is not intended to be a gene prediction tool or a "complete" annotation system, but we have found that it significantly reduces the time required for annotation of genes and mature peptides as well as helping to standardize gene names between related organisms by transferring reference genome annotations to the target genome. The program is freely

  7. Approaching the Functional Annotation of Fungal Virulence Factors Using Cross-Species Genetic Interaction Profiling

    PubMed Central

    Brown, Jessica C. S.; Madhani, Hiten D.

    2012-01-01

    In many human fungal pathogens, genes required for disease remain largely unannotated, limiting the impact of virulence gene discovery efforts. We tested the utility of a cross-species genetic interaction profiling approach to obtain clues to the molecular function of unannotated pathogenicity factors in the human pathogen Cryptococcus neoformans. This approach involves expression of C. neoformans genes of interest in each member of the Saccharomyces cerevisiae gene deletion library, quantification of their impact on growth, and calculation of the cross-species genetic interaction profiles. To develop functional predictions, we computed and analyzed the correlations of these profiles with existing genetic interaction profiles of S. cerevisiae deletion mutants. For C. neoformans LIV7, which has no S. cerevisiae ortholog, this profiling approach predicted an unanticipated role in the Golgi apparatus. Validation studies in C. neoformans demonstrated that Liv7 is a functional Golgi factor where it promotes the suppression of the exposure of a specific immunostimulatory molecule, mannose, on the cell surface, thereby inhibiting phagocytosis. The genetic interaction profile of another pathogenicity gene that lacks an S. cerevisiae ortholog, LIV6, strongly predicted a role in endosome function. This prediction was also supported by studies of the corresponding C. neoformans null mutant. Our results demonstrate the utility of quantitative cross-species genetic interaction profiling for the functional annotation of fungal pathogenicity proteins of unknown function including, surprisingly, those that are not conserved in sequence across fungi. PMID:23300468
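
    The profile-comparison step described in this abstract can be sketched as ranking reference profiles by their correlation with a query profile. The abstract does not say which correlation measure was used, so Pearson correlation is an assumption here, and the gene names and numbers are made-up placeholders rather than data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_by_correlation(query, references):
    """Return (name, r) pairs sorted from most to least correlated."""
    scored = [(name, pearson(query, profile)) for name, profile in references.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical growth-effect scores across four deletion strains.
    query_profile = [1.0, 2.0, 3.0, 4.0]
    reference_profiles = {
        "geneA": [2.0, 4.0, 6.0, 8.0],
        "geneB": [4.0, 3.0, 2.0, 1.0],
        "geneC": [1.0, 3.0, 2.0, 4.0],
    }
    for name, r in rank_by_correlation(query_profile, reference_profiles):
        print(f"{name}: r = {r:.2f}")
```

    Reference mutants whose profiles correlate most strongly with the query would be the ones to inspect for a shared functional role.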

  8. Approaching the functional annotation of fungal virulence factors using cross-species genetic interaction profiling.

    PubMed

    Brown, Jessica C S; Madhani, Hiten D

    2012-01-01

    In many human fungal pathogens, genes required for disease remain largely unannotated, limiting the impact of virulence gene discovery efforts. We tested the utility of a cross-species genetic interaction profiling approach to obtain clues to the molecular function of unannotated pathogenicity factors in the human pathogen Cryptococcus neoformans. This approach involves expression of C. neoformans genes of interest in each member of the Saccharomyces cerevisiae gene deletion library, quantification of their impact on growth, and calculation of the cross-species genetic interaction profiles. To develop functional predictions, we computed and analyzed the correlations of these profiles with existing genetic interaction profiles of S. cerevisiae deletion mutants. For C. neoformans LIV7, which has no S. cerevisiae ortholog, this profiling approach predicted an unanticipated role in the Golgi apparatus. Validation studies in C. neoformans demonstrated that Liv7 is a functional Golgi factor where it promotes the suppression of the exposure of a specific immunostimulatory molecule, mannose, on the cell surface, thereby inhibiting phagocytosis. The genetic interaction profile of another pathogenicity gene that lacks an S. cerevisiae ortholog, LIV6, strongly predicted a role in endosome function. This prediction was also supported by studies of the corresponding C. neoformans null mutant. Our results demonstrate the utility of quantitative cross-species genetic interaction profiling for the functional annotation of fungal pathogenicity proteins of unknown function including, surprisingly, those that are not conserved in sequence across fungi.

  9. Bioinformatics annotation of the hypothetical proteins found by omics techniques can help to disclose additional virulence factors.

    PubMed

    Hernández, Sergio; Gómez, Antonio; Cedano, Juan; Querol, Enrique

    2009-10-01

    The advent of genomics should have facilitated the identification of microbial virulence factors, a key objective for vaccine design. When a bacterial pathogen infects the host, it expresses a set of genes, a number of which encode virulence factors. Among the genes identified by techniques such as microarrays, in vivo expression technology, signature-tagged mutagenesis and differential fluorescence induction, there are many related to cellular stress, basal metabolism, etc., which cannot be directly involved in virulence, or at least cannot be considered useful candidates to be deleted for designing a live attenuated vaccine. Among the genes disclosed by these methodologies there are a number of hypothetical or unknown proteins. As they can hide some true virulence factors, we have reannotated all of these hypothetical proteins from several respiratory pathogens by a careful and in-depth analysis of each one. Although some of the re-annotations match with functions that can be related to microbial virulence, the identification of virulence factors remains difficult.

  10. Rapid Annotation of Interictal Epileptiform Discharges via Template Matching under Dynamic Time Warping

    PubMed Central

    Dauwels, J.; Rakthanmanon, T.; Keogh, E.; Cash, S.S.; Westover, M.B.

    2017-01-01

    Background: EEG interpretation relies on experts who are in short supply. There is a great need for automated pattern recognition systems to assist with interpretation. However, attempts to develop such systems have been limited by insufficient expert-annotated data. To address these issues, we developed a system named NeuroBrowser for EEG review and rapid waveform annotation. New Methods: At the core of NeuroBrowser lies ultrafast template matching under Dynamic Time Warping, which substantially accelerates the task of annotation. Results: Our results demonstrate that NeuroBrowser can reduce the time required for annotation of interictal epileptiform discharges by EEG experts by 20–90%, with an average of approximately 70%. Comparison with Existing Method(s): In comparison with conventional manual EEG annotation, NeuroBrowser is able to save EEG experts approximately 70% on average of the time spent in annotating interictal epileptiform discharges. We have already extracted 19,000+ interictal epileptiform discharges from 100 patient EEG recordings. To our knowledge this represents the largest annotated database of interictal epileptiform discharges in existence. Conclusion: NeuroBrowser is an integrated system for rapid waveform annotation. While the algorithm is currently tailored to annotation of interictal epileptiform discharges in scalp EEG recordings, the concepts can be easily generalized to other waveforms and signal types. PMID:26944098
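
    Template matching under Dynamic Time Warping, the idea named in this abstract, can be sketched with the textbook quadratic-time DTW recurrence and a sliding window. This is not the NeuroBrowser implementation, whose "ultrafast" search is not described here; the function names, the absolute-difference cost, and the toy signal are assumptions for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance with absolute-difference cost."""
    m = len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, len(a) + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, diagonal step.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def best_match(signal, template):
    """Slide a template-length window over the signal; return (start, distance)."""
    width = len(template)
    scores = [
        (dtw_distance(signal[start:start + width], template), start)
        for start in range(len(signal) - width + 1)
    ]
    distance, start = min(scores)
    return start, distance


if __name__ == "__main__":
    toy_signal = [0, 0, 0, 1, 3, 1, 0, 0]   # made-up samples, not EEG data
    spike_template = [0, 1, 3, 1, 0]
    print(best_match(toy_signal, spike_template))
```

    Because DTW tolerates local stretching, a waveform that is slightly wider or narrower than the template can still score a small distance, which is what makes it attractive for matching discharges of varying duration.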

  11. Systematic annotation and analysis of “virmugens” - virulence factors whose mutants can be used as live attenuated vaccines

    PubMed Central

    Racz, Rebecca; Chung, Monica; Xiang, Zuoshuang; He, Yongqun

    2012-01-01

    Live attenuated vaccines are usually generated by mutation of genes encoding virulence factors. “Virmugen” is coined here to represent a gene that encodes a virulence factor of a pathogen and has been proven feasible in animal models to make a live attenuated vaccine by knocking out this gene. Not all virulence factors are virmugens. VirmugenDB is a web-based virmugen database (http://www.violinet.org/virmugendb). Currently, VirmugenDB includes 225 virmugens that have been verified to be valuable for vaccine development against 57 bacterial, viral, and protozoan pathogens. Bioinformatics analysis has revealed significant patterns in virmugens. For example, 10 Gram-negative and one Gram-positive bacterial aroA genes are virmugens. A sequence analysis has revealed at least 50% identity among the protein sequences of the 10 Gram-negative bacterial aroA virmugens. As a pathogen case study, Brucella virmugens were analyzed. Out of 15 verified Brucella virmugens, six are related to carbohydrate or nucleotide transport and metabolism, and two are involved in cell membrane biogenesis. In addition, 54 virmugens from 24 viruses and 12 virmugens from 4 parasites are also stored in VirmugenDB. Virmugens tend to involve metabolism of nutrients (e.g., amino acids, carbohydrates, and nucleotides) and cell membrane formation. Host genes whose expression was regulated by virmugen mutation vaccines or wild type virulent pathogens have also been annotated and systematically compared. The bioinformatics annotation and analysis of virmugens helps elucidate enriched virmugen profiles and the mechanisms of protective immunity, and further supports rational vaccine design. PMID:23219434

  12. SpikeGUI: Software for Rapid Interictal Discharge Annotation via Template Matching and Online Machine Learning

    PubMed Central

    Jin, Jing; Dauwels, Justin; Cash, Sydney; Westover, M. Brandon

    2015-01-01

    Detection of interictal discharges is a key element of interpreting EEGs during the diagnosis and management of epilepsy. Because interpretation of clinical EEG data is time-intensive and reliant on experts who are in short supply, there is a great need for automated spike detectors. However, attempts to develop general-purpose spike detectors have so far been severely limited by a lack of expert-annotated data. Huge databases of interictal discharges are therefore in great demand for the development of general-purpose detectors. Detailed manual annotation of interictal discharges is time consuming, which severely limits the willingness of experts to participate. To address such problems, a graphical user interface “SpikeGUI” was developed in our work for the purposes of EEG viewing and rapid interictal discharge annotation. “SpikeGUI” substantially speeds up the task of annotating interictal discharges using a custom-built algorithm based on a combination of template matching and online machine learning techniques. While the algorithm is currently tailored to annotation of interictal epileptiform discharges, it can easily be generalized to other waveforms and signal types. PMID:25570976

  13. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    PubMed

    Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G

    2013-01-01

    The power of genome sequencing depends on the ability to understand what the sequenced genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology: "orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  14. Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

    PubMed Central

    Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.

    2013-01-01

    The power of genome sequencing depends on the ability to understand what the sequenced genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392

  15. Screen of Non-annotated Small Secreted Proteins of Pseudomonas syringae Reveals a Virulence Factor That Inhibits Tomato Immune Proteases

    PubMed Central

    Shindo, Takayuki; Kaschani, Farnusch; Kovács, Judit; Tian, Fang; Kourelis, Jiorgos; Hong, Tram Ngoc; Colby, Tom; Shabab, Mohammed; Chawla, Rohini; Kumari, Selva; Ilyas, Muhammad; Hörger, Anja C.; Alfano, James R.; van der Hoorn, Renier A. L.

    2016-01-01

    Pseudomonas syringae pv. tomato DC3000 (PtoDC3000) is an extracellular model plant pathogen, yet its potential to produce secreted effectors that manipulate the apoplast has been underinvestigated. Here we identified 131 candidate small, secreted, non-annotated proteins from the PtoDC3000 genome, most of which are common to Pseudomonas species and potentially expressed during apoplastic colonization. We produced 43 of these proteins through a custom-made gateway-compatible expression system for extracellular bacterial proteins, and screened them for their ability to inhibit the secreted immune protease C14 of tomato using competitive activity-based protein profiling. This screen revealed C14-inhibiting protein-1 (Cip1), which contains motifs of the chagasin-like protease inhibitors. Cip1 mutants are less virulent on tomato, demonstrating the importance of this effector in apoplastic immunity. Cip1 also inhibits immune protease Pip1, which is known to suppress PtoDC3000 infection, but has a lower affinity for its close homolog Rcr3, explaining why this protein is not recognized in tomato plants carrying the Cf-2 resistance gene, which uses Rcr3 as a co-receptor to detect pathogen-derived protease inhibitors. Thus, this approach uncovered a protease inhibitor of P. syringae, indicating that P. syringae also secretes effectors that selectively target apoplastic host proteases of tomato, similar to tomato pathogenic fungi, oomycetes and nematodes. PMID:27603016

  16. Serial infection of diverse host (Mus) genotypes rapidly impedes pathogen fitness and virulence

    PubMed Central

    Kubinak, Jason L.; Cornwall, Douglas H.; Hasenkrug, Kim J.; Adler, Frederick R.; Potts, Wayne K.

    2015-01-01

    Reduced genetic variation among hosts may favour the emergence of virulent infectious diseases by enhancing pathogen replication and its associated virulence due to adaptation to a limited set of host genotypes. Here, we test this hypothesis using experimental evolution of a mouse-specific retroviral pathogen, Friend virus (FV) complex. We demonstrate rapid fitness (i.e. viral titre) and virulence increases when FV complex serially infects a series of inbred mice representing the same genotype, but not when infecting a diverse array of inbred mouse strains modelling the diversity in natural host populations. Additionally, a single infection of a different host genotype was sufficient to constrain the emergence of a high fitness/high virulence FV complex phenotype in these experiments. The potent inhibition of viral fitness and virulence was associated with an observed loss of the defective retroviral genome (spleen focus-forming virus), whose presence exacerbates infection and drives disease in susceptible mice. Results from our experiments provide an important first step in understanding how genetic variation among vertebrate hosts influences pathogen evolution and suggest that serial exposure to different genotypes within a single host species may act as a constraint on pathogen adaptation that prohibits the emergence of more virulent infections. From a practical perspective, these results have implications for low-diversity host populations such as endangered species and domestic animals. PMID:25392466

  17. Rapid Bacterial Identification, Resistance, Virulence and Type Profiling using Selected Reaction Monitoring Mass Spectrometry

    PubMed Central

    Charretier, Yannick; Dauwalder, Olivier; Franceschi, Christine; Degout-Charmette, Elodie; Zambardi, Gilles; Cecchini, Tiphaine; Bardet, Chloe; Lacoux, Xavier; Dufour, Philippe; Veron, Laurent; Rostaing, Hervé; Lanet, Veronique; Fortin, Tanguy; Beaulieu, Corinne; Perrot, Nadine; Dechaume, Dominique; Pons, Sylvie; Girard, Victoria; Salvador, Arnaud; Durand, Géraldine; Mallard, Frédéric; Theretz, Alain; Broyer, Patrick; Chatellier, Sonia; Gervasi, Gaspard; Van Nuenen, Marc; Ann Roitsch, Carolyn; Van Belkum, Alex; Lemoine, Jérôme; Vandenesch, François; Charrier, Jean-Philippe

    2015-01-01

    Mass spectrometry (MS) in Selected Reaction Monitoring (SRM) mode is proposed for in-depth characterisation of microorganisms in a multiplexed analysis. Within 60–80 minutes, the SRM method performs microbial identification (I), antibiotic-resistance detection (R), virulence assessment (V) and it provides epidemiological typing information (T). This SRM application is illustrated by the analysis of the human pathogen Staphylococcus aureus, demonstrating its promise for rapid characterisation of bacteria from positive blood cultures of sepsis patients. PMID:26350205

  18. NDER: A novel web application using annotated whole slide images for rapid improvements in human pattern recognition.

    PubMed

    Reder, Nicholas P; Glasser, Daniel; Dintzis, Suzanne M; Rendi, Mara H; Garcia, Rochelle L; Henriksen, Jonathan C; Kilgore, Mark R

    2016-01-01

    Context: Whole-slide images (WSIs) present a rich source of information for education, training, and quality assurance. However, they are often used in a fashion similar to glass slides rather than in novel ways that leverage the advantages of WSI. We have created a pipeline to transform annotated WSI into a pattern recognition training and quality assurance web application called novel diagnostic electronic resource (NDER). Aims: Create an efficient workflow for extracting annotated WSI for use by NDER, an attractive web application that provides high-throughput training. Materials and Methods: WSI were annotated by a resident and classified into five categories. Two methods of extracting images and creating image databases were compared. Extraction Method 1: Manual extraction of still images and validation of each image by four breast pathologists. Extraction Method 2: Validation of annotated regions on the WSI by a single experienced breast pathologist and automated extraction of still images tagged by diagnosis. The extracted still images were used by NDER. NDER briefly displays an image, requires users to classify the image after time has expired, then gives users immediate feedback. Results: The NDER workflow is efficient: annotation of a WSI requires 5 min and validation by an expert pathologist requires an additional 1 to 2 min. The pipeline is highly automated, with only annotation and validation requiring human input. NDER effectively displays hundreds of high-quality, high-resolution images and provides immediate feedback to users during a 30 min session. Conclusions: NDER efficiently uses annotated WSI to rapidly increase pattern recognition and evaluate for diagnostic proficiency.

  19. Evaluation of vascular clearance as a marker for virulence of alphaviruses: disassociation of rapid clearance with low virulence of Venezuelan encephalitis virus strains in guinea pigs.

    PubMed Central

    Jahrling, P B; Heisey, G B; Hesse, R A

    1977-01-01

    The concept that relates low virulence of certain alphaviruses to low viremia and efficient vascular clearance of virus was tested in guinea pigs. Previously published studies with hamsters suggested that virulent strains maintain high viremias primarily because they are cleared inefficiently from the blood. In the present study, with guinea pigs, six of six virulent strains of Venezuelan encephalitis virus were cleared inefficiently, whereas three of six nonlethal or benign virus strains were cleared rapidly. However, three other guinea pig-benign Venezuelan encephalitis virus strains were cleared slowly; their failure to produce a high viremia was correlated with inefficient growth in primary viral replication sites. Thus, the potential of some alphaviruses to produce destructive lesions may be restricted by efficient clearance of virus from the blood, whereas the growth of other benign alphavirus strains may be restricted after the virus is presented to target cells. PMID:892910

  20. NDER: A novel web application using annotated whole slide images for rapid improvements in human pattern recognition

    PubMed Central

    Reder, Nicholas P.; Glasser, Daniel; Dintzis, Suzanne M.; Rendi, Mara H.; Garcia, Rochelle L.; Henriksen, Jonathan C.; Kilgore, Mark R.

    2016-01-01

    Context: Whole-slide images (WSIs) present a rich source of information for education, training, and quality assurance. However, they are often used in a fashion similar to glass slides rather than in novel ways that leverage the advantages of WSI. We have created a pipeline to transform annotated WSI into a pattern recognition training and quality assurance web application called novel diagnostic electronic resource (NDER). Aims: Create an efficient workflow for extracting annotated WSI for use by NDER, an attractive web application that provides high-throughput training. Materials and Methods: WSI were annotated by a resident and classified into five categories. Two methods of extracting images and creating image databases were compared. Extraction Method 1: Manual extraction of still images and validation of each image by four breast pathologists. Extraction Method 2: Validation of annotated regions on the WSI by a single experienced breast pathologist and automated extraction of still images tagged by diagnosis. The extracted still images were used by NDER. NDER briefly displays an image, requires users to classify the image after time has expired, then gives users immediate feedback. Results: The NDER workflow is efficient: annotation of a WSI requires 5 min and validation by an expert pathologist requires an additional 1 to 2 min. The pipeline is highly automated, with only annotation and validation requiring human input. NDER effectively displays hundreds of high-quality, high-resolution images and provides immediate feedback to users during a 30 min session. Conclusions: NDER efficiently uses annotated WSI to rapidly increase pattern recognition and evaluate for diagnostic proficiency. PMID:27563490

  1. STRAP PTM: Software Tool for Rapid Annotation and Differential Comparison of Protein Post-Translational Modifications

    PubMed Central

    Spencer, Jean L.; Bhatia, Vivek N.; Whelan, Stephen A.; Costello, Catherine E.

    2014-01-01

    The identification of protein post-translational modifications (PTMs) is an increasingly important component of proteomics and biomarker discovery, but very few tools exist for performing fast and easy characterization of global PTM changes and differential comparison of PTMs across groups of data obtained from liquid chromatography-tandem mass spectrometry experiments. STRAP PTM (Software Tool for Rapid Annotation of Proteins: Post-Translational Modification edition) is a program that was developed to facilitate the characterization of PTMs using spectral counting and a novel scoring algorithm to accelerate the identification of differential PTMs from complex data sets. The software facilitates multi-sample comparison by collating, scoring, and ranking PTMs and by summarizing data visually. The freely available software (beta release) installs on a PC and processes data in protXML format obtained from files parsed through the Trans-Proteomic Pipeline. The easy-to-use interface allows examination of results at protein, peptide, and PTM levels, and the overall design offers tremendous flexibility that provides proteomics insight beyond simple assignment and counting. PMID:25422678

  2. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

    PubMed Central

    Overbeek, Ross; Olson, Robert; Pusch, Gordon D.; Olsen, Gary J.; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Parrello, Bruce; Shukla, Maulik; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang; Stevens, Rick

    2014-01-01

    In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources. PMID:24293654

  3. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

    PubMed

    Overbeek, Ross; Olson, Robert; Pusch, Gordon D; Olsen, Gary J; Davis, James J; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Parrello, Bruce; Shukla, Maulik; Vonstein, Veronika; Wattam, Alice R; Xia, Fangfang; Stevens, Rick

    2014-01-01

    In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

  4. Rapid-Viability PCR Method for Detection of Live, Virulent Bacillus anthracis in Environmental Samples

    PubMed Central

    Létant, Sonia E.; Murphy, Gloria A.; Alfaro, Teneile M.; Avila, Julie R.; Kane, Staci R.; Raber, Ellen; Bunt, Thomas M.; Shah, Sanjiv R.

    2011-01-01

    In the event of a biothreat agent release, hundreds of samples would need to be rapidly processed to characterize the extent of contamination and determine the efficacy of remediation activities. Current biological agent identification and viability determination methods are both labor- and time-intensive such that turnaround time for confirmed results is typically several days. In order to alleviate this issue, automated, high-throughput sample processing methods were developed in which real-time PCR analysis is conducted on samples before and after incubation. The method, referred to as rapid-viability (RV)-PCR, uses the change in cycle threshold after incubation to detect the presence of live organisms. In this article, we report a novel RV-PCR method for detection of live, virulent Bacillus anthracis, in which the incubation time was reduced from 14 h to 9 h, bringing the total turnaround time for results below 15 h. The method incorporates a magnetic bead-based DNA extraction and purification step prior to PCR analysis, as well as specific real-time PCR assays for the B. anthracis chromosome and pXO1 and pXO2 plasmids. A single laboratory verification of the optimized method applied to the detection of virulent B. anthracis in environmental samples was conducted and showed a detection level of 10 to 99 CFU/sample with both manual and automated RV-PCR methods in the presence of various challenges. Experiments exploring the relationship between the incubation time and the limit of detection suggest that the method could be further shortened by an additional 2 to 3 h for relatively clean samples. PMID:21764960
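
    As an illustration of the RV-PCR readout described in this abstract, the change in cycle threshold can be taken as the Ct measured before incubation minus the Ct measured after incubation. Growth of live organisms lowers the Ct, so a large positive difference points to viable cells. The following is a minimal Python sketch, not the authors' implementation: the function names, the example Ct values, and the decision cutoff `min_delta_ct` are all hypothetical, and the abstract does not state a cutoff.

```python
def delta_ct(ct_before: float, ct_after: float) -> float:
    """Change in cycle threshold: Ct before incubation minus Ct after incubation."""
    return ct_before - ct_after


def indicates_growth(ct_before: float, ct_after: float, min_delta_ct: float) -> bool:
    """Return True when the Ct drop meets a caller-supplied cutoff.

    min_delta_ct is a hypothetical parameter; the cutoff must come from
    the laboratory's own validated protocol, not from this sketch.
    """
    return delta_ct(ct_before, ct_after) >= min_delta_ct


if __name__ == "__main__":
    # Hypothetical Ct values for one sample, before and after incubation.
    print(delta_ct(38.0, 25.0))
    print(indicates_growth(38.0, 25.0, min_delta_ct=6.0))
```

    Keeping the cutoff as a required argument, rather than a default, avoids suggesting a value that the abstract does not report.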

  5. Comparative omics-driven genome annotation refinement: application across Yersiniae.

    PubMed

    Schrimpe-Rutledge, Alexandra C; Jones, Marcus B; Chauhan, Sadhana; Purvine, Samuel O; Sanford, James A; Monroe, Matthew E; Brewer, Heather M; Payne, Samuel H; Ansong, Charles; Frank, Bryan C; Smith, Richard D; Peterson, Scott N; Motin, Vladimir L; Adkins, Joshua N

    2012-01-01

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. The annotation process is now performed almost exclusively in an automated fashion to balance the large number of sequences generated. One possible way of reducing errors inherent to automated computational annotations is to apply data from omics measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. Here, the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species. Transcriptomic and proteomic data derived from highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis Pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 incorrect (i.e., observed frameshifts, extended start sites, and translated pseudogenes) protein-coding sequences within the three current genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes, including the insertion-ablated argD, underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, a transcriptional regulator, and many hypothetical proteins that were missed during annotation.

  6. Rapid evolution of virulence leading to host extinction under host-parasite coevolution.

    PubMed

    Rafaluk, Charlotte; Gildenhard, Markus; Mitschke, Andreas; Telschow, Arndt; Schulenburg, Hinrich; Joop, Gerrit

    2015-06-13

    Host-parasite coevolution is predicted to result in changes in the virulence of the parasite in order to maximise its reproductive success and transmission potential, either via direct host-to-host transfer or through the environment. The majority of coevolution experiments, however, do not allow for environmental transmission or persistence of long lived parasite stages, in spite of the fact that these may be critical for the evolutionary success of spore forming parasites under natural conditions. We carried out a coevolution experiment using the red flour beetle, Tribolium castaneum, and its natural microsporidian parasite, Paranosema whitei. Beetles and their environment, inclusive of spores released into it, were transferred from generation to generation. We additionally took a modelling approach to further assess the importance of transmissive parasite stages on virulence evolution. In all parasite treatments of the experiment, coevolution resulted in extinction of the host population, with a pronounced increase in virulence being seen. Our modelling approach highlighted the presence of environmental transmissive parasite stages as being critical to the trajectory of virulence evolution in this system. The extinction of host populations was unexpected, particularly as parasite virulence is often seen to decrease in host-parasite coevolution. This, in combination with the increase in virulence and results obtained from the model, suggests that the inclusion of transmissive parasite stages is important to improving our understanding of virulence evolution.

  7. Phylogenetic relationship and virulence inference of Streptococcus Anginosus Group: curated annotation and whole-genome comparative analysis support distinct species designation

    PubMed Central

    2013-01-01

    VNTR numbers that occurred over the course of one year. Conclusions: The comparative genomic analysis of the SAG clarifies the phylogenetics of these bacteria and supports the distinct species classification. Numerous potential virulence determinants were identified and provide a foundation for further studies into SAG pathogenesis. Furthermore, the data may be used to enable the development of rapid diagnostic assays and therapeutics for these pathogens. PMID:24341328

  8. Rapid differentiation of Ralstonia solanacearum avirulent and virulent strains by cell fractioning of an isolate using high performance liquid chromatography.

    PubMed

    Zheng, Xuefang; Zhu, Yujing; Liu, Bo; Yu, Qian; Lin, Naiquan

    2016-01-01

    Ralstonia solanacearum is one of the most destructive plant bacterial pathogens worldwide. The population dynamics and genetic stability are important issues, especially when an avirulent strain is used for biocontrol. In this study, we developed a rapid method to differentiate the virulent and avirulent strains of R. solanacearum and to predict the biocontrol efficiency of an avirulent strain using high performance liquid chromatography (HPLC). Three chromatographic peaks P1, P2 and P3 were observed on the HPLC spectra among 68 avirulent and 28 virulent R. solanacearum strains. Based on the HPLC peaks, 96 strains total were assigned to three categories. For avirulent strains, the intense peak is P1, while for virulent strains, P3 is the majority. Based on the HPLC spectra of R. solanacearum strains, a chromatography titer index (CTI) was established as CTIi = Si/(S1+S2+S3) × 100% (i represents an individual HPLC peak; S1, S2 and S3 represent peak areas of P1, P2 and P3, respectively). The avirulent strains had high values of CTI1 ranging from 63.6 to 100.0%, while the virulent strains displayed high values of CTI3 ranging from 90.2 to 100.0%. Biological inoculation studies of 68 avirulent strains revealed that the biocontrol efficacy was the best when CTI1 = 100%. The purity and genetic stability of R. solanacearum strains were confirmed in the P1 fraction of avirulent strain FJAT-1957 and P3 fraction of virulent strain FJAT-1925 after 30 generations of consecutive subculture. These results confirmed that fractioning by HPLC and their deduced CTI can be used for rapid and efficient evaluation and prediction of an isolate of R. solanacearum. To the best of our knowledge, this is the first report that HPLC fractioning can be used for rapid differentiation of virulent and avirulent strains of R. solanacearum. Copyright © 2015 Elsevier Ltd. All rights reserved.
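
    The chromatography titer index defined in this abstract, CTIi = Si/(S1+S2+S3) × 100%, can be worked through in a few lines. The following Python sketch is only an illustration under stated assumptions: the function names and the example peak areas are hypothetical, and no strain classification rule is implied beyond reporting which peak dominates.

```python
def chromatography_titer_index(s1: float, s2: float, s3: float) -> tuple:
    """Return (CTI1, CTI2, CTI3) in percent from the peak areas of P1, P2 and P3."""
    total = s1 + s2 + s3
    if total <= 0:
        raise ValueError("total peak area must be positive")
    # CTIi = Si / (S1 + S2 + S3) * 100%
    return tuple(100.0 * s / total for s in (s1, s2, s3))


def dominant_peak(s1: float, s2: float, s3: float) -> str:
    """Name the peak with the largest CTI (hypothetical helper for illustration)."""
    cti = chromatography_titer_index(s1, s2, s3)
    return ("P1", "P2", "P3")[cti.index(max(cti))]


if __name__ == "__main__":
    # Hypothetical peak areas, chosen only to show the arithmetic.
    print(chromatography_titer_index(80.0, 15.0, 5.0))  # (80.0, 15.0, 5.0)
    print(dominant_peak(80.0, 15.0, 5.0))  # P1
```

    By construction the three indices sum to 100%, so a single dominant peak corresponds to a CTI close to 100 for that peak.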

  9. Highly potent host external immunity acts as a strong selective force enhancing rapid parasite virulence evolution.

    PubMed

    Rafaluk, Charlotte; Yang, Wentao; Mitschke, Andreas; Rosenstiel, Philip; Schulenburg, Hinrich; Joop, Gerrit

    2017-05-01

    Virulence is often under selection during host-parasite coevolution. In order to increase fitness, parasites are predicted to circumvent and overcome host immunity. A particular challenge for pathogens are external immune systems, chemical defence systems comprised of potent antimicrobial compounds released by prospective hosts into the environment. We carried out an evolution experiment, allowing for coevolution to occur, with the entomopathogenic fungus, Beauveria bassiana, and the red flour beetle, Tribolium castaneum, which has a well-documented external immune system with strong inhibitory effects against B. bassiana. After just seven transfers of experimental evolution we saw a significant increase in parasite induced host mortality, a proxy for virulence, in all B. bassiana lines. This apparent virulence increase was mainly the result of the B. bassiana lines evolving resistance to the beetles' external immune defences, not due to increased production of toxins or other harmful substances. Transcriptomic analyses of evolved B. bassiana implicated the up-regulation of oxidative stress resistance genes in the observed resistance to external immunity. It was concluded that external immunity acts as a powerful selective force for virulence evolution, with an increase in virulence being achieved apparently entirely by overcoming these defences, most likely due to elevated oxidative stress resistance. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.

  10. Virulence Determination

    USDA-ARS's Scientific Manuscript database

    This chapter reviews the in vitro and in vivo assays that are available for determination of pathogenic potential of Listeria monocytogenes bacteria, highlighting the value of using multiplex PCR for rapid and accurate assessment of listerial virulence....

  11. MAKER-P: a tool kit for the rapid creation, management, and quality control of plant genome annotations.

    PubMed

    Campbell, Michael S; Law, MeiYee; Holt, Carson; Stein, Joshua C; Moghe, Gaurav D; Hufnagel, David E; Lei, Jikai; Achawanantakun, Rujira; Jiao, Dian; Lawrence, Carolyn J; Ware, Doreen; Shiu, Shin-Han; Childs, Kevin L; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2014-02-01

    We have optimized and extended the widely used annotation engine MAKER in order to better support plant genome annotation efforts. New features include better parallelization for large repeat-rich plant genomes, noncoding RNA annotation capabilities, and support for pseudogene identification. We have benchmarked the resulting software tool kit, MAKER-P, using the Arabidopsis (Arabidopsis thaliana) and maize (Zea mays) genomes. Here, we demonstrate the ability of the MAKER-P tool kit to automatically update, extend, and revise the Arabidopsis annotations in light of newly available data and to annotate pseudogenes and noncoding RNAs absent from The Arabidopsis Information Resource 10 build. Our results demonstrate that MAKER-P can be used to manage and improve the annotations of even Arabidopsis, perhaps the best-annotated plant genome. We have also installed and benchmarked MAKER-P on the Texas Advanced Computing Center. We show that this public resource can de novo annotate the entire Arabidopsis and maize genomes in less than 3 h and produce annotations of comparable quality to those of the current The Arabidopsis Information Resource 10 and maize V2 annotation builds.

  12. Non-thermal Plasma Exposure Rapidly Attenuates Bacterial AHL-Dependent Quorum Sensing and Virulence

    PubMed Central

    Flynn, Padrig B.; Busetti, Alessandro; Wielogorska, Ewa; Chevallier, Olivier P.; Elliott, Christopher T.; Laverty, Garry; Gorman, Sean P.; Graham, William G.; Gilmore, Brendan F.

    2016-01-01

    The antimicrobial activity of atmospheric pressure non-thermal plasma has been exhaustively characterised; however, elucidation of the interactions between biomolecules produced and utilised by bacteria and short plasma exposures is required for optimisation and clinical translation of cold plasma technology. This study characterizes the effects of non-thermal plasma exposure on acyl homoserine lactone (AHL)-dependent quorum sensing (QS). Plasma exposure of AHLs reduced the ability of such molecules to elicit a QS response in bacterial reporter strains in a dose-dependent manner. Short exposures (30–60 s) produce a series of secondary compounds capable of eliciting a QS response, followed by the complete loss of AHL-dependent signalling following longer exposures. UPLC-MS analysis confirmed the time-dependent degradation of AHL molecules and their conversion into a series of by-products. FT-IR analysis of plasma-exposed AHLs highlighted the appearance of an OH group. In vivo assessment of the exposure of AHLs to plasma was examined using a standard in vivo model. Lettuce leaves injected with the rhlI/lasI mutant PAO-MW1 alongside plasma treated N-butyryl-homoserine lactone and N-(3-oxo-dodecanoyl)-homoserine lactone exhibited marked attenuation of virulence. This study highlights the capacity of atmospheric pressure non-thermal plasma to modify and degrade AHL autoinducers thereby attenuating QS-dependent virulence in P. aeruginosa. PMID:27242335

  13. Non-thermal Plasma Exposure Rapidly Attenuates Bacterial AHL-Dependent Quorum Sensing and Virulence.

    PubMed

    Flynn, Padrig B; Busetti, Alessandro; Wielogorska, Ewa; Chevallier, Olivier P; Elliott, Christopher T; Laverty, Garry; Gorman, Sean P; Graham, William G; Gilmore, Brendan F

    2016-05-31

    The antimicrobial activity of atmospheric pressure non-thermal plasma has been exhaustively characterised; however, elucidation of the interactions between biomolecules produced and utilised by bacteria and short plasma exposures is required for optimisation and clinical translation of cold plasma technology. This study characterizes the effects of non-thermal plasma exposure on acyl homoserine lactone (AHL)-dependent quorum sensing (QS). Plasma exposure of AHLs reduced the ability of such molecules to elicit a QS response in bacterial reporter strains in a dose-dependent manner. Short exposures (30-60 s) produce a series of secondary compounds capable of eliciting a QS response, followed by the complete loss of AHL-dependent signalling following longer exposures. UPLC-MS analysis confirmed the time-dependent degradation of AHL molecules and their conversion into a series of by-products. FT-IR analysis of plasma-exposed AHLs highlighted the appearance of an OH group. In vivo assessment of the exposure of AHLs to plasma was examined using a standard in vivo model. Lettuce leaves injected with the rhlI/lasI mutant PAO-MW1 alongside plasma treated N-butyryl-homoserine lactone and N-(3-oxo-dodecanoyl)-homoserine lactone exhibited marked attenuation of virulence. This study highlights the capacity of atmospheric pressure non-thermal plasma to modify and degrade AHL autoinducers thereby attenuating QS-dependent virulence in P. aeruginosa.

  14. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    SciTech Connect

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana; Purvine, Samuel O.; Sanford, James; Monroe, Matthew E.; Brewer, Heather M.; Payne, Samuel H.; Ansong, Charles; Frank, Bryan C.; Smith, Richard D.; Peterson, Scott; Motin, Vladimir L.; Adkins, Joshua N.

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus

  15. Rapid acquisition of polymorphic virulence markers during adaptation of highly pathogenic avian influenza H5N8 virus in the mouse

    PubMed Central

    Choi, Won-Suk; Baek, Yun Hee; Kwon, Jin Jung; Jeong, Ju Hwan; Park, Su-Jin; Kim, Young-il; Yoon, Sun-Woo; Hwang, Jungwon; Kim, Myung Hee; Kim, Chul-Joong; Webby, Richard J.; Choi, Young Ki; Song, Min-Suk

    2017-01-01

    Emergence of a highly pathogenic avian influenza (HPAI) H5N8 virus in Asia and its spread to Europe and North America has caused great concern for human health. Although the H5N8 virus has been only moderately pathogenic to mammalian hosts, virulence can still increase. We evaluated the pathogenic potential of several H5N8 strains via the mouse-adaptation method. Two H5N8 viruses were sequentially passaged in BALB/c mice and plaque-purified from lung samples. The viruses rapidly obtained high virulence (MLD50, up to 0.5 log10 PFU/mL) within 5 passages. Sequence analysis revealed the acquisition of several virulence markers, including the novel marker P708S in PB1 gene. Combinations of markers synergistically enhanced viral replication and polymerase activity in human cell lines and virulence and multiorgan dissemination in mice. These results suggest that H5N8 viruses can rapidly acquire virulence markers in mammalian hosts; thus, rapid spread as well as repeated viral introduction into the hosts may significantly increase the risk of human infection and elevate pandemic potential. PMID:28094780

  16. Rapid identification of genes controlling virulence and immunity in malaria parasites

    PubMed Central

    Xangsayarath, Phonepadith; Tang, Jianxia; Yahata, Kazuhide; Zoungrana, Augustin; Mitaka, Hayato; Acharjee, Arita; Datta, Partha P.; Hunt, Paul; Carter, Richard; Kaneko, Osamu; Mustonen, Ville; Pain, Arnab

    2017-01-01

    Identifying the genetic determinants of phenotypes that impact disease severity is of fundamental importance for the design of new interventions against malaria. Here we present a rapid genome-wide approach capable of identifying multiple genetic drivers of medically relevant phenotypes within malaria parasites via a single experiment at single gene or allele resolution. In a proof of principle study, we found that a previously undescribed single nucleotide polymorphism in the binding domain of the erythrocyte binding like protein (EBL) conferred a dramatic change in red blood cell invasion in mutant rodent malaria parasites Plasmodium yoelii. In the same experiment, we implicated merozoite surface protein 1 (MSP1) and other polymorphic proteins, as the major targets of strain-specific immunity. Using allelic replacement, we provide functional validation of the substitution in the EBL gene controlling the growth rate in the blood stages of the parasites. PMID:28704525

  17. A rapid PCR assay to characterize the intact pks15/1 gene, a virulence marker in Mycobacterium tuberculosis.

    PubMed

    Zenteno-Cuevas, Roberto; Hernandez-Morales, Rodrigo Javier; Pérez-Navarro, Lucia Monserrat; Muñiz-Salazar, Raquel; Santiago-García, Juan

    2016-02-01

    Intact pks15/1 is involved in the biosynthesis of phenolic glycolipids and proposed as a marker for virulence and phylogeny in tuberculosis. Identification of intact condition is achieved mainly by DNA sequencing. For this reason the aim of this study was to develop a reproducible endpoint PCR-assay to characterize it.

  18. MAKER-P: A Tool Kit for the Rapid Creation, Management, and Quality Control of Plant Genome Annotations

    PubMed Central

    Campbell, Michael S.; Law, MeiYee; Holt, Carson; Stein, Joshua C.; Moghe, Gaurav D.; Hufnagel, David E.; Lei, Jikai; Achawanantakun, Rujira; Jiao, Dian; Lawrence, Carolyn J.; Ware, Doreen; Shiu, Shin-Han; Childs, Kevin L.; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2014-01-01

    We have optimized and extended the widely used annotation engine MAKER in order to better support plant genome annotation efforts. New features include better parallelization for large repeat-rich plant genomes, noncoding RNA annotation capabilities, and support for pseudogene identification. We have benchmarked the resulting software tool kit, MAKER-P, using the Arabidopsis (Arabidopsis thaliana) and maize (Zea mays) genomes. Here, we demonstrate the ability of the MAKER-P tool kit to automatically update, extend, and revise the Arabidopsis annotations in light of newly available data and to annotate pseudogenes and noncoding RNAs absent from The Arabidopsis Information Resource 10 build. Our results demonstrate that MAKER-P can be used to manage and improve the annotations of even Arabidopsis, perhaps the best-annotated plant genome. We have also installed and benchmarked MAKER-P on the Texas Advanced Computing Center. We show that this public resource can de novo annotate the entire Arabidopsis and maize genomes in less than 3 h and produce annotations of comparable quality to those of the current The Arabidopsis Information Resource 10 and maize V2 annotation builds. PMID:24306534

  19. Rapidly Evolving Genes Are Key Players in Host Specialization and Virulence of the Fungal Wheat Pathogen Zymoseptoria tritici (Mycosphaerella graminicola).

    PubMed

    Poppe, Stephan; Dorsheimer, Lena; Happel, Petra; Stukenbrock, Eva Holtgrewe

    2015-07-01

    The speciation of pathogens can be driven by divergent host specialization. Specialization to a new host is possible via the acquisition of advantageous mutations fixed by positive selection. Comparative genome analyses of closely related species allow for the identification of such key substitutions via inference of genome-wide signatures of positive selection. We previously used a comparative genomics framework to identify genes that have evolved under positive selection during speciation of the prominent wheat pathogen Zymoseptoria tritici (synonym Mycosphaerella graminicola). In this study, we conducted functional analyses of four genes exhibiting strong signatures of positive selection in Z. tritici. We deleted the four genes in Z. tritici and confirmed a virulence-related role of three of the four genes ΔZt80707, ΔZt89160 and ΔZt103264. The two mutants ΔZt80707 and ΔZt103264 show a significant reduction in virulence during infection of wheat; the ΔZt89160 mutant causes a hypervirulent phenotype in wheat. Mutant phenotypes of ΔZt80707, ΔZt89160 and ΔZt103264 can be restored by insertion of the wild-type genes. However, the insertion of the Zt80707 and Zt89160 orthologs from Z. pseudotritici and Z. ardabiliae does not restore wild-type levels of virulence, suggesting that positively selected substitutions in Z. tritici may relate to divergent host specialization. Interestingly, the gene Zt80707 also encodes a secretion signal that targets the protein for cell secretion. This secretion signal is however only transcribed in Z. tritici, suggesting that Z. tritici-specific substitutions relate to a new function of the protein in the extracellular space of the wheat-Z. tritici interaction. Together, the results presented here highlight that Zt80707, Zt103264 and Zt89160 represent key genes involved in virulence and host-specific disease development of Z. tritici. Our findings illustrate that evolutionary predictions provide a powerful tool for the

  20. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences.

    PubMed

    Medema, Marnix H; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A; Weber, Tilmann; Takano, Eriko; Breitling, Rainer

    2011-07-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide variety of microbes. However, rapidly and reliably pinpointing all the potential gene clusters for secondary metabolites in dozens of newly sequenced genomes has been extremely challenging, due to their biochemical heterogeneity, the presence of unknown enzymes and the dispersed nature of the necessary specialized bioinformatics tools and resources. Here, we present antiSMASH (antibiotics & Secondary Metabolite Analysis Shell), the first comprehensive pipeline capable of identifying biosynthetic loci covering the whole range of known secondary metabolite compound classes (polyketides, non-ribosomal peptides, terpenes, aminoglycosides, aminocoumarins, indolocarbazoles, lantibiotics, bacteriocins, nucleosides, beta-lactams, butyrolactones, siderophores, melanins and others). It aligns the identified regions at the gene cluster level to their nearest relatives from a database containing all other known gene clusters, and integrates or cross-links all previously available secondary-metabolite specific gene analysis methods in one interactive view. antiSMASH is available at http://antismash.secondarymetabolites.org.

  1. Genome Annotation and Curation Using MAKER and MAKER-P

    PubMed Central

    Campbell, Michael S.; Holt, Carson; Moore, Barry; Yandell, Mark

    2014-01-01

    This unit describes how to use the genome annotation and curation tools MAKER and MAKER-P to annotate protein coding and non-coding RNA genes in newly assembled genomes, update/combine legacy annotations in light of new evidence, add quality metrics to annotations from other pipelines, and map existing annotations to a new assembly. MAKER and MAKER-P can rapidly annotate genomes of any size, and scale to match available computational resources. PMID:25501943

  2. Annotated Videography.

    ERIC Educational Resources Information Center

    United States Holocaust Memorial Museum, Washington, DC.

    This annotated list of 43 videotapes recommended for classroom use addresses various themes for teaching about the Holocaust, including: (1) overviews of the Holocaust; (2) life before the Holocaust; (3) propaganda; (4) racism, anti-Semitism; (5) "enemies of the state"; (6) ghettos; (7) camps; (8) genocide; (9) rescue; (10) resistance;…

  3. Beyond genomic variation--comparison and functional annotation of three Brassica rapa genomes: a turnip, a rapid cycling and a Chinese cabbage.

    PubMed

    Lin, Ke; Zhang, Ningwen; Severing, Edouard I; Nijveen, Harm; Cheng, Feng; Visser, Richard G F; Wang, Xiaowu; de Ridder, Dick; Bonnema, Guusje

    2014-03-31

    Brassica rapa is an economically important crop species. During its long breeding history, a large number of morphotypes have been generated, including leafy vegetables such as Chinese cabbage and pakchoi, turnip tuber crops and oil crops. To investigate the genetic variation underlying this morphological variation, we re-sequenced, assembled and annotated the genomes of two B. rapa subspecies, turnip crops (turnip) and a rapid cycling. We then analysed the two resulting genomes together with the Chinese cabbage Chiifu reference genome to obtain an impression of the B. rapa pan-genome. The number of genes with protein-coding changes between the three genotypes was lower than that among different accessions of Arabidopsis thaliana, which can be explained by the smaller effective population size of B. rapa due to its domestication. Based on orthology to a number of non-brassica species, we estimated the date of divergence among the three B. rapa morphotypes at approximately 250,000 YA, far predating Brassica domestication (5,000-10,000 YA). By analysing genes unique to turnip we found evidence for copy number differences in peroxidases, pointing to a role for the phenylpropanoid biosynthesis pathway in the generation of morphological variation. The estimated date of divergence among the three B. rapa morphotypes implies that prior to domestication there was already considerable divergence among B. rapa genotypes. Our study thus provides two new B. rapa reference genomes, delivers a set of computer tools to analyse the resulting pan-genome and uses these to shed light on genetic drivers behind the rich morphological variation found in B. rapa.

  4. Construction of Customized Sub-Databases from NCBI-nr Database for Rapid Annotation of Huge Metagenomic Datasets Using a Combined BLAST and MEGAN Approach

    PubMed Central

    Yu, Ke; Zhang, Tong

    2013-01-01

    We developed a fast method to construct local sub-databases from the NCBI-nr database for the quick similarity search and annotation of huge metagenomic datasets based on BLAST-MEGAN approach. A three-step sub-database annotation pipeline (SAP) was further proposed to conduct the annotation in a much more time-efficient way which required far less computational capacity than the direct NCBI-nr database BLAST-MEGAN approach. The 1st BLAST of SAP was conducted using the original metagenomic dataset against the constructed sub-database for a quick screening of candidate target sequences. Then, the candidate target sequences identified in the 1st BLAST were subjected to the 2nd BLAST against the whole NCBI-nr database. The BLAST results were finally annotated using MEGAN to filter out those mistakenly selected sequences in the 1st BLAST to guarantee the accuracy of the results. Based on the tests conducted in this study, SAP achieved a speedup of ∼150–385 times at the BLAST e-value of 1e–5, compared to the direct BLAST against NCBI-nr database. The annotation results of SAP are exactly in agreement with those of the direct NCBI-nr database BLAST-MEGAN approach, which is very time-consuming and computationally intensive. Selecting rigorous thresholds (e.g. e-value of 1e–10) would further accelerate SAP process. The SAP pipeline may also be coupled with novel similarity search tools (e.g. RAPsearch) other than BLAST to achieve even faster annotation of huge metagenomic datasets. Above all, this sub-database construction method and SAP pipeline provides a new time-efficient and convenient annotation similarity search strategy for laboratories without access to high performance computing facilities. SAP also offers a solution to high performance computing facilities for the processing of more similarity search tasks. PMID:23573212
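    The screen-then-confirm logic described in this abstract can be illustrated with a small toy sketch. It does not run BLAST or MEGAN: a k-mer overlap score (an assumption made here purely for illustration) stands in for the similarity search, and all sequence names, the `threshold` value and the function names are hypothetical. It only shows why a read that passes the cheap first search can still be rejected after the search against the full database.

```python
# Toy two-stage screen. A k-mer Jaccard score is a hypothetical stand-in
# for a BLAST similarity score; no real BLAST or MEGAN call is made.

def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def similarity(a, b, k=3):
    """Jaccard overlap of the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)

def best_hit(read, database, k=3):
    """Return (name, score) of the highest-scoring database entry."""
    hits = ((name, similarity(read, seq, k)) for name, seq in database.items())
    return max(hits, key=lambda hit: hit[1])

def two_stage_screen(reads, sub_db, full_db, threshold=0.3):
    # Stage 1: cheap search against the small sub-database only.
    candidates = {rid: seq for rid, seq in reads.items()
                  if best_hit(seq, sub_db)[1] >= threshold}
    # Stage 2: only the candidates are searched against the full database.
    confirmed = {}
    for rid, seq in candidates.items():
        name, score = best_hit(seq, full_db)
        # Final filter: drop reads whose best full-database hit lies outside
        # the sub-database, i.e. reads selected by mistake in stage 1.
        if name in sub_db and score >= threshold:
            confirmed[rid] = name
    return candidates, confirmed

# Hypothetical toy data.
sub_db = {"targetA": "ACGTACGTGGCC"}
full_db = {"targetA": "ACGTACGTGGCC",
           "otherB": "ACGTACGTTTAA",
           "otherC": "GGGGCCCCAAAA"}
reads = {"r1": "ACGTACGTGGCC",   # true target read
         "r2": "ACGTACGTTTAA",   # resembles targetA but matches otherB better
         "r3": "TTTTTTTTTTTT"}   # unrelated read

candidates, confirmed = two_stage_screen(reads, sub_db, full_db)
print(sorted(candidates))  # reads that passed stage 1
print(confirmed)           # reads still assigned to a target after stage 2
```

    In this toy data, r2 passes the first search because it shares part of its sequence with targetA, but it is removed in the final step because its best match in the full database is otherB. Only the stage 1 candidates are compared against the full database, which is where the time saving would come from.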

  5. Construction of customized sub-databases from NCBI-nr database for rapid annotation of huge metagenomic datasets using a combined BLAST and MEGAN approach.

    PubMed

    Yu, Ke; Zhang, Tong

    2013-01-01

    We developed a fast method to construct local sub-databases from the NCBI-nr database for the quick similarity search and annotation of huge metagenomic datasets based on BLAST-MEGAN approach. A three-step sub-database annotation pipeline (SAP) was further proposed to conduct the annotation in a much more time-efficient way which required far less computational capacity than the direct NCBI-nr database BLAST-MEGAN approach. The 1st BLAST of SAP was conducted using the original metagenomic dataset against the constructed sub-database for a quick screening of candidate target sequences. Then, the candidate target sequences identified in the 1st BLAST were subjected to the 2nd BLAST against the whole NCBI-nr database. The BLAST results were finally annotated using MEGAN to filter out those mistakenly selected sequences in the 1st BLAST to guarantee the accuracy of the results. Based on the tests conducted in this study, SAP achieved a speedup of ~150-385 times at the BLAST e-value of 1e-5, compared to the direct BLAST against NCBI-nr database. The annotation results of SAP are exactly in agreement with those of the direct NCBI-nr database BLAST-MEGAN approach, which is very time-consuming and computationally intensive. Selecting rigorous thresholds (e.g. e-value of 1e-10) would further accelerate SAP process. The SAP pipeline may also be coupled with novel similarity search tools (e.g. RAPsearch) other than BLAST to achieve even faster annotation of huge metagenomic datasets. Above all, this sub-database construction method and SAP pipeline provides a new time-efficient and convenient annotation similarity search strategy for laboratories without access to high performance computing facilities. SAP also offers a solution to high performance computing facilities for the processing of more similarity search tasks.

  6. Rapid Generation of Replication-Deficient Monovalent and Multivalent Vaccines for Bluetongue Virus: Protection against Virulent Virus Challenge in Cattle and Sheep

    PubMed Central

    Celma, Cristina C. P.; Boyce, Mark; van Rijn, Piet A.; Eschbaumer, Michael; Wernike, Kerstin; Hoffmann, Bernd; Beer, Martin; Haegeman, Andy; De Clercq, Kris

    2013-01-01

    Since 1998, 9 of the 26 serotypes of bluetongue virus (BTV) have spread throughout Europe, and serotype 8 has suddenly emerged in northern Europe, causing considerable economic losses, direct (mortality and morbidity) but also indirect, due to restriction in animal movements. Therefore, many new types of vaccines, particularly subunit vaccines, with improved safety and efficacy for a broad range of BTV serotypes are currently being developed by different laboratories. Here we exploited a reverse genetics-based replication-deficient BTV serotype 1 (BTV-1) (disabled infectious single cycle [DISC]) strain to generate a series of DISC vaccine strains. Cattle and sheep were vaccinated with these viruses either singly or in cocktail form as a multivalent vaccine candidate. All vaccinated animals were seroconverted and developed neutralizing antibody responses to their respective serotypes. After challenge with the virulent strains at 21 days postvaccination, vaccinated animals showed neither any clinical reaction nor viremia. Further, there was no interference with protection with a multivalent preparation of six distinct DISC viruses. These data indicate that a very-rapid-response vaccine could be developed based on which serotypes are circulating in the population at the time of an outbreak. PMID:23824810

  7. Multi-Locus Variable Number of Tandem Repeat Analysis for Rapid and Accurate Typing of Virulent Multidrug Resistant Escherichia coli Clones

    PubMed Central

    Naseer, Umaer; Olsson-Liljequist, Barbro E.; Woodford, Neil; Dhanji, Hiran; Cantón, Rafael; Sundsfjord, Arnfinn; Lindstedt, Bjørn-Arne

    2012-01-01

    One hundred E. coli isolates from Norway (n = 37), Sweden (n = 24), UK (n = 20) and Spain (n = 19), producing CTX-M-type (n = 84) or SHV-12 (n = 4) extended spectrum β-lactamases, or the plasmid mediated AmpC, CMY-2 (n = 12), were typed using multi-locus sequence typing (MLST) and multi-locus variable number of tandem repeat analysis (MLVA). Isolates clustered into 33 Sequence Types (STs) and 14 Sequence Type Complexes (STCs), and 58 MLVA-Types (MTs) and 25 different MLVA-Type Complexes (MTCs). A strong agreement between the MLST profile and MLVA typing results was observed, in which all ST131-isolates (n = 39) and most of the STC-648 (n = 10), STC-38 (n = 9), STC-10 (n = 9), STC-405 (n = 8) and STC-23 (n = 6) isolates were clustered distinctly into MTC-29, -36, -20, -14, -10 and -39, respectively. MLVA is a rapid and accurate tool for genotyping isolates of globally disseminated virulent multidrug resistant E. coli lineages, including ST131. PMID:22859970

  8. Ranking Biomedical Annotations with Annotator's Semantic Relevancy

    PubMed Central

    2014-01-01

    Biomedical annotation is a common and effective artifact for researchers to discuss, show opinion, and share discoveries. It is becoming increasingly popular in many online research communities, and implies much useful information. Ranking biomedical annotations is a critical problem for data users to efficiently get information. As the annotator's knowledge about the annotated entity normally determines the quality of the annotations, we evaluate the knowledge, that is, the semantic relationship between them, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second way is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes and merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users' votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large. PMID:24899918

  9. Ranking biomedical annotations with annotator's semantic relevancy.

    PubMed

    Wu, Aihua

    2014-01-01

    Biomedical annotation is a common and effective artifact for researchers to discuss, show opinion, and share discoveries. It is becoming increasingly popular in many online research communities, and implies much useful information. Ranking biomedical annotations is a critical problem for data users to efficiently get information. As the annotator's knowledge about the annotated entity normally determines the quality of the annotations, we evaluate the knowledge, that is, the semantic relationship between them, in two ways. The first is extracting relational information from credible websites by mining association rules between an annotator and a biomedical entity. The second way is frequent pattern mining from historical annotations, which reveals common features of biomedical entities that an annotator can annotate with high quality. We propose a weighted and concept-extended RDF model to represent an annotator, a biomedical entity, and their background attributes and merge information from the two ways as the context of an annotator. Based on that, we present a method to rank the annotations by evaluating their correctness according to users' votes and the semantic relevancy between the annotator and the annotated entity. The experimental results show that the approach is applicable and efficient even when the data set is large.

  10. Bacillus anthracis-like bacteria and other B. cereus group members in a microbial community within the International Space Station: a challenge for rapid and easy molecular detection of virulent B. anthracis.

    PubMed

    van Tongeren, Sandra P; Roest, Hendrik I J; Degener, John E; Harmsen, Hermie J M

    2014-01-01

    For some microbial species, such as Bacillus anthracis, the etiologic agent of the disease anthrax, correct detection and identification by molecular methods can be problematic. The detection of virulent B. anthracis is challenging due to multiple virulence markers that need to be present in order for B. anthracis to be virulent and its close relationship to Bacillus cereus and other members of the B. cereus group. This is especially the case in environments where build-up of Bacillus spores can occur and several representatives of the B. cereus group may be present, which increases the chance for false-positives. In this study we show the presence of B. anthracis-like bacteria and other members of the B. cereus group in a microbial community within the human environment of the International Space Station and their preliminary identification by using conventional culturing as well as molecular techniques including 16S rDNA sequencing, PCR and real-time PCR. Our study shows that when monitoring the microbial hygiene in a given human environment, health risk assessment is troublesome in the case of virulent B. anthracis, especially if this should be done with rapid, easy to apply and on-site molecular methods.

  11. Bacillus anthracis-Like Bacteria and Other B. cereus Group Members in a Microbial Community Within the International Space Station: A Challenge for Rapid and Easy Molecular Detection of Virulent B. anthracis

    PubMed Central

    van Tongeren, Sandra P.; Roest, Hendrik I. J.; Degener, John E.; Harmsen, Hermie J. M.

    2014-01-01

    For some microbial species, such as Bacillus anthracis, the etiologic agent of the disease anthrax, correct detection and identification by molecular methods can be problematic. The detection of virulent B. anthracis is challenging due to multiple virulence markers that need to be present in order for B. anthracis to be virulent and its close relationship to Bacillus cereus and other members of the B. cereus group. This is especially the case in environments where build-up of Bacillus spores can occur and several representatives of the B. cereus group may be present, which increases the chance for false-positives. In this study we show the presence of B. anthracis-like bacteria and other members of the B. cereus group in a microbial community within the human environment of the International Space Station and their preliminary identification by using conventional culturing as well as molecular techniques including 16S rDNA sequencing, PCR and real-time PCR. Our study shows that when monitoring the microbial hygiene in a given human environment, health risk assessment is troublesome in the case of virulent B. anthracis, especially if this should be done with rapid, easy to apply and on-site molecular methods. PMID:24945323

  12. Computational algorithms to predict Gene Ontology annotations

    PubMed Central

    2015-01-01

    Background: Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from being definitive and it rapidly evolves, and some erroneous annotations may be present. Since the curation process of novel annotations is a costly procedure, both in economical and time terms, computational tools that can reliably predict likely annotations, and thus quicken the discovery of new gene annotations, are very useful. Methods: We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis) and propose a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we propose the improvement of these algorithms by weighting the annotations in the input set. Results: We tested our methods and their weighted variants on the Gene Ontology annotation sets of three model organism genes (Bos taurus, Danio rerio and Drosophila melanogaster). The methods showed their ability in predicting novel gene annotations and the weighting procedures demonstrated to lead to a valuable improvement, although the obtained results vary according to the dimension of the input annotation set and the considered algorithm. Conclusions: Out of the three considered methods, the Semantic IMproved Latent Semantic Analysis is the one that provides better results. In particular, when coupled with a proper

  13. MvirDB--a microbial database of protein toxins, virulence factors and antibiotic resistance genes for bio-defence applications.

    PubMed

    Zhou, C E; Smith, J; Lam, M; Zemla, A; Dyer, M D; Slezak, T

    2007-01-01

    Knowledge of toxins, virulence factors and antibiotic resistance genes is essential for bio-defense applications aimed at identifying 'functional' signatures for characterizing emerging or engineered pathogens. Whereas genetic signatures identify a pathogen, functional signatures identify what a pathogen is capable of. To facilitate rapid identification of sequences and characterization of genes for signature discovery, we have collected all publicly available (as of this writing), organized sequences representing known toxins, virulence factors, and antibiotic resistance genes in one convenient database, which we believe will be of use to the bio-defense research community. MvirDB integrates DNA and protein sequence information from Tox-Prot, SCORPION, the PRINTS virulence factors, VFDB, TVFac, Islander, ARGO and a subset of VIDA. Entries in MvirDB are hyperlinked back to their original sources. A blast tool allows the user to blast against all DNA or protein sequences in MvirDB, and a browser tool allows the user to search the database to retrieve virulence factor descriptions, sequences, and classifications, and to download sequences of interest. MvirDB has an automated weekly update mechanism. Each protein sequence in MvirDB is annotated using our fully automated protein annotation system and is linked to that system's browser tool. MvirDB can be accessed at http://mvirdb.llnl.gov/.

  14. Mining GO annotations for improving annotation consistency.

    PubMed

    Faria, Daniel; Schlicker, Andreas; Pesquita, Catia; Bastos, Hugo; Ferreira, António E N; Albrecht, Mario; Falcão, André O

    2012-01-01

    Despite the structure and objectivity provided by the Gene Ontology (GO), the annotation of proteins is a complex task that is subject to errors and inconsistencies. Electronically inferred annotations in particular are widely considered unreliable. However, given that manual curation of all GO annotations is unfeasible, it is imperative to improve the quality of electronically inferred annotations. In this work, we analyze the full GO molecular function annotation of UniProtKB proteins, and discuss some of the issues that affect their quality, focusing particularly on the lack of annotation consistency. Based on our analysis, we estimate that 64% of the UniProtKB proteins are incompletely annotated, and that inconsistent annotations affect 83% of the protein functions and at least 23% of the proteins. Additionally, we present and evaluate a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms. The goal of this algorithm is to assist GO curators in updating GO and correcting and preventing inconsistent annotations. Our algorithm predicted 501 relationships with an estimated precision of 94%, whereas the basic association rule learning methodology predicted 12,352 relationships with a precision below 9%.
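    To make the idea of association rule learning over annotation terms concrete, the following is a minimal sketch of the basic support/confidence calculation only; it is not the authors' algorithm. The protein identifiers, the toy annotation sets and the threshold values are invented for illustration, and `rule_stats` and `flag_missing` are hypothetical helper names.

```python
from itertools import permutations

def rule_stats(annotations, min_support=2, min_confidence=0.75):
    """Find rules 'term A implies term B' from per-protein term sets.

    support    = number of proteins annotated with both A and B
    confidence = support / number of proteins annotated with A
    """
    term_count = {}
    pair_count = {}
    for terms in annotations.values():
        for t in terms:
            term_count[t] = term_count.get(t, 0) + 1
        for a, b in permutations(sorted(terms), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = []
    for (a, b), n_ab in pair_count.items():
        confidence = n_ab / term_count[a]
        if n_ab >= min_support and confidence >= min_confidence:
            rules.append((a, b, n_ab, confidence))
    return sorted(rules)

def flag_missing(annotations, rules):
    """List proteins that carry a rule's antecedent but lack its consequent."""
    return sorted((protein, a, b)
                  for a, b, _, _ in rules
                  for protein, terms in annotations.items()
                  if a in terms and b not in terms)

# Invented toy annotation sets (not real UniProtKB data).
annotations = {
    "P1": {"kinase activity", "ATP binding"},
    "P2": {"kinase activity", "ATP binding"},
    "P3": {"kinase activity", "ATP binding"},
    "P4": {"kinase activity", "ATP binding"},
    "P5": {"kinase activity"},
    "P6": {"ATP binding"},
    "P7": {"ATP binding"},
}

rules = rule_stats(annotations)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support}, confidence={confidence:.2f}")
print(flag_missing(annotations, rules))
```

    With these toy sets, only the rule from "kinase activity" to "ATP binding" passes the thresholds (four of five kinase-annotated proteins also carry the second term), and P5 is flagged as a candidate for a curator to review. The reverse rule fails because only four of six "ATP binding" proteins are also annotated as kinases.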

  15. Bacterial genome annotation.

    PubMed

    Beckloff, Nicholas; Starkenburg, Shawn; Freitas, Tracey; Chain, Patrick

    2012-01-01

    Annotation of prokaryotic sequences can be separated into structural and functional annotation. Structural annotation is dependent on algorithmic interrogation of experimental evidence to discover the physical characteristics of a gene. This is done in an effort to construct accurate gene models, so understanding function or evolution of genes among organisms is not impeded. Functional annotation is dependent on sequence similarity to other known genes or proteins in an effort to assess the function of the gene. Combining structural and functional annotation across genomes in a comparative manner promotes higher levels of accurate annotation as well as an advanced understanding of genome evolution. As the availability of bacterial sequences increases and annotation methods improve, the value of comparative annotation will increase.

  16. RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes.

    PubMed

    Brettin, Thomas; Davis, James J; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Olsen, Gary J; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D; Shukla, Maulik; Thomason, James A; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  17. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

    PubMed Central

    Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang

    2015-01-01

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception. PMID:25666585

  18. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

    SciTech Connect

    Brettin, Thomas; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Thomason, III, James A.; Stevens, Rick; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  19. Dynamic multimedia annotation tool

    NASA Astrophysics Data System (ADS)

    Pfund, Thomas; Marchand-Maillet, Stephane

    2001-12-01

    Annotating image collections is crucial for different multimedia applications. Not only does this provide an alternative means of access to visual information, but it is also a critical step in evaluating content-based image retrieval systems. Annotation is a tedious task, so there is a real need for tools that lighten the work of annotators. The tool should be flexible and offer customization so as to make the annotator as comfortable as possible. It should also automate as many tasks as possible. In this paper, we present a still image annotation tool that has been developed with the aim of being flexible and adaptive. The principle is to create a set of dynamic web pages that are an interface to a SQL database. The keyword set is fixed, and every image receives from concurrent annotators a set of keywords along with time stamps and annotator IDs. Each annotator can move back and forth within the collection and his or her previous annotations, helped by a number of search services and customization options. An administrative section allows the supervisor to control the parameters of the annotation, including the keyword set, given via an XML structure. The architecture of the tool is made flexible so as to accommodate further options through its development.
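    The data model outlined in this abstract (a fixed keyword set, with each image receiving keywords from several annotators together with a time stamp and annotator ID) could be sketched as below. This is an illustrative guess, not the tool's actual schema: the table names, column names, sample keywords and helper functions are all assumptions, and an in-memory SQLite database stands in for whatever SQL database the authors used.

```python
import sqlite3

# Hypothetical schema: one table for the fixed keyword set, one for annotations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keyword (
    id   INTEGER PRIMARY KEY,
    word TEXT UNIQUE NOT NULL
);
CREATE TABLE annotation (
    image_id     TEXT NOT NULL,
    keyword_id   INTEGER NOT NULL REFERENCES keyword(id),
    annotator_id TEXT NOT NULL,
    created_at   TEXT NOT NULL,
    PRIMARY KEY (image_id, keyword_id, annotator_id)
);
""")

# The supervisor fixes the keyword set in advance (sample values only).
conn.executemany("INSERT INTO keyword (word) VALUES (?)",
                 [("tree",), ("sky",), ("car",)])

def annotate(conn, image_id, word, annotator_id, timestamp):
    """Attach a keyword from the fixed set to an image for one annotator."""
    row = conn.execute("SELECT id FROM keyword WHERE word = ?",
                       (word,)).fetchone()
    if row is None:
        raise ValueError(f"'{word}' is not in the fixed keyword set")
    # Re-annotating the same image/keyword pair just refreshes the time stamp.
    conn.execute("INSERT OR REPLACE INTO annotation VALUES (?, ?, ?, ?)",
                 (image_id, row[0], annotator_id, timestamp))

def keyword_counts(conn, image_id):
    """Return (keyword, number of annotators) pairs for one image."""
    return conn.execute(
        """SELECT k.word, COUNT(*)
           FROM annotation a JOIN keyword k ON k.id = a.keyword_id
           WHERE a.image_id = ?
           GROUP BY k.word
           ORDER BY k.word""",
        (image_id,)).fetchall()

annotate(conn, "img1", "tree", "alice", "2001-12-01T10:00:00")
annotate(conn, "img1", "tree", "bob", "2001-12-01T10:05:00")
annotate(conn, "img1", "sky", "alice", "2001-12-01T10:01:00")
print(keyword_counts(conn, "img1"))
```

    Keeping the annotator ID in the primary key lets concurrent annotators label the same image independently, while rejecting words outside the `keyword` table reflects the fixed keyword set described in the abstract.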

  20. Computing human image annotation.

    PubMed

    Channin, David S; Mongkolwat, Pattanasak; Kleper, Vladimir; Rubin, Daniel L

    2009-01-01

    An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human (or machine) observer. An image markup is the graphical symbols placed over the image to depict an annotation. In the majority of current, clinical and research imaging practice, markup is captured in proprietary formats and annotations are referenced only in free text radiology reports. This makes these annotations difficult to query, retrieve and compute upon, hampering their integration into other data mining and analysis efforts. This paper describes the National Cancer Institute's Cancer Biomedical Informatics Grid's (caBIG) Annotation and Image Markup (AIM) project, focusing on how to use AIM to query for annotations. The AIM project delivers an information model for image annotation and markup. The model uses controlled terminologies for important concepts. All of the classes and attributes of the model have been harmonized with the other models and common data elements in use at the National Cancer Institute. The project also delivers XML schemata necessary to instantiate AIMs in XML as well as a software application for translating AIM XML into DICOM S/R and HL7 CDA. Large collections of AIM annotations can be built and then queried as Grid or Web services. Using the tools of the AIM project, image annotations and their markup can be captured and stored in human and machine readable formats. This enables the inclusion of human image observation and inference as part of larger data mining and analysis activities.

  1. Galileo Reader and Annotator

    NASA Astrophysics Data System (ADS)

    Besomi, O.

    2011-06-01

    In his readings, Galileo made frequent use of annotations. Here, I will offer a general glance at them by discussing the case of the annotations to the Libra astronomica published in 1619 by Orazio Grassi, a Jesuit mathematician of the Collegio Romano. The annotations directly reflect Galileo's reaction to Grassi's book in a heated debate between the two astronomers. Galileo and Grassi had opposite ideas about the nature of the comets, which resulted in different scientific and theological implications. The annotations represent the starting point for Galileo's reply to the Libra, namely Il Saggiatore, which was published four years later and dedicated to the new pope Urban VIII.

  2. Annotated Humanities Programs.

    ERIC Educational Resources Information Center

    Adler, Richard R.; Applebee, Arthur

    The humanities programs offered in 1968 by 227 United States secondary schools are listed alphabetically by state, including almost 100 new programs not annotated in the 1967 listing (see TE 000 224). Each annotation presents a brief description of the approach to study used in the particular humanities course (e.g., American Studies, Culture…

  3. SEED Software Annotations.

    ERIC Educational Resources Information Center

    Bethke, Dee; And Others

    This document provides a composite index of the first five sets of software annotations produced by Project SEED. The software has been indexed by title, subject area, and grade level, and it covers sets of annotations distributed in September 1986, April 1987, September 1987, November 1987, and February 1988. The date column in the index…

  4. Complete genome sequence of a virulent Streptococcus agalactiae strain 138P isolated from diseased Nile tilapia

    USDA-ARS's Scientific Manuscript database

    The complete genome of a virulent Streptococcus agalactiae strain 138P is 1838701 bp in size, containing 1831 genes. The genome has 1593 coding sequences, 152 pseudo genes, 16 rRNAs, 69 tRNAs, and 1 non-coding RNA. The annotation of the genome is added by the NCBI Prokaryotic Genome Annotation Pipel...

  5. yrGATE: a web-based gene-structure annotation tool for the identification and dissemination of eukaryotic genes.

    PubMed

    Wilkerson, Matthew D; Schlueter, Shannon D; Brendel, Volker

    2006-01-01

    Your Gene structure Annotation Tool for Eukaryotes (yrGATE) provides an Annotation Tool and Community Utilities for worldwide web-based community genome and gene annotation. Annotators can evaluate gene structure evidence derived from multiple sources to create gene structure annotations. Administrators regulate the acceptance of annotations into published gene sets. yrGATE is designed to facilitate rapid and accurate annotation of emerging genomes as well as to confirm, refine, or correct currently published annotations. yrGATE is highly portable and supports different standard input and output formats. The yrGATE software and usage cases are available at http://www.plantgdb.org/prj/yrGATE.

  6. yrGATE: a web-based gene-structure annotation tool for the identification and dissemination of eukaryotic genes

    PubMed Central

    Wilkerson, Matthew D; Schlueter, Shannon D; Brendel, Volker

    2006-01-01

    Your Gene structure Annotation Tool for Eukaryotes (yrGATE) provides an Annotation Tool and Community Utilities for worldwide web-based community genome and gene annotation. Annotators can evaluate gene structure evidence derived from multiple sources to create gene structure annotations. Administrators regulate the acceptance of annotations into published gene sets. yrGATE is designed to facilitate rapid and accurate annotation of emerging genomes as well as to confirm, refine, or correct currently published annotations. yrGATE is highly portable and supports different standard input and output formats. The yrGATE software and usage cases are available at http://www.plantgdb.org/prj/yrGATE. PMID:16859520

  7. Dictionary-driven protein annotation

    PubMed Central

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-01-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  8. Dictionary-driven protein annotation.

    PubMed

    Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel

    2002-09-01

    Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were

  9. Novel extracellular chitinases rapidly and specifically induced by general bacterial elicitors and suppressed by virulent bacteria as a marker of early basal resistance in tobacco.

    PubMed

    Ott, Péter G; Varga, Gabriella J; Szatmári, Agnes; Bozsó, Zoltan; Klement, Eva; Medzihradszky, Katalin F; Besenyei, Eszter; Czelleng, Arnold; Klement, Zoltán

    2006-02-01

    Early basal resistance (EBR, formerly known as early induced resistance) is triggered by general bacterial elicitors. EBR has been suggested to inhibit or retard expression of the type III secretion system of pathogenic bacteria and may also prevent nonpathogenic bacteria from colonizing the plant tissue. The quickness of EBR here plays a crucial role, compensating for a low bactericidal efficacy. This inhibitory activity should take place in the cell wall, as bacteria do not enter living plant cells. We found several soluble proteins in the intercellular fluid of tobacco leaf parenchyma that coincided with EBR under different environmental (light and temperature) conditions known to affect EBR. The two most prominent proteins proved to be novel chitinases (EC 3.2.1.14) that were transcriptionally induced before and during EBR development. Their expression in the apoplast was fast and not stress-regulated as opposed to many pathogenesis-related proteins. Nonpathogenic, saprophytic, and avirulent bacteria all induced EBR and the chitinases. Studies using these chitinases as EBR markers revealed that the virulent Pseudomonas syringae pv. tabaci, being sensitive to EBR, must suppress it while suppressing the chitinases. EBR, the chitinases, as well as their suppression are quantitatively related, implying a delicate balance determining the outcome of an infection.

  10. K-Nearest Neighbors Relevance Annotation Model for Distance Education

    ERIC Educational Resources Information Center

    Ke, Xiao; Li, Shaozi; Cao, Donglin

    2011-01-01

    With the rapid development of Internet technologies, distance education has become a popular educational mode. In this paper, the authors propose an online image automatic annotation distance education system, which could effectively help children learn interrelations between image content and corresponding keywords. Image automatic annotation is…

  12. Rapid multiplex PCR and Real-Time TaqMan PCR assays for detection of Salmonella enterica and the highly virulent serovars Choleraesuis and Paratyphi C

    USDA-ARS's Scientific Manuscript database

    Salmonella enterica is a human pathogen with over 2,500 serovars characterized. S. enterica serovars Choleraesuis (Cs) and Paratyphi C (Pc) are two globally distributed serovars. We have developed a rapid molecular typing method to detect Cs and Pc in food samples by using a comparative genomics ap...

  13. O-antigen and virulence profiling of Shiga toxin-producing Escherichia coli by a rapid and cost-effective DNA microarray colorimetric method

    USDA-ARS's Scientific Manuscript database

    Shiga toxin-producing Escherichia coli (STEC) is a leading cause of foodborne illness worldwide. To evaluate better methods to rapidly detect and genotype Shiga toxin-producing Escherichia coli strains, the present study evaluated the use of the ampliPHOX colorimetric detection technology, based on ...

  14. An annotated energy bibliography

    NASA Technical Reports Server (NTRS)

    Blow, S. J.

    1979-01-01

    Comprehensive annotated compilation of books, journals, periodicals, and reports on energy and energy-related topics; contains approximately 10,0000 technical and nontechnical references from bibliographic and other sources dated January 1975 through May 1977.

  16. An Introduction to Genome Annotation.

    PubMed

    Campbell, Michael S; Yandell, Mark

    2015-12-17

    Genome projects have evolved from large international undertakings to tractable endeavors for a single lab. Accurate genome annotation is critical for successful genomic, genetic, and molecular biology experiments. These annotations can be generated using a number of approaches and available software tools. This unit describes methods for genome annotation and a number of software tools commonly used in gene annotation.

  17. Annotation and visualization of endogenous retroviral sequences using the Distributed Annotation System (DAS) and eBioX

    PubMed Central

    Martínez Barrio, Álvaro; Lagercrantz, Erik; Sperber, Göran O; Blomberg, Jonas; Bongcam-Rudloff, Erik

    2009-01-01

    Background: The Distributed Annotation System (DAS) is a widely used network protocol for sharing biological information. The distributed aspects of the protocol enable the use of various reference and annotation servers for connecting biological sequence data to pertinent annotations in order to depict an integrated view of the data for the final user. Results: An annotation server has been devised to provide information about the endogenous retroviruses detected and annotated by a specialized in silico tool called RetroTector. We describe the procedure to implement the DAS 1.5 protocol commands necessary for constructing the DAS annotation server. We use our server to exemplify those steps. Data distribution is kept separate from visualization, which is carried out by eBioX, an easy-to-use open source program incorporating multiple bioinformatics utilities. Some well-characterized endogenous retroviruses are shown in two different DAS clients. A rapid analysis of areas free from retroviral insertions could be facilitated by our annotations. Conclusion: The DAS protocol has been shown to be advantageous in the distribution of endogenous retrovirus data. The distributed nature of the protocol is also found to aid in combining annotation and visualization along a genome in order to enhance the understanding of ERV contribution to its evolution. Reference and annotation servers are conjointly used by eBioX to provide visualization of ERV annotations as well as other data sources. Our DAS data source can be found in the central public DAS service repository, , or at . PMID:19534743

  18. Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

    SciTech Connect

    Leung, Elo; Huang, Amy; Cadag, Eithon; Montana, Aldrin; Soliman, Jan Lorenz; Zhou, Carol L. Ecale

    2016-01-20

    In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein fasta data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

  19. Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

    DOE PAGES

    Leung, Elo; Huang, Amy; Cadag, Eithon; ...

    2016-01-20

    In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein fasta data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

  20. MannDB: A microbial annotation database for protein characterization

    SciTech Connect

    Zhou, C; Lam, M; Smith, J; Zemla, A; Dyer, M; Kuczmarski, T; Vitalis, E; Slezak, T

    2006-05-19

    MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO. MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high

  1. Semantic Annotation of Mutable Data

    PubMed Central

    Morris, Robert A.; Dou, Lei; Hanken, James; Kelly, Maureen; Lowery, David B.; Ludäscher, Bertram; Macklin, James A.; Morris, Paul J.

    2013-01-01

    Electronic annotation of scientific data is very similar to annotation of documents. Both types of annotation amplify the original object, add related knowledge to it, and dispute or support assertions in it. In each case, annotation is a framework for discourse about the original object, and, in each case, an annotation needs to clearly identify its scope and its own terminology. However, electronic annotation of data differs from annotation of documents: the content of the annotations, including expectations and supporting evidence, is more often shared among members of networks. Any consequent actions taken by the holders of the annotated data could be shared as well. But even those current annotation systems that admit data as their subject often make it difficult or impossible to annotate at fine-enough granularity to use the results in this way for data quality control. We address these kinds of issues by offering simple extensions to an existing annotation ontology and describe how the results support an interest-based distribution of annotations. We are using the result to design and deploy a platform that supports annotation services overlaid on networks of distributed data, with particular application to data quality control. Our initial instance supports a set of natural science collection metadata services. An important application is the support for data quality control and provision of missing data. A previous proof of concept demonstrated such use based on data annotations modeled with XML-Schema. PMID:24223697

  2. Rapid multiplex PCR and real-time TaqMan PCR assays for detection of Salmonella enterica and the highly virulent serovars Choleraesuis and Paratyphi C.

    PubMed

    Woods, David F; Reen, F Jerry; Gilroy, Deirdre; Buckley, Jim; Frye, Jonathan G; Boyd, E Fidelma

    2008-12-01

    Salmonella enterica is a human pathogen with over 2,500 serovars characterized. S. enterica serovars Choleraesuis and Paratyphi C are two globally distributed serovars. We have developed a rapid molecular-typing method to detect serovars Choleraesuis and Paratyphi C in food samples by using a comparative-genomics approach to identify regions unique to each serovar from the sequenced genomes. A Salmonella-specific primer pair based on oriC was designed as an internal control to establish accuracy, sensitivity, and reproducibility. Serovar-specific primer sets based on regions of difference between serovars Choleraesuis and Paratyphi C were designed for real-time PCR assays. Three primer sets were used to screen a collection of over 100 Salmonella strains, and both serovars Choleraesuis and Paratyphi C gave unique amplification patterns. To develop the technique for practical use, its sensitivity for detection of Salmonella spp. in a food matrix was determined by spiking experiments. The technique was also adapted for a real-time PCR rapid-detection assay for both serovars Choleraesuis and Paratyphi C that complements the current procedures for Salmonella sp. isolation and serotyping.

  3. Rapid identification of Salmonella serovars in feces by specific detection of virulence genes, invA and spvC, by an enrichment broth culture-multiplex PCR combination assay.

    PubMed Central

    Chiu, C H; Ou, J T

    1996-01-01

    In order to make a rapid and definite diagnosis of Salmonella enteritis in children, an enrichment broth culture-multiplex PCR combination assay was devised to identify Salmonella serovars directly from fecal samples. Two pairs of oligonucleotide primers were prepared according to the sequences of the chromosomal invA and plasmid spvC genes. PCR with these two primers would produce either one amplicon (from the invA gene) or two amplicons (from the invA and spvC genes), depending on whether or not the Salmonella bacteria contained a virulence plasmid. The fecal sample was diluted 10- to 20-fold into gram-negative enrichment broth and incubated to eliminate inhibitory compounds and also to allow selective enrichment of the bacteria. One or two amplicons were obtained, the expected result if Salmonella bacteria were present. The detection limit of this PCR was about 200 bacteria per reaction mixture. The primers were specific, as no amplification products were obtained with 18 species and 22 isolates of non-Salmonella bacteria tested which could be present in the feces or cause contamination. In contrast, when 23 commonly seen Salmonella serovars (38 isolates) were tested, all were shown to carry the invA gene and seven concomitantly harbored the spvC gene of the virulence plasmid. This assay was applied to the diagnosis of Salmonella enteritis in 57 children who were suffering from mucoid and/or bloody diarrhea. Of the 57 children, 38 were PCR positive and 22 were culture positive. There were two culture-positive samples that were not detected by PCR. Thus, this PCR assay showed an efficiency of 95% (38 of 40), which is much higher than the 60% (24 of 40) by culture alone. Not only is this method more sensitive, rapid, and efficient but it will cause only an incremental increase in the cost of stool processing, since enrichment cultivation of fecal samples from diarrheal patients using gram-negative enrichment broth is a routine practice for identification in many
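
    The read-out logic and the reported percentages in this abstract can be illustrated with a short sketch. The function names and result strings below are hypothetical, and the treatment of an spvC band without invA as "inconclusive" is an assumption, since the abstract only describes the one-amplicon (invA) and two-amplicon (invA and spvC) outcomes. The counts are used exactly as reported in the abstract.

```python
def interpret_amplicons(inv_a: bool, spv_c: bool) -> str:
    """Interpret a multiplex PCR result from the presence of each amplicon."""
    if inv_a and spv_c:
        return "Salmonella detected, virulence plasmid present"
    if inv_a:
        return "Salmonella detected, no virulence plasmid"
    if spv_c:
        # Assumption: this outcome is not described in the abstract.
        return "inconclusive"
    return "Salmonella not detected"


def detection_rate(detected: int, total_positive: int) -> float:
    """Percentage of all positive samples that a method detected."""
    return round(100.0 * detected / total_positive, 1)


# Figures as reported in the abstract: 40 positive samples in total.
pcr_rate = detection_rate(38, 40)      # PCR: 38 of 40
culture_rate = detection_rate(24, 40)  # culture alone: 24 of 40
print(pcr_rate, culture_rate)
```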

  4. Rapid, high-throughput identification of anthrax-causing and emetic Bacillus cereus group genome assemblies using BTyper, a computational tool for virulence-based classification of Bacillus cereus group isolates using nucleotide sequencing data.

    PubMed

    Carroll, Laura M; Kovac, Jasna; Miller, Rachel A; Wiedmann, Martin

    2017-06-16

    The Bacillus cereus group comprises nine species, several of which are pathogenic. Differentiating between isolates that may cause disease and those that do not is a matter of public health and economic importance, but can be particularly challenging due to the high genomic similarity of the group. To this end, we have developed BTyper, a computational tool that employs a combination of (i) virulence gene-based typing, (ii) multi-locus sequence typing (MLST), (iii) panC clade typing, and (iv) rpoB allelic typing to rapidly classify B. cereus group isolates using nucleotide sequencing data. BTyper was applied to a set of 662 B. cereus group genome assemblies to (i) identify anthrax-associated genes in non-B. anthracis members of the B. cereus group, and (ii) identify assemblies from B. cereus group strains with emetic potential. With BTyper, anthrax toxin genes cya, lef and pagA were detected in 8 genomes classified in NCBI as B. cereus that clustered into two distinct groups using k-medoids clustering, while B. anthracis poly-γ-D-glutamate capsule biosynthesis genes capABCDE or hyaluronic acid capsule gene hasA were detected in an additional 16 assemblies classified as either B. cereus or B. thuringiensis isolated from clinical, environmental, and food sources. Emetic toxin genes cesABCD were detected in 24 assemblies belonging to panC clades III and VI that had been isolated from food, clinical, and environmental settings. The command line version of BTyper is available at https://github.com/lmc297/BTyper. In addition, BMiner, a companion application for analyzing multiple BTyper output files in aggregate, can be found at https://github.com/lmc297/BMiner. Importance: Bacillus cereus is a foodborne pathogen that is estimated to cause tens of thousands of illnesses each year in the United States alone. Even with molecular methods, it can be difficult to distinguish non-pathogenic B. cereus group isolates from their pathogenic counterparts, including the human pathogen B

  5. Rapid, High-Throughput Identification of Anthrax-Causing and Emetic Bacillus cereus Group Genome Assemblies via BTyper, a Computational Tool for Virulence-Based Classification of Bacillus cereus Group Isolates by Using Nucleotide Sequencing Data

    PubMed Central

    Carroll, Laura M.; Miller, Rachel A.; Wiedmann, Martin

    2017-01-01

    ABSTRACT The Bacillus cereus group comprises nine species, several of which are pathogenic. Differentiating between isolates that may cause disease and those that do not is a matter of public health and economic importance, but it can be particularly challenging due to the high genomic similarity within the group. To this end, we have developed BTyper, a computational tool that employs a combination of (i) virulence gene-based typing, (ii) multilocus sequence typing (MLST), (iii) panC clade typing, and (iv) rpoB allelic typing to rapidly classify B. cereus group isolates using nucleotide sequencing data. BTyper was applied to a set of 662 B. cereus group genome assemblies to (i) identify anthrax-associated genes in non-B. anthracis members of the B. cereus group, and (ii) identify assemblies from B. cereus group strains with emetic potential. With BTyper, the anthrax toxin genes cya, lef, and pagA were detected in 8 genomes classified by the NCBI as B. cereus that clustered into two distinct groups using k-medoids clustering, while either the B. anthracis poly-γ-d-glutamate capsule biosynthesis genes capABCDE or the hyaluronic acid capsule hasA gene was detected in an additional 16 assemblies classified as either B. cereus or Bacillus thuringiensis isolated from clinical, environmental, and food sources. The emetic toxin genes cesABCD were detected in 24 assemblies belonging to panC clades III and VI that had been isolated from food, clinical, and environmental settings. The command line version of BTyper is available at https://github.com/lmc297/BTyper. In addition, BMiner, a companion application for analyzing multiple BTyper output files in aggregate, can be found at https://github.com/lmc297/BMiner. IMPORTANCE Bacillus cereus is a foodborne pathogen that is estimated to cause tens of thousands of illnesses each year in the United States alone. Even with molecular methods, it can be difficult to distinguish nonpathogenic B. cereus group isolates from their

  6. Algal functional annotation tool

    SciTech Connect

    2012-07-12

    Abstract BACKGROUND: Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. DESCRIPTION: The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG

  7. Human Genome Annotation

    NASA Astrophysics Data System (ADS)

    Gerstein, Mark

    A central problem for 21st century science is annotating the human genome and making this annotation useful for the interpretation of personal genomes. My talk will focus on annotating the 99% of the genome that does not code for canonical genes, concentrating on intergenic features such as structural variants (SVs), pseudogenes (protein fossils), binding sites, and novel transcribed RNAs (ncRNAs). In particular, I will describe how we identify regulatory sites and variable blocks (SVs) based on processing next-generation sequencing experiments. I will further explain how we cluster together groups of sites to create larger annotations. Next, I will discuss a comprehensive pseudogene identification pipeline, which has enabled us to identify >10K pseudogenes in the genome and analyze their distribution with respect to age, protein family, and chromosomal location. Throughout, I will try to introduce some of the computational algorithms and approaches that are required for genome annotation. Much of this work has been carried out in the framework of the ENCODE, modENCODE, and 1000 genomes projects.

  8. Evaluating Computational Gene Ontology Annotations.

    PubMed

    Škunca, Nives; Roberts, Richard J; Steffen, Martin

    2017-01-01

    Two avenues to understanding gene function are complementary and often overlapping: experimental work and computational prediction. While experimental annotation generally produces high-quality annotations, it is low throughput. Conversely, computational annotations have broad coverage, but the quality of annotations may be variable, and therefore evaluating the quality of computational annotations is a critical concern. In this chapter, we provide an overview of strategies to evaluate the quality of computational annotations. First, we discuss why evaluating quality in this setting is not trivial. We highlight the various issues that threaten to bias the evaluation of computational annotations, most of which stem from the incompleteness of biological databases. Second, we discuss solutions that address these issues, for example, targeted selection of new experimental annotations and leveraging the existing experimental annotations.

  9. The GATO gene annotation tool for research laboratories.

    PubMed

    Fujita, A; Massirer, K B; Durham, A M; Ferreira, C E; Sogayar, M C

    2005-11-01

    Large-scale genome projects have generated a rapidly increasing number of DNA sequences. Therefore, development of computational methods to rapidly analyze these sequences is essential for progress in genomic research. Here we present an automatic annotation system for preliminary analysis of DNA sequences. The gene annotation tool (GATO) is a Bioinformatics pipeline designed to facilitate routine functional annotation and easy access to annotated genes. It was designed in view of the frequent need of genomic researchers to access data pertaining to a common set of genes. In the GATO system, annotation is generated by querying some of the Web-accessible resources and the information is stored in a local database, which keeps a record of all previous annotation results. GATO may be accessed from everywhere through the internet or may be run locally if a large number of sequences are going to be annotated. It is implemented in PHP and Perl and may be run on any suitable Web server. Usually, installation and application of annotation systems require experience and are time consuming, but GATO is simple and practical, allowing anyone with basic skills in informatics to access it without any special training. GATO can be downloaded at [http://mariwork.iq.usp.br/gato/]. Minimum computer free space required is 2 MB.

  10. Correction of the Caulobacter crescentus NA1000 genome annotation.

    PubMed

    Ely, Bert; Scott, LaTia Etheredge

    2014-01-01

    Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small, but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome using the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content to the location of protein coding regions could be used to verify the annotation of any genome that has a GC content that is greater than 60%.
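
    The third codon position GC content mentioned in this abstract can be illustrated with a minimal Python sketch. This is a simplified whole-sequence calculation, not the sliding-window GC frame plot drawn by Artemis and not the authors' procedure; the function names gc3_content and best_frame are hypothetical.

```python
def gc3_content(seq: str, frame: int = 0) -> float:
    """Fraction of G or C at the third codon position, reading from `frame` (0, 1 or 2)."""
    seq = seq.upper()
    # Third positions of complete codons only: indices frame+2, frame+5, ...
    thirds = seq[frame + 2::3]
    if not thirds:
        return 0.0
    return sum(base in "GC" for base in thirds) / len(thirds)


def best_frame(seq: str):
    """Return the forward frame with the highest GC3 value and all three scores."""
    scores = [gc3_content(seq, frame) for frame in range(3)]
    return max(range(3), key=scores.__getitem__), scores


if __name__ == "__main__":
    toy = "ATGGCCAAGCTG"  # made-up sequence, codons ATG GCC AAG CTG
    print(best_frame(toy))
```

    In a GC-rich genome, the frame whose third positions are most GC-rich tends to be the coding frame, which is the kind of signal the abstract describes comparing against annotated gene locations.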

  11. Ontology-Based Prediction and Prioritization of Gene Functional Annotations.

    PubMed

    Chicco, Davide; Masseroli, Marco

    2016-01-01

    Genes and their protein products are essential molecular units of a living organism. The knowledge of their functions is key for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. The association of a gene or protein with its functions, described by controlled terms of biomolecular terminologies or ontologies, is named gene functional annotation. Very many and valuable gene annotations expressed through terminologies and ontologies are available. Nevertheless, they might include some erroneous information, since only a subset of annotations are reviewed by curators. Furthermore, they are incomplete by definition, given the rapidly evolving pace of biomolecular knowledge. In this scenario, computational methods that are able to quicken the annotation curation process and reliably suggest new annotations are very important. Here, we first propose a computational pipeline that uses different semantic and machine learning methods to predict novel ontology-based gene functional annotations; then, we introduce a new semantic prioritization rule to categorize the predicted annotations by their likelihood of being correct. Our tests and validations proved the effectiveness of our pipeline and prioritization of predicted annotations, by selecting as most likely manifold predicted annotations that were later confirmed.

  12. Algal functional annotation tool

    SciTech Connect

    Lopez, D.; Casero, D.; Cokus, S. J.; Merchant, S. S.; Pellegrini, M.

    2012-07-01

    The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps and batch gene identifier conversion.
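
    Term enrichment for a gene list is commonly scored with a one-sided hypergeometric test. The abstract does not state which statistic the tool uses, so the sketch below is an assumption for illustration only; the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits: int, list_size: int, term_size: int, genome_size: int) -> float:
    """P(X >= hits) for a gene list drawn without replacement.

    genome_size: number of annotated genes in the genome
    term_size:   genes in the genome that carry the functional term
    list_size:   genes in the user's list
    hits:        genes in the list that carry the term
    """
    upper = min(list_size, term_size)
    total = comb(genome_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(genome_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 genes, 3 carry the term, a list of 3 genes contains all 3.
    print(hypergeom_enrichment_p(3, 3, 3, 10))
```

    A small p-value suggests the term is over-represented in the list relative to the genome background; in practice such values would also need correction for testing many terms.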

  13. Re-Annotator: Annotation Pipeline for Microarray Probe Sequences.

    PubMed

    Arloth, Janine; Bader, Daniel M; Röh, Simone; Altmann, Andre

    2015-01-01

    Microarray technologies are established approaches for high throughput gene expression, methylation and genotyping analysis. An accurate mapping of the array probes is essential to generate reliable biological findings. However, manufacturers of the microarray platforms typically provide incomplete and outdated annotation tables, which often rely on older genome and transcriptome versions that differ substantially from up-to-date sequence databases. Here, we present the Re-Annotator, a re-annotation pipeline for microarray probe sequences. It is primarily designed for gene expression microarrays but can also be adapted to other types of microarrays. The Re-Annotator uses a custom-built mRNA reference database to identify the positions of gene expression array probe sequences. We applied Re-Annotator to the Illumina HumanHT-12 v4 microarray platform and found that about one quarter (25%) of the probes differed from the manufacturer's annotation. In further computational experiments on experimental gene expression data, we compared Re-Annotator to another probe re-annotation tool, ReMOAT, and found that Re-Annotator provided an improved re-annotation of microarray probes. A thorough re-annotation of probe information is crucial to any microarray analysis. The Re-Annotator pipeline is freely available at http://sourceforge.net/projects/reannotator along with re-annotated files for Illumina microarrays HumanHT-12 v3/v4 and MouseRef-8 v2.

  14. Injectors and Annotations

    NASA Technical Reports Server (NTRS)

    Filman, Robert E.

    2004-01-01

    In a previous paper, we presented the Object Infrastructure Framework. The goal of that system is to simplify the creation of distributed applications. The primary claim of that work is that non-functional 'ilities' could be achieved by controlling and manipulating the communications between components, thereby simplifying the development of distributed systems. A secondary element of that paper is to argue for extending the conventional distributed objects model in two important ways: 1) The ability to insert injectors (filters, wrappers) into the communication path between components; 2) The ability to annotate communications with additional information, and to propagate these annotations through an application. Here we express the descriptions of that paper.

  15. Modeling loosely annotated images using both given and imagined annotations

    NASA Astrophysics Data System (ADS)

    Tang, Hong; Boujemaa, Nozha; Chen, Yunhao; Deng, Lei

    2011-12-01

    In this paper, we present an approach to learn latent semantic analysis models from loosely annotated images for automatic image annotation and indexing. The given annotation in training images is loose due to: (1) ambiguous correspondences between visual features and annotated keywords; (2) incomplete lists of annotated keywords. The second reason motivates us to enrich the incomplete annotation in a simple way before learning a topic model. In particular, some "imagined" keywords are poured into the incomplete annotation through measuring similarity between keywords in terms of their co-occurrence. Then, both given and imagined annotations are employed to learn probabilistic topic models for automatically annotating new images. We conduct experiments on two image databases (i.e., Corel and ESP) coupled with their loose annotations, and compare the proposed method with state-of-the-art discrete annotation methods. The proposed method improves word-driven probabilistic latent semantic analysis (PLSA-words) up to a comparable performance with the best discrete annotation method, while a merit of PLSA-words is still kept, i.e., a wider semantic range.
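
    The idea of enriching an incomplete keyword list through co-occurrence can be sketched as follows. The abstract does not specify the similarity measure or how many keywords are added, so the Jaccard similarity, the top_k parameter, and all function names here are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_similarity(annotations):
    """Build a Jaccard similarity over keywords from their co-occurrence in annotations."""
    count = Counter()  # number of images carrying each keyword
    pair = Counter()   # number of images carrying both keywords of a pair
    for words in annotations:
        unique = set(words)
        count.update(unique)
        pair.update(frozenset(p) for p in combinations(sorted(unique), 2))

    def sim(a, b):
        both = pair[frozenset((a, b))]
        either = count[a] + count[b] - both
        return both / either if either else 0.0

    return sim, set(count)


def imagine_keywords(given, annotations, top_k=1):
    """Suggest up to top_k extra keywords most similar to any given keyword."""
    sim, vocab = cooccurrence_similarity(annotations)
    given = set(given)
    scores = {w: max((sim(w, g) for g in given), default=0.0) for w in vocab - given}
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return [w for w in ranked[:top_k] if scores[w] > 0]


if __name__ == "__main__":
    training = [["sky", "cloud"], ["sky", "cloud", "sun"], ["sea", "sand"]]
    print(imagine_keywords(["sky"], training, top_k=2))
```

    The enriched keyword lists would then feed the topic-model training step, which is outside the scope of this sketch.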

  16. Cheating. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Wildemuth, Barbara M., Comp.

    This 89-item, annotated bibliography was compiled to provide access to research and discussions of cheating and, specifically, cheating on tests. It is not limited to any educational level, nor is it confined to any specific curriculum area. Two data bases were searched by computer, and a library search was conducted. A computer search of the…

  17. Automated Microbial Genome Annotation

    SciTech Connect

    Land, Miriam

    2009-05-29

    Miriam Land of the DOE Joint Genome Institute at Oak Ridge National Laboratory gives a talk on the current state and future challenges of moving toward automated microbial genome annotation at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  18. Annotation: The Savant Syndrome

    ERIC Educational Resources Information Center

    Heaton, Pamela; Wallace, Gregory L.

    2004-01-01

    Background: Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. Methods: The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area.…

  19. Ghostwriting: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Simmons, Donald B.

    Drawn from communication journals, historical and news magazines, business and industrial magazines, political science and world affairs journals, general interest periodicals, and literary and political review magazines, the approximately 90 entries in this annotated bibliography discuss ghostwriting as practiced through the ages and reveal the…

  20. Annotated Bibliography. First Edition.

    ERIC Educational Resources Information Center

    Haring, Norris G.

    An annotated bibliography which presents approximately 300 references from 1951 to 1973 on the education of severely/profoundly handicapped persons. Citations are grouped alphabetically by author's name within the following categories: characteristics and treatment, gross motor development, sensory and motor development, physical therapy for the…

  2. Mechanisms and evolution of virulence in oomycetes.

    PubMed

    Jiang, Rays H Y; Tyler, Brett M

    2012-01-01

    Many destructive diseases of plants and animals are caused by oomycetes, a group of eukaryotic pathogens important to agricultural, ornamental, and natural ecosystems. Understanding the mechanisms underlying oomycete virulence and the genomic processes by which those mechanisms rapidly evolve is essential to developing effective long-term control measures for oomycete diseases. Several common mechanisms underlying oomycete virulence, including protein toxins and cell-entering effectors, have emerged from comparing oomycetes with different genome characteristics, parasitic lifestyles, and host ranges. Oomycete genomes display a strongly bipartite organization in which conserved housekeeping genes are concentrated in syntenic gene-rich blocks, whereas virulence genes are dispersed into highly dynamic, repeat-rich regions. There is also evidence that key virulence genes have been acquired by horizontal transfer from other eukaryotic and prokaryotic species.

  3. Transcriptional profiles of virulent and precocious strains of Eimeria tenella at sporozoite stage; novel biological insight into attenuated asexual development.

    PubMed

    Matsubayashi, Makoto; Kawahara, Fumiya; Hatta, Takeshi; Yamagishi, Junya; Miyoshi, Takeharu; Anisuzzaman; Sasai, Kazumi; Isobe, Takashi; Kita, Kiyoshi; Tsuji, Naotoshi

    2016-06-01

    Chicken coccidiosis is caused by Eimeria spp., particularly Eimeria tenella, and is characterized by watery or hemorrhagic diarrhea, resulting in death in severe cases. Precociously attenuated live vaccines are widely used to control the disease, and these are produced by serially passaging virulent strains through chickens, and the collection of oocysts from feces at progressively earlier time points during oocyst shedding. Sporozoites of the precocious strain rapidly enter the intestinal mucosa, and their subsequent asexual development reduces their growth. However, there have been few detailed genetic or transcriptional analyses of the strains. Here, we used RNA sequencing to gain novel biological insight into the pathogenicity and precocity of E. tenella. We compared the differential transcription in the sporozoites (the initial stage of endogenous development) of virulent and precocious strains by mapping the sequence reads onto the draft genome of E. tenella. About 90% of the reads from both strains were mapped to the genome, and 16,630 estimated transcript regions were identified. Using Gene Ontology slim and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses and the annotation of the estimated transcripts with Blastx, we found that some genes involved in carbohydrate metabolism were expressed two-fold more strongly in the virulent strain than in the precocious strain. Characteristically, genes related to proteins secreted from the apical complex, proteases, cell attachment proteins, mitochondrial proteins, and transporters were most strongly upregulated in the virulent strain. Interestingly, the expression of genes associated with cell survival, development, or proliferation was strongly upregulated in the precocious strain. These findings suggest that virulent strains survive long before invasion and invade actively/successfully into host cells, whereas proliferative processes appear to affect precocity.

  4. Evolution of virulence when transmission occurs before disease.

    PubMed

    Osnas, Erik E; Dobson, Andrew P

    2010-08-23

    Most models of virulence evolution assume that transmission and virulence are constant during an infection. In many viral (HIV and influenza), bacterial (TB) and prion (BSE and CWD) systems, disease-induced mortality occurs long after the host becomes infectious. Therefore, we constructed a model with two infected classes that differ in transmission rate and virulence in order to understand how the evolutionarily stable strategy (ESS) depends on the relative difference in transmission and virulence between classes, on the transition rate between classes and on the recovery rate from the second class. We find that ESS virulence decreases when expressed early in the infection or when transmission occurs late in an infection. When virulence occurred relatively equally in each class and there was disease recovery, ESS virulence increased with increased transition rate. In contrast, ESS virulence first increased and then decreased with transition rate when there was little virulence early in the infection and a rapid recovery rate. This model predicts that ESS virulence is highly dependent on the timing of transmission and pathology after infection; thus, pathogen evolution may either increase or decrease virulence after emergence in a new host.
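
    To see why the timing of transmission and virulence matters in a model with two infected classes, one can compute a basic reproduction number for such a system. The abstract does not give the model equations, so the parameterization below (constant rates, an early class that either dies or moves to a late class, a late class that either dies or recovers) and every parameter name are assumptions; this is a sketch, not the authors' model.

```python
def r0_two_stage(beta1, alpha1, tau, beta2, alpha2, gamma, mu):
    """Basic reproduction number for an infection with an early and a late class.

    Hypothetical parameters (all rates per unit time):
      beta1, beta2    transmission rates of the early and late class,
                      assumed already scaled by susceptible host density
      alpha1, alpha2  disease-induced mortality (virulence) in each class
      tau             transition rate from the early to the late class
      gamma           recovery rate from the late class
      mu              background host mortality
    """
    leave_early = mu + alpha1 + tau    # total rate of leaving the early class
    leave_late = mu + alpha2 + gamma   # total rate of leaving the late class
    early = beta1 / leave_early        # expected transmissions while in the early class
    reach_late = tau / leave_early     # probability of surviving to the late class
    late = beta2 / leave_late          # expected transmissions while in the late class
    return early + reach_late * late


if __name__ == "__main__":
    print(r0_two_stage(beta1=1, alpha1=0, tau=1, beta2=2, alpha2=1, gamma=0, mu=1))
    print(r0_two_stage(beta1=1, alpha1=0, tau=1, beta2=2, alpha2=3, gamma=0, mu=1))
```

    Under these assumptions, raising late-class virulence only reduces the late contribution, whereas raising early-class virulence also lowers the chance of ever reaching the late class, which is one way to read the abstract's point about early versus late expression.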

  5. Apollo: a sequence annotation editor.

    PubMed

    Lewis, S E; Searle, S M J; Harris, N; Gibson, M; Lyer, V; Richter, J; Wiel, C; Bayraktaroglu, L; Birney, E; Crosby, M A; Kaminker, J S; Matthews, B B; Prochnik, S E; Smithy, C D; Tupy, J L; Rubin, G M; Misra, S; Mungall, C J; Clamp, M E

    2002-01-01

    The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects.

  6. Single-Dose Mucosal Immunization with a Candidate Universal Influenza Vaccine Provides Rapid Protection from Virulent H5N1, H3N2 and H1N1 Viruses

    PubMed Central

    Price, Graeme E.; Soboleski, Mark R.; Lo, Chia-Yun; Misplon, Julia A.; Quirion, Mary R.; Houser, Katherine V.; Pearce, Melissa B.; Pappas, Claudia; Tumpey, Terrence M.; Epstein, Suzanne L.

    2010-01-01

    Background: The sudden emergence of novel influenza viruses is a global public health concern. Conventional influenza vaccines targeting the highly variable surface glycoproteins hemagglutinin and neuraminidase must antigenically match the emerging strain to be effective. In contrast, “universal” vaccines targeting conserved viral components could be used regardless of viral strain or subtype. Previous approaches to universal vaccination have required protracted multi-dose immunizations. Here we evaluate a single dose universal vaccine strategy using recombinant adenoviruses (rAd) expressing the conserved influenza virus antigens matrix 2 and nucleoprotein. Methodology/Principal Findings: In BALB/c mice, administration of rAd via the intranasal route was superior to intramuscular immunization for induction of mucosal responses and for protection against highly virulent H1N1, H3N2, or H5N1 influenza virus challenge. Mucosally vaccinated mice not only survived, but had little morbidity and reduced lung virus titers. Protection was observed as early as 2 weeks post-immunization, and lasted at least 10 months, as did antibodies and lung T cells with activated phenotypes. Virus-specific IgA correlated with but was not essential for protection, as demonstrated in studies with IgA-deficient animals. Conclusion/Significance: Mucosal administration of NP and M2-expressing rAd vectors provided rapid and lasting protection from influenza viruses in a subtype-independent manner. Such vaccines could be used in the interval between emergence of a new virus strain and availability of strain-matched vaccines against it. This strikingly effective single-dose vaccination thus represents a candidate off-the-shelf vaccine for emergency use during an influenza pandemic. PMID:20976273

  7. MitoFish and MitoAnnotator: A Mitochondrial Genome Database of Fish with an Accurate and Automatic Annotation Pipeline

    PubMed Central

    Iwasaki, Wataru; Fukunaga, Tsukasa; Isagozawa, Ryota; Yamada, Koichiro; Maeda, Yasunobu; Satoh, Takashi P.; Sado, Tetsuya; Mabuchi, Kohji; Takeshima, Hirohiko; Miya, Masaki; Nishida, Mutsumi

    2013-01-01

    Mitofish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface. PMID:23955518

  8. MitoFish and MitoAnnotator: a mitochondrial genome database of fish with an accurate and automatic annotation pipeline.

    PubMed

    Iwasaki, Wataru; Fukunaga, Tsukasa; Isagozawa, Ryota; Yamada, Koichiro; Maeda, Yasunobu; Satoh, Takashi P; Sado, Tetsuya; Mabuchi, Kohji; Takeshima, Hirohiko; Miya, Masaki; Nishida, Mutsumi

    2013-11-01

    Mitofish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately 5 min; thus, it is readily applicable to data sets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/ (last accessed August 28, 2013); all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface.

  9. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes

    DOE PAGES

    Brettin, Thomas; Davis, James J.; Disz, Terry; ...

    2015-02-10

    The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.

  10. GSV Annotated Bibliography

    SciTech Connect

    Roberts, Randy S.; Pope, Paul A.; Jiang, Ming; Trucano, Timothy G.; Aragon, Cecilia R.; Ni, Kevin; Wei, Thomas; Chilton, Lawrence K.; Bakel, Alan

    2011-06-14

    The following annotated bibliography was developed as part of the Geospatial Algorithm Verification and Validation (GSV) project for the Simulation, Algorithms and Modeling program of NA-22. Verification and Validation of geospatial image analysis algorithms covers a wide range of technologies. Papers in the bibliography are thus organized into the following five topic areas: Image processing and analysis, usability and validation of geospatial image analysis algorithms, image distance measures, scene modeling and image rendering, and transportation simulation models.

  11. Rapid Identification of Bacterial Virulence Factors

    DTIC Science & Technology

    2014-04-15

    peptidoglycan . In an effort to discover new drugs to treat tuberculosis, Anthony, 2011 chose alanine racemase as the target of a drug discovery effort in M...monocytogenes this enzyme acetylates the peptidoglycan layer. This modification confers resistance to various types of antimicrobial compounds that

  12. Code generation through annotation of macromolecular structure data.

    PubMed

    Biggs, J; Pu, C; Bourne, P

    1997-01-01

    The maintenance of software which uses a rapidly evolving data annotation scheme is time consuming and expensive. At the same time without current software the annotation scheme itself becomes limited and is less likely to be widely adopted. A solution to this problem has been developed for the macromolecular Crystallographic Information File (mmCIF) annotation scheme. The approach could be generalized for a variety of annotation schemes used or proposed for molecular biology data. mmCIF provides a highly structured and complete annotation for describing NMR and X-ray crystallographic data and the resulting macromolecular structures. This annotation is maintained in the mmCIF dictionary which currently contains over 3,200 terms. A major challenge is to maintain code for converting between mmCIF and Protein Data Bank (PDB) annotations while both continue to evolve. The solution has been to define a simple domain specific language (DSL) which is added to the extensive annotation already found in the mmCIF dictionary. The DSL calls specific mapping modules for each category of data item in the mmCIF dictionary. Adding or changing the mapping between PDB and mmCIF items of data is straightforward since data categories (and hence mapping modules) correspond to elements of macromolecular structure familiar to the experimentalist. Each time a change is made to the macromolecular annotation the appropriate change is made to the easily located and modifiable mapping modules. A code generator is then called which reads the mapping modules and creates a new executable for performing the data conversion. In this way code is easily kept current by individuals with limited programming skill, but who have an understanding of macromolecular structure and details of the annotation scheme. Most important, the conversion process becomes part of the global dictionary and is not open to a variety of interpretations by different research groups writing code based on dictionary contents

  13. Pseudomonas aeruginosa Virulence and Pathogenesis Issues

    USDA-ARS's Scientific Manuscript database

    Regulation of gene expression can occur through cell-cell communication or quorum sensing (QS) via the production of small molecules called autoinducers. QS is known to control expression of a number of virulence factors. Another form of gene regulation which allows the bacteria to rapidly adapt t...

  14. Functional Annotation Analytics of Rhodopseudomonas palustris Genomes.

    PubMed

    Simmons, Shaneka S; Isokpehi, Raphael D; Brown, Shyretha D; McAllister, Donee L; Hall, Charnia C; McDuffy, Wanaki M; Medley, Tamara L; Udensi, Udensi K; Rajnarayanan, Rajendram V; Ayensu, Wellington K; Cohly, Hari H P

    2011-01-01

    Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacterium, has been extensively investigated for its metabolic versatility, including its ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of metabolic versatility and ecological lifestyles of the bacterial species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production.
Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R
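
The presence/absence comparison described in this record can be sketched as follows. This is a simplified illustration, not the authors' clustering procedure: it only groups annotations that share an identical presence pattern across the six strains. The Pfam identifiers in the toy data are placeholders, and the function name `group_by_presence` is hypothetical.

```python
from collections import defaultdict

# Strain names as listed in the abstract.
STRAINS = ["BisA53", "BisB18", "BisB5", "CGA009", "HaA2", "TIE-1"]


def group_by_presence(annotations):
    """Group annotation ids by their presence/absence pattern across STRAINS.

    annotations: dict mapping an annotation id to the set of strains carrying it.
    Returns a dict mapping a tuple of booleans (one per strain) to a list of ids.
    """
    groups = defaultdict(list)
    for annotation_id, strains in annotations.items():
        pattern = tuple(strain in strains for strain in STRAINS)
        groups[pattern].append(annotation_id)
    return dict(groups)


# Toy data with placeholder ids (not real Pfam accessions).
example = {
    "PF_A": {"CGA009", "BisB5"},   # uniquely shared by two strains
    "PF_B": set(STRAINS),          # present in all six genomes
    "PF_C": {"CGA009", "BisB5"},
    "PF_D": {"TIE-1"},             # strain-specific
}
groups = group_by_presence(example)
```

A pattern that is true for only CGA009 and BisB5 corresponds to the kind of uniquely shared annotation the abstract reports for the water/glycerol transport genes.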

  15. Functional Annotation Analytics of Rhodopseudomonas palustris Genomes

    PubMed Central

    Simmons, Shaneka S.; Isokpehi, Raphael D.; Brown, Shyretha D.; McAllister, Donee L.; Hall, Charnia C.; McDuffy, Wanaki M.; Medley, Tamara L.; Udensi, Udensi K.; Rajnarayanan, Rajendram V.; Ayensu, Wellington K.; Cohly, Hari H.P.

    2011-01-01

    Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacterium, has been extensively investigated for its metabolic versatility, including its ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of metabolic versatility and ecological lifestyles of the bacterial species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production.
Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R

  16. Annotation and visualization of endogenous retroviral sequences using the Distributed Annotation System (DAS) and eBioX.

    PubMed

    Barrio, Alvaro Martínez; Lagercrantz, Erik; Sperber, Göran O; Blomberg, Jonas; Bongcam-Rudloff, Erik

    2009-06-16

    The Distributed Annotation System (DAS) is a widely used network protocol for sharing biological information. The distributed aspects of the protocol enable the use of various reference and annotation servers for connecting biological sequence data to pertinent annotations in order to depict an integrated view of the data for the final user. An annotation server has been devised to provide information about the endogenous retroviruses detected and annotated by a specialized in silico tool called RetroTector. We describe the procedure to implement the DAS 1.5 protocol commands necessary for constructing the DAS annotation server. We use our server to exemplify those steps. Data distribution is kept separate from visualization, which is carried out by eBioX, an easy-to-use open source program incorporating multiple bioinformatics utilities. Some well characterized endogenous retroviruses are shown in two different DAS clients. A rapid analysis of areas free from retroviral insertions could be facilitated by our annotations. The DAS protocol has been shown to be advantageous in the distribution of endogenous retrovirus data. The distributed nature of the protocol is also found to aid in combining annotation and visualization along a genome in order to enhance the understanding of ERV contribution to its evolution. Reference and annotation servers are conjointly used by eBioX to provide visualization of ERV annotations as well as other data sources. Our DAS data source can be found in the central public DAS service repository, http://www.dasregistry.org, or at http://loka.bmc.uu.se/das/sources.

  17. Fuzzy Emotional Semantic Analysis and Automated Annotation of Scene Images

    PubMed Central

    Cao, Jianfang; Chen, Lichao

    2015-01-01

    With the advances in electronic and imaging techniques, the production of digital images has rapidly increased, and the extraction and automated annotation of emotional semantics implied by images have become issues that must be urgently addressed. To better simulate human subjectivity and ambiguity for understanding scene images, the current study proposes an emotional semantic annotation method for scene images based on fuzzy set theory. A fuzzy membership degree was calculated to describe the emotional degree of a scene image and was implemented using the Adaboost algorithm and a back-propagation (BP) neural network. The automated annotation method was trained and tested using scene images from the SUN Database. The annotation results were then compared with those based on artificial annotation. Our method showed an annotation accuracy rate of 91.2% for basic emotional values and 82.4% after extended emotional values were added, which correspond to increases of 5.5% and 8.9%, respectively, compared with the results from using a single BP neural network algorithm. Furthermore, the retrieval accuracy rate based on our method reached approximately 89%. This study attempts to lay a solid foundation for the automated emotional semantic annotation of more types of images and therefore is of practical significance. PMID:25838818

  18. The Ensembl gene annotation system

    PubMed Central

    Aken, Bronwen L.; Ayling, Sarah; Barrell, Daniel; Clarke, Laura; Curwen, Valery; Fairley, Susan; Fernandez Banet, Julio; Billis, Konstantinos; García Girón, Carlos; Hourlier, Thibaut; Howe, Kevin; Kähäri, Andreas; Kokocinski, Felix; Martin, Fergal J.; Murphy, Daniel N.; Nag, Rishi; Ruffier, Magali; Schuster, Michael; Tang, Y. Amy; Vogel, Jan-Hinnerk; White, Simon; Zadissa, Amonida; Flicek, Paul

    2016-01-01

    The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html PMID:27337980

  19. Phylogenetic molecular function annotation

    NASA Astrophysics Data System (ADS)

    Engelhardt, Barbara E.; Jordan, Michael I.; Repo, Susanna T.; Brenner, Steven E.

    2009-07-01

    It is now easier to discover thousands of protein sequences in a new microbial genome than it is to biochemically characterize the specific activity of a single protein of unknown function. The molecular functions of protein sequences have typically been predicted using homology-based computational methods, which rely on the principle that homologous proteins share a similar function. However, some protein families include groups of proteins with different molecular functions. A phylogenetic approach for predicting molecular function (sometimes called "phylogenomics") is an effective means to predict protein molecular function. These methods incorporate functional evidence from all members of a family that have functional characterizations using the evolutionary history of the protein family to make robust predictions for the uncharacterized proteins. However, they are often difficult to apply on a genome-wide scale because of the time-consuming step of reconstructing the phylogenies of each protein to be annotated. Our automated approach for function annotation using phylogeny, the SIFTER (Statistical Inference of Function Through Evolutionary Relationships) methodology, uses a statistical graphical model to compute the probabilities of molecular functions for unannotated proteins. Our benchmark tests showed that SIFTER provides accurate functional predictions on various protein families, outperforming other available methods.

  20. Annotation: the savant syndrome.

    PubMed

    Heaton, Pamela; Wallace, Gregory L

    2004-07-01

    Whilst interest has focused on the origin and nature of the savant syndrome for over a century, it is only within the past two decades that empirical group studies have been carried out. The following annotation briefly reviews relevant research and also attempts to address outstanding issues in this research area. Traditionally, savants have been defined as intellectually impaired individuals who nevertheless display exceptional skills within specific domains. However, within the extant literature, cases of savants with developmental and other clinical disorders, but with average intellectual functioning, are increasingly reported. We thus propose that focus should diverge away from IQ scores to encompass discrepancies between functional impairments and unexpected skills. It has long been observed that savant skills are more prevalent in individuals with autism than in those with other disorders. Therefore, in this annotation we seek to explore the parameters of the savant syndrome by considering these skills within the context of neuropsychological accounts of autism. A striking finding amongst those with savant skills, but without the diagnosis of autism, is the presence of cognitive features and behavioural traits associated with the disorder. We thus conclude that autism (or autistic traits) and savant skills are inextricably linked and we should therefore look to autism in our quest to solve the puzzle of the savant syndrome. Copyright 2004 Association for Child Psychology and Psychiatry

  1. Visualizing GO Annotations.

    PubMed

    Supek, Fran; Škunca, Nives

    2017-01-01

    Contemporary techniques in biology produce readouts for large numbers of genes simultaneously, the typical example being differential gene expression measurements. Moreover, those genes are often richly annotated using GO terms that describe gene function and that can be used to summarize the results of the genome-scale experiments. However, making sense of such GO enrichment analyses may be challenging. For instance, overrepresented GO functions in a set of differentially expressed genes are typically output as a flat list, a format not adequate to capture the complexities of the hierarchical structure of the GO annotation labels. In this chapter, we survey various methods to visualize large, difficult-to-interpret lists of GO terms. We catalog their availability (Web-based or standalone), the main principles they employ in summarizing large lists of GO terms, and the visualization styles they support. These brief commentaries on each software are intended as a helpful inventory, rather than comprehensive descriptions of the underlying algorithms. Instead, we show examples of their use and suggest that the choice of an appropriate visualization tool may be crucial to the utility of GO in biological discovery.

  2. Managing Development Projects: A Selected, Annotated Bibliography. Annotated Bibliography #5.

    ERIC Educational Resources Information Center

    Chuenyane, Zachariah; And Others

    A selected annotated bibliography on managing development projects, intended for rural development practitioners, highlights items that outline some pressing issues and concerns confronting those involved in rural development in general and rural project management in particular. A section of annotated entries lists 21 publications on project…

  3. Visualizing Genomic Annotations with the UCSC Genome Browser.

    PubMed

    Hung, Jui-Hung; Weng, Zhiping

    2016-11-01

    Genomic data and annotations are rapidly accumulating in databases such as the UCSC Genome Browser, NCBI, and Ensembl. Given the massive scale of these genomic databases, it is important to be able to easily retrieve known data and annotations of a specified genomic locus. For example, for a newly identified cis-regulatory element bound by a transcription factor, questions that immediately come to mind include whether the element is near a transcriptional start site and, if so, the name of the corresponding gene, and whether the histones or DNA at the locus are modified. The UCSC Genome Browser organizes data and annotations (called tracks) around the reference sequences or draft assemblies of many eukaryotic genomes and presents them using a powerful web-based graphical interface. This protocol describes how to use the UCSC Genome Browser to visualize selected tracks at specified genomic regions, download the data and annotations for further analysis, and retrieve multiple sequence alignments and their conservation scores.

  4. Structural and functional annotation of the porcine immunome

    PubMed Central

    2013-01-01

    evolution as compared to 4.1% across the entire genome. Conclusions This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response. PMID:23676093

  5. Is the Juice Worth the Squeeze? Costs and Benefits of Multiple Human Annotators for Clinical Text De-identification

    PubMed Central

    Carrell, D. S.; Cronkite, D. J.; Malin, B. A.; Aberdeen, J. S.; Hirschman, L.

    2016-01-01

    Summary Background Clinical text contains valuable information but must be de-identified before it can be used for secondary purposes. Accurate annotation of personally identifiable information (PII) is essential to the development of automated de-identification systems and to manual redaction of PII. Yet the accuracy of annotations may vary considerably across individual annotators and annotation is costly. As such, the marginal benefit of incorporating additional annotators has not been well characterized. Objectives This study models the costs and benefits of incorporating increasing numbers of independent human annotators to identify the instances of PII in a corpus. We used a corpus with gold standard annotations to evaluate the performance of teams of annotators of increasing size. Methods Four annotators independently identified PII in a 100-document corpus consisting of randomly selected clinical notes from Family Practice clinics in a large integrated health care system. These annotations were pooled and validated to generate a gold standard corpus for evaluation. Results Recall rates for all PII types ranged from 0.90 to 0.98 for individual annotators to 0.998 to 1.0 for teams of three, when measured against the gold standard. Median cost per PII instance discovered during corpus annotation ranged from $0.71 for an individual annotator to $377 for annotations discovered only by a fourth annotator. Conclusions Incorporating a second annotator into a PII annotation process reduces unredacted PII and improves the quality of annotations to 0.99 recall, yielding clear benefit at reasonable cost; the cost advantages of annotation teams larger than two diminish rapidly. PMID:27405787
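
    The team-recall figures in this record can be related to a simple calculation. The sketch below assumes that a team's annotations are the union of its members' annotations and that recall is measured against a gold standard set. The abstract does not specify the authors' exact evaluation procedure, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def recall(found, gold):
    """Fraction of gold-standard PII instances that were found."""
    return len(found & gold) / len(gold)


def team_recalls(annotator_sets, gold, team_size):
    """Recall of the pooled (union) annotations for every team of a given size."""
    results = []
    for team in combinations(annotator_sets, team_size):
        pooled = set().union(*team)
        results.append(recall(pooled, gold))
    return results


# Toy data: ten gold-standard PII instances, each annotator misses a different one.
gold = set(range(10))
annotators = [
    set(range(9)),          # misses instance 9
    set(range(1, 10)),      # misses instance 0
    set(range(10)) - {8},   # misses instance 8
]
single = team_recalls(annotators, gold, 1)
pairs = team_recalls(annotators, gold, 2)
```

    In this toy case every individual reaches a recall of 0.9 and every pair reaches 1.0, which mirrors the pattern the abstract reports: most of the gain comes from adding a second annotator.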

  6. Is the Juice Worth the Squeeze? Costs and Benefits of Multiple Human Annotators for Clinical Text De-identification.

    PubMed

    Carrell, David S; Cronkite, David J; Malin, Bradley A; Aberdeen, John S; Hirschman, Lynette

    2016-08-05

    Clinical text contains valuable information but must be de-identified before it can be used for secondary purposes. Accurate annotation of personally identifiable information (PII) is essential to the development of automated de-identification systems and to manual redaction of PII. Yet the accuracy of annotations may vary considerably across individual annotators and annotation is costly. As such, the marginal benefit of incorporating additional annotators has not been well characterized. This study models the costs and benefits of incorporating increasing numbers of independent human annotators to identify the instances of PII in a corpus. We used a corpus with gold standard annotations to evaluate the performance of teams of annotators of increasing size. Four annotators independently identified PII in a 100-document corpus consisting of randomly selected clinical notes from Family Practice clinics in a large integrated health care system. These annotations were pooled and validated to generate a gold standard corpus for evaluation. Recall rates for all PII types ranged from 0.90 to 0.98 for individual annotators to 0.998 to 1.0 for teams of three, when measured against the gold standard. Median cost per PII instance discovered during corpus annotation ranged from $0.71 for an individual annotator to $377 for annotations discovered only by a fourth annotator. Incorporating a second annotator into a PII annotation process reduces unredacted PII and improves the quality of annotations to 0.99 recall, yielding clear benefit at reasonable cost; the cost advantages of annotation teams larger than two diminish rapidly.

  7. Widowed Persons Service: Selected Annotated Bibliography.

    ERIC Educational Resources Information Center

    Bressler, Dawn, Comp.; And Others

    This document presents an annotated bibliography of books and articles on topics relevant to widowhood. These annotations are included: (1) 21 annotations on the grief process; (2) 11 annotations on personal observations about widowhood; (3) 16 annotations on practical problems surrounding widowhood, including legal and financial problems and job…

  8. Virulence factors of the Mycobacterium tuberculosis complex

    PubMed Central

    Forrellad, Marina A.; Klepp, Laura I.; Gioffré, Andrea; Sabio y García, Julia; Morbidoni, Hector R.; Santangelo, María de la Paz; Cataldi, Angel A.; Bigi, Fabiana

    2013-01-01

    The Mycobacterium tuberculosis complex (MTBC) consists of closely related species that cause tuberculosis in both humans and animals. This illness remains one of the leading causes of morbidity and mortality throughout the world. The mycobacteria enter the host by air, and, once in the lungs, are phagocytosed by macrophages. This may lead to the rapid elimination of the bacillus or to the triggering of an active tuberculosis infection. A large number of different virulence factors have evolved in MTBC members as a response to the host immune reaction. The aim of this review is to describe the bacterial genes/proteins that are essential for the virulence of MTBC species, and that have been demonstrated in an in vivo model of infection. Knowledge of MTBC virulence factors is essential for the development of new vaccines and drugs to help manage the disease toward an increasingly more tuberculosis-free world. PMID:23076359

  9. Morphosyntactic Annotation of CHILDES Transcripts

    ERIC Educational Resources Information Center

    Sagae, Kenji; Davis, Eric; Lavie, Alon; MacWhinney, Brian; Wintner, Shuly

    2010-01-01

    Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database…

  10. An Annotated Bibliography on Children.

    ERIC Educational Resources Information Center

    Bureau of Libraries and Educational Technology (DHEW/OE), Washington, DC.

    This annotated bibliography is a highly selective list of materials published in the last five years on the major problems, trends, methodologies and achievements in the field of child development. It contains annotated references to approximately 500 books, periodicals, technical reports, government documents, legislative materials, professional…

  11. Drug Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Mathieson, Moira B.

    This bibliography consists of a total of 215 entries dealing with drug education, including curriculum guides, and drawn from documents in the ERIC system. There are two sections, the first containing 130 annotated citations of documents and journal articles, and the second containing 85 citations of journal articles without annotations, but with…

  12. Genetic Modulation of c-di-GMP Turnover Affects Multiple Virulence Traits and Bacterial Virulence in Rice Pathogen Dickeya zeae

    PubMed Central

    Chen, Yufan; Lv, Mingfa; Liao, Lisheng; Gu, Yanfang; Liang, Zhibin; Shi, Zurong; Liu, Shiyin; Zhou, Jianuan; Zhang, Lianhui

    2016-01-01

    The frequent outbreaks of rice foot rot disease caused by Dickeya zeae have become a significant concern in rice planting regions and countries, but the regulatory mechanisms that govern the virulence of this important pathogen remain vague. Given that the second messenger cyclic di-GMP (c-di-GMP) is associated with modulation of various virulence-related traits in various microorganisms, here we set out to investigate the role of the genes encoding c-di-GMP metabolism in the regulation of the bacterial physiology and virulence by constructing in-frame deletion mutants for all of the annotated c-di-GMP turnover genes in D. zeae strain EC1. Phenotype analyses identified individual mutants showing altered production of exoenzymes and phytotoxins, biofilm formation and bacterial motilities. The results provide useful clues and a valuable toolkit for further characterization and dissection of the regulatory complex that modulates the pathogenesis and persistence of this important bacterial pathogen. PMID:27855163

  13. Morphosyntactic annotation of CHILDES transcripts*

    PubMed Central

    SAGAE, KENJI; DAVIS, ERIC; LAVIE, ALON; MACWHINNEY, BRIAN; WINTNER, SHULY

    2014-01-01

    Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. We have produced a corpus of over 18,800 utterances (approximately 65,000 words) with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for the English CHILDES data, which we used to automatically annotate the remainder of the English section of CHILDES. We have also extended the parser to Spanish, and are currently working on supporting more languages. The parser and the manually and automatically annotated data are freely available for research purposes. PMID:20334720

  14. Towards Automated Annotation of Benthic Survey Images: Variability of Human Experts and Operational Modes of Automation

    PubMed Central

    Beijbom, Oscar; Edmunds, Peter J.; Roelfsema, Chris; Smith, Jennifer; Kline, David I.; Neal, Benjamin P.; Dunlap, Matthew J.; Moriarty, Vincent; Fan, Tung-Yung; Tan, Chih-Jui; Chan, Stephen; Treibitz, Tali; Gamst, Anthony; Mitchell, B. Greg; Kriegman, David

    2015-01-01

    Global climate change and other anthropogenic stressors have heightened the need to rapidly characterize ecological changes in marine benthic communities across large scales. Digital photography enables rapid collection of survey images to meet this need, but the subsequent image annotation is typically a time consuming, manual task. We investigated the feasibility of using automated point-annotation to expedite cover estimation of the 17 dominant benthic categories from survey-images captured at four Pacific coral reefs. Inter- and intra- annotator variability among six human experts was quantified and compared to semi- and fully- automated annotation methods, which are made available at coralnet.ucsd.edu. Our results indicate high expert agreement for identification of coral genera, but lower agreement for algal functional groups, in particular between turf algae and crustose coralline algae. This indicates the need for unequivocal definitions of algal groups, careful training of multiple annotators, and enhanced imaging technology. Semi-automated annotation, where 50% of the annotation decisions were performed automatically, yielded cover estimate errors comparable to those of the human experts. Furthermore, fully-automated annotation yielded rapid, unbiased cover estimates but with increased variance. These results show that automated annotation can increase spatial coverage and decrease time and financial outlay for image-based reef surveys. PMID:26154157
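
    The cover estimation that this record refers to can be sketched as a simple proportion: the cover of a category is the fraction of annotated points assigned to it. The code below is an illustrative sketch only. The category labels and function names are hypothetical, and the comparison metric (mean absolute difference in cover) is an assumption rather than the error measure used in the study.

```python
from collections import Counter


def cover_estimate(point_labels):
    """Fraction of annotated points assigned to each benthic category."""
    counts = Counter(point_labels)
    total = len(point_labels)
    return {label: n / total for label, n in counts.items()}


def mean_abs_cover_difference(labels_a, labels_b):
    """Mean absolute difference in cover between two annotations of the same points."""
    cover_a = cover_estimate(labels_a)
    cover_b = cover_estimate(labels_b)
    categories = set(cover_a) | set(cover_b)
    diffs = [abs(cover_a.get(c, 0.0) - cover_b.get(c, 0.0)) for c in categories]
    return sum(diffs) / len(diffs)


# Toy example: four points labelled by an expert and by an automated method.
expert = ["coral", "coral", "turf", "cca"]
automated = ["coral", "coral", "cca", "cca"]
expert_cover = cover_estimate(expert)
difference = mean_abs_cover_difference(expert, automated)
```

    The toy disagreement is placed between turf algae and crustose coralline algae, the pair of categories for which the abstract reports the lowest expert agreement.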

  15. Rapid and simple method by combining FTA(™) card DNA extraction with the adaptation of a two set multiplex PCR for simultaneous detection of non-O157 shiga-toxin producing Escherichia coli strains and virulence genes from food samples.

    PubMed

    Kim, Sun Ae; Park, Si Hong; Lee, Sang In; Ricke, Steven C

    2017-09-27

    The aim of this research was to optimize two multiplex polymerase chain reaction (PCR) assays that could simultaneously detect six non-O157 Shiga toxin-producing Escherichia coli (STEC) as well as the three virulence genes. We also investigated the potential of combining the FTA(™) card-based DNA extraction with the multiplex PCR assays. Two multiplex PCR assays were optimized using six primer pairs for each non-O157 STEC serogroup and three primer pairs for virulence genes respectively. Each STEC strain-specific primer pair amplified only its expected product: 155, 238, 321, 438, 587, and 750 bp for O26, O45, O103, O111, O121, and O145 respectively. Three virulence genes were successfully multiplexed: 375 bp for eae, 655 bp for stx1, and 477 bp for stx2. When two multiplex PCR assays were validated with ground beef samples, distinctive bands were also successfully produced. Since the two multiplex PCR examined here can be conducted under the same PCR conditions, the six non-O157 STEC and their virulence genes could be concurrently detected with one run on the thermocycler. In addition, all bands clearly appeared to be amplified by FTA card DNA extraction in multiplex PCR assay from the ground beef sample, suggesting that an FTA card could be a viable sampling approach for rapid and simple DNA extraction to reduce time and labor and therefore may have practical use for the food industry. This article is protected by copyright. All rights reserved.
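
    The amplicon sizes listed in this record make it possible to sketch how bands from the two assays might be read. The sizes below come from the abstract; the matching tolerance, the function name `call_bands`, and the example band sizes are assumptions for illustration, not part of the published protocol.

```python
# Expected amplicon sizes (bp) from the abstract, one table per multiplex assay.
SEROGROUP_AMPLICONS = {
    155: "O26", 238: "O45", 321: "O103",
    438: "O111", 587: "O121", 750: "O145",
}
VIRULENCE_AMPLICONS = {375: "eae", 655: "stx1", 477: "stx2"}


def call_bands(observed_bp, amplicons, tolerance_bp=15):
    """Match observed gel band sizes to expected targets within a tolerance.

    tolerance_bp is an assumed sizing error, chosen to be well below the
    smallest gap between expected amplicons in either assay.
    """
    calls = []
    for band in observed_bp:
        for size, target in amplicons.items():
            if abs(band - size) <= tolerance_bp:
                calls.append(target)
    return calls


# Hypothetical band sizes read from a gel.
serogroup_calls = call_bands([440, 152], SEROGROUP_AMPLICONS)
virulence_calls = call_bands([480, 650], VIRULENCE_AMPLICONS)
```

    Keeping one table per assay reflects the abstract's design of two separate multiplex reactions run under the same PCR conditions.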

  16. Draft Genome Sequence of the Virulent Clostridium chauvoei Reference Strain JF4335

    PubMed Central

    Calderon-Copete, Sandra P.; Frey, Joachim

    2013-01-01

    Clostridium chauvoei is the etiological agent of blackleg, a disease of cattle and sheep with high mortality rates, causing severe economic losses in livestock production. Here, we report the draft genome sequence of the virulent C. chauvoei strain JF4335 (2.8 Mbp and 28% G+C content) and the annotation of the genome. PMID:23950118

  17. Draft Genome Sequence of the Virulent Clostridium chauvoei Reference Strain JF4335.

    PubMed

    Falquet, Laurent; Calderon-Copete, Sandra P; Frey, Joachim

    2013-08-15

    Clostridium chauvoei is the etiological agent of blackleg, a disease of cattle and sheep with high mortality rates, causing severe economic losses in livestock production. Here, we report the draft genome sequence of the virulent C. chauvoei strain JF4335 (2.8 Mbp and 28% G+C content) and the annotation of the genome.

  18. Cold Shock Exoribonuclease R(VacB) is involved in Aeromonas hydrophila Virulence

    EPA Science Inventory

    In this study, we cloned and sequenced a virulence-associated gene (vacB) from a clinical isolate SSU of Aeromonas hydrophila. We identified this gene based on our recently annotated genome sequence of the environmental isolate ATCC 7966T of A. hydrophila and the vacB gene of Shi...

  19. Cold Shock Exoribonuclease R(VacB) is involved in Aeromonas hydrophila Virulence

    EPA Science Inventory

    In this study, we cloned and sequenced a virulence-associated gene (vacB) from a clinical isolate SSU of Aeromonas hydrophila. We identified this gene based on our recently annotated genome sequence of the environmental isolate ATCC 7966T of A. hydrophila and the vacB gene of Shi...

  20. Gene Ontology annotations and resources.

    PubMed

    Blake, J A; Dolan, M; Drabkin, H; Hill, D P; Li, Ni; Sitnikov, D; Bridges, S; Burgess, S; Buza, T; McCarthy, F; Peddinti, D; Pillai, L; Carbon, S; Dietze, H; Ireland, A; Lewis, S E; Mungall, C J; Gaudet, P; Chrisholm, R L; Fey, P; Kibbe, W A; Basu, S; Siegele, D A; McIntosh, B K; Renfro, D P; Zweifel, A E; Hu, J C; Brown, N H; Tweedie, S; Alam-Faruque, Y; Apweiler, R; Auchinchloss, A; Axelsen, K; Bely, B; Blatter, M -C; Bonilla, C; Bouguerleret, L; Boutet, E; Breuza, L; Bridge, A; Chan, W M; Chavali, G; Coudert, E; Dimmer, E; Estreicher, A; Famiglietti, L; Feuermann, M; Gos, A; Gruaz-Gumowski, N; Hieta, R; Hinz, C; Hulo, C; Huntley, R; James, J; Jungo, F; Keller, G; Laiho, K; Legge, D; Lemercier, P; Lieberherr, D; Magrane, M; Martin, M J; Masson, P; Mutowo-Muellenet, P; O'Donovan, C; Pedruzzi, I; Pichler, K; Poggioli, D; Porras Millán, P; Poux, S; Rivoire, C; Roechert, B; Sawford, T; Schneider, M; Stutz, A; Sundaram, S; Tognolli, M; Xenarios, I; Foulgar, R; Lomax, J; Roncaglia, P; Khodiyar, V K; Lovering, R C; Talmud, P J; Chibucos, M; Giglio, M Gwinn; Chang, H -Y; Hunter, S; McAnulla, C; Mitchell, A; Sangrador, A; Stephan, R; Harris, M A; Oliver, S G; Rutherford, K; Wood, V; Bahler, J; Lock, A; Kersey, P J; McDowall, D M; Staines, D M; Dwinell, M; Shimoyama, M; Laulederkind, S; Hayman, T; Wang, S -J; Petri, V; Lowry, T; D'Eustachio, P; Matthews, L; Balakrishnan, R; Binkley, G; Cherry, J M; Costanzo, M C; Dwight, S S; Engel, S R; Fisk, D G; Hitz, B C; Hong, E L; Karra, K; Miyasato, S R; Nash, R S; Park, J; Skrzypek, M S; Weng, S; Wong, E D; Berardini, T Z; Huala, E; Mi, H; Thomas, P D; Chan, J; Kishore, R; Sternberg, P; Van Auken, K; Howe, D; Westerfield, M

    2013-01-01

    The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources.

  1. Metagenomic gene annotation by a homology-independent approach

    SciTech Connect

    Froula, Jeff; Zhang, Tao; Salmeen, Annette; Hess, Matthias; Kerfeld, Cheryl A.; Wang, Zhong; Du, Changbin

    2011-06-02

    Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly-related or structural homologs. Furthermore, homology searches with millions of genes are very computationally intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMER but with comparable accuracy, at 94.5% and 99.9% accuracy, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families. As approximately 50% of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulase candidates and 8,196 novel carbonic anhydrase candidates. In summary, we expect rhModeller to dramatically increase the speed and quality of metagenomic gene annotation.

  2. Cryptosporidium Pathogenicity and Virulence

    PubMed Central

    Bouzid, Maha; Chalmers, Rachel M.; Tyler, Kevin M.

    2013-01-01

    Cryptosporidium is a protozoan parasite of medical and veterinary importance that causes gastroenteritis in a variety of vertebrate hosts. Several studies have reported different degrees of pathogenicity and virulence among Cryptosporidium species and isolates of the same species as well as evidence of variation in host susceptibility to infection. The identification and validation of Cryptosporidium virulence factors have been hindered by the renowned difficulties pertaining to the in vitro culture and genetic manipulation of this parasite. Nevertheless, substantial progress has been made in identifying putative virulence factors for Cryptosporidium. This progress has been accelerated since the publication of the Cryptosporidium parvum and C. hominis genomes, with the characterization of over 25 putative virulence factors identified by using a variety of immunological and molecular techniques and which are proposed to be involved in aspects of host-pathogen interactions from adhesion and locomotion to invasion and proliferation. Progress has also been made in the contribution of host factors that are associated with variations in both the severity and risk of infection. Here we provide a review comprised of the current state of knowledge on Cryptosporidium infectivity, pathogenesis, and transmissibility in light of our contemporary understanding of microbial virulence. PMID:23297262

  3. Parasitoid wasp virulence

    PubMed Central

    Mortimer, Nathan T

    2013-01-01

    In nature, larvae of the fruit fly Drosophila melanogaster are commonly infected by parasitoid wasps. Following infection, flies mount an immune response termed cellular encapsulation in which fly immune cells form a multilayered capsule that covers and kills the wasp egg. Parasitoids have thus evolved virulence factors to suppress cellular encapsulation. To uncover the molecular mechanisms underlying the antiwasp response, we and others have begun identifying and functionally characterizing these virulence factors. Our recent work on the Drosophila parasitoid Ganaspis sp.1 has demonstrated that a virulence factor encoding a SERCA-type calcium pump plays an important role in Ganaspis sp.1 virulence. This venom SERCA antagonizes fly immune cell calcium signaling and thereby prevents the activation of the encapsulation response. In this way, the study of wasp virulence factors has revealed a novel aspect of fly immunity, namely a role for calcium signaling in fly immune cell activation, which is conserved with human immunity, again illustrating the marked conservation between fly and mammalian immune responses. Our findings demonstrate that the cellular encapsulation response can serve as a model of immune cell function and can also provide valuable insight into basic cell biological processes. PMID:24088661

  4. Microcomputers and the Media Specialist: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Miller, Inabeth

    An overview of the literature reflecting the rapid development of interest in microcomputer use in education since 1978 is followed by an annotated bibliography which lists books, articles, and ERIC documents in nine categories. The first section includes materials of general interest--historical background, guides to using computers in the…

  5. Growth rate, transmission mode and virulence in human pathogens

    PubMed Central

    Cornwallis, Charlie K.; Buckling, Angus; West, Stuart A.

    2017-01-01

    The harm that pathogens cause to hosts during infection, termed virulence, varies across species from negligible to a high likelihood of rapid death. Classic theory for the evolution of virulence is based on a trade-off between pathogen growth, transmission and host survival, which predicts that higher within-host growth causes increased transmission and higher virulence. However, using data from 61 human pathogens, we found the opposite correlation to the expected positive correlation between pathogen growth rate and virulence. We found that (i) slower growing pathogens are significantly more virulent than faster growing pathogens, (ii) inhaled pathogens and pathogens that infect via skin wounds are significantly more virulent than pathogens that are ingested, but (iii) there is no correlation between symptoms of infection that aid transmission (such as diarrhoea and coughing) and virulence. Overall, our results emphasize how virulence can be influenced by mechanistic life-history details, especially transmission mode, that determine how parasites infect and exploit their hosts. This article is part of the themed issue ‘Opening the black box: re-examining the ecology and evolution of parasite transmission’. PMID:28289261

  6. Annotated Bibliography on Religious Development.

    ERIC Educational Resources Information Center

    Bucher, Anton A.; Reich, K. Helmut

    1991-01-01

    Presents an annotated bibliography on religious development that covers the areas of psychology and religion, measurement of religiousness, religious development during the life cycle, religious experiences, conversion, religion and morality, and images of God. (Author/BB)

  7. Patient Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Simmons, Jeannette

    Topics included in this annotated bibliography on patient education are (1) background on development of patient education programs, (2) patient education interventions, (3) references for health professionals, and (4) research and evaluation in patient education. (TA)

  8. Hopi Linguistics: An Annotated Bibliography

    ERIC Educational Resources Information Center

    Seaman, P. David

    1977-01-01

    This is a preliminary research-oriented bibliography on the Hopi language. All known items, through mid-1976, are included, with an annotation for each item sketching its nature and/or possible value. (Author/RM)

  9. Butternut (Juglans cinerea) annotated bibliography.

    Treesearch

    M.E. Ostry; M.J. Moore; S.A.N. Worrall

    2003-01-01

    An annotated bibliography of the major literature related to butternut (Juglans cinerea) from 1890 to 2002. Includes 230 citations and a topical index. Topics include diseases, conservation, genetics, insect pests, silvics, nut production, propagation, silviculture, and utilization.

  10. Publication Production: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Firman, Anthony H.

    1994-01-01

    Offers brief annotations of 52 articles and papers on document production (from the Society for Technical Communication's journal and proceedings) on 9 topics: information processing, document design, using color, typography, tables, illustrations, photography, printing and binding, and production management. (SR)

  12. NCBI prokaryotic genome annotation pipeline.

    PubMed

    Tatusova, Tatiana; DiCuccio, Michael; Badretdin, Azat; Chetvernin, Vyacheslav; Nawrocki, Eric P; Zaslavsky, Leonid; Lomsadze, Alexandre; Pruitt, Kim D; Borodovsky, Mark; Ostell, James

    2016-08-19

    Recent technological advances have opened unprecedented opportunities for large-scale sequencing and analysis of populations of pathogenic species in disease outbreaks, as well as for large-scale diversity studies aimed at expanding our knowledge across the whole domain of prokaryotes. To meet the challenge of timely interpretation of structure, function and meaning of this vast genetic information, a comprehensive approach to automatic genome annotation is critically needed. In collaboration with Georgia Tech, NCBI has developed a new approach to genome annotation that combines alignment based methods with methods of predicting protein-coding and RNA genes and other functional elements directly from sequence. A new gene finding tool, GeneMarkS+, uses the combined evidence of protein and RNA placement by homology as an initial map of annotation to generate and modify ab initio gene predictions across the whole genome. Thus, the new NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence. The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy. For additional information on PGAP see https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ and the NCBI Handbook, https://www.ncbi.nlm.nih.gov/books/NBK174280/. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  13. Gene Ontology Annotations and Resources

    PubMed Central

    2013-01-01

    The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new ‘phylogenetic annotation’ process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself has increased by using automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources. PMID:23161678

  14. Quality of computationally inferred gene ontology annotations.

    PubMed

    Skunca, Nives; Altenhoff, Adrian; Dessimoz, Christophe

    2012-05-01

    Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon, an important outcome given that >98% of all annotations are inferred without direct curation.
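
    The release-comparison idea in this abstract can be illustrated with a minimal sketch. It assumes a simplified reading: an electronic annotation from an older release counts as confirmed if a later release carries the same protein-term pair with experimental evidence, and as rejected if the later release has dropped it. The function name, the set-based inputs, and the exact definitions of reliability and coverage used here are illustrative assumptions, not the authors' published formulas, and the identifiers are toy placeholders.

```python
from typing import Optional, Set, Tuple

# An annotation is modelled as a (protein_id, go_term) pair (assumed representation).
Annotation = Tuple[str, str]


def release_metrics(
    electronic_old: Set[Annotation],
    experimental_new: Set[Annotation],
    removed_new: Set[Annotation],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (reliability, coverage) under the simplified definitions above."""
    confirmed = electronic_old & experimental_new  # later backed by experiment
    rejected = electronic_old & removed_new        # later dropped from the release
    judged = len(confirmed) + len(rejected)
    # Reliability: share of judged electronic annotations that were confirmed.
    reliability = len(confirmed) / judged if judged else None
    # Coverage: share of new experimental annotations that were predicted earlier.
    coverage = len(confirmed) / len(experimental_new) if experimental_new else None
    return reliability, coverage


# Toy data, not real UniProt-GOA content.
old_electronic = {("P1", "GO:A"), ("P2", "GO:B"), ("P3", "GO:C"), ("P4", "GO:D")}
new_experimental = {("P1", "GO:A"), ("P2", "GO:B"), ("P5", "GO:E")}
new_removed = {("P3", "GO:C")}

reliability, coverage = release_metrics(old_electronic, new_experimental, new_removed)
print(reliability, coverage)
```

    Annotations that are neither confirmed nor removed (here the pair for P4) are left out of the reliability denominator, since a later release says nothing either way about them.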

  15. AGeS: A Software System for Microbial Genome Sequence Annotation

    PubMed Central

    Kumar, Kamal; Desai, Valmik; Cheng, Li; Khitrov, Maxim; Grover, Deepak; Satya, Ravi Vijaya; Yu, Chenggang; Zavaljevski, Nela; Reifman, Jaques

    2011-01-01

    Background: The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. Methodology: The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions. PMID:21408217

  16. AGeS: a software system for microbial genome sequence annotation.

    PubMed

    Kumar, Kamal; Desai, Valmik; Cheng, Li; Khitrov, Maxim; Grover, Deepak; Satya, Ravi Vijaya; Yu, Chenggang; Zavaljevski, Nela; Reifman, Jaques

    2011-03-07

    The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.

  17. Virulence factors of medically important fungi.

    PubMed Central

    Hogan, L H; Klein, B S; Levitz, S M

    1996-01-01

    Human fungal pathogens have become an increasingly important medical problem with the explosion in the number of immunocompromised patients as a result of cancer, steroid therapy, chemotherapy, and AIDS. Additionally, the globalization of travel and expansion of humankind into previously undisturbed habitats have led to the reemergence of old fungi and new exposure to previously undescribed fungi. Until recently, relatively little was known about virulence factors for the medically important fungi. With the advent of molecular genetics, rapid progress has now been made in understanding the basis of pathogenicity for organisms such as Aspergillus species and Cryptococcus neoformans. The twin technologies of genetic transformation and "knockout" deletion construction allowed for genetic tests of virulence factors in these organisms. Such knowledge will prove invaluable for the rational design of antifungal therapies. Putative virulence factors and attributes are reviewed for Aspergillus species, C. neoformans, the dimorphic fungal pathogens, and others, with a focus upon a molecular genetic approach. Candida species are excluded from coverage, having been the subject of numerous recent reviews. This growing body of knowledge about fungal pathogens and their virulence factors will significantly aid efforts to treat the serious diseases they cause. PMID:8894347

  18. Novel Strategies to Combat Bacterial Virulence

    PubMed Central

    Lynch, S.V.; Wiener-Kronish, J.P.

    2010-01-01

    Purpose of review: Incidences of antimicrobial resistant infections have increased dramatically over the past several decades and are associated with adverse patient outcomes. Alternative approaches to combat infection are critical, and have led to the development of more specific drugs targeted at particular bacterial virulence systems or essential regulatory pathways. The purpose of this review is to highlight the recent developments in anti-bacterial therapy and the novel approaches toward increasing our therapeutic armory against bacterial infection. Recent findings: Although classic antibiotic development is not occurring rapidly, alternative therapeutics that target specific bacterial virulence systems are progressing from the discovery stage through the FDA approval process. Here we review novel antibodies that target specific virulence systems as well as a variety of newly discovered small molecules that block bacterial attachment, communication systems (quorum sensing) or important regulatory processes associated with virulence gene expression. Summary: The success of novel therapeutics could significantly change clinical practice. Furthermore, the complications of collateral damage due to antibiotic administration, e.g. suprainfections or decreased host immunity due to loss of synergistic bacterial communities, may be minimized using therapeutics that specifically target pathogenic behavior. PMID:18787455

  19. An Atlas of annotations of Hydra vulgaris transcriptome.

    PubMed

    Evangelista, Daniela; Tripathi, Kumar Parijat; Guarracino, Mario Rosario

    2016-09-22

    RNA sequencing takes advantage of Next Generation Sequencing (NGS) technologies to analyze RNA transcript counts with excellent accuracy. Interpreting this huge amount of data as biological information is still a key issue, which is why the creation of web resources useful for its analysis is highly desirable. Starting from a previous work, Transcriptator, we present the Atlas of Hydra vulgaris, an extensible web tool in which its complete transcriptome is annotated. In order to provide users with an advantageous resource that includes the whole functionally annotated transcriptome of the Hydra vulgaris water polyp, we implemented the Atlas web tool, which contains 31,988 accessible and downloadable transcripts of this non-reference model organism. Atlas, as a freely available resource, can be considered a valuable tool to rapidly retrieve functional annotation for transcripts differentially expressed in Hydra vulgaris exposed to distinct experimental treatments. WEB RESOURCE URL: http://www-labgtp.na.icar.cnr.it/Atlas .

  20. Transient virulence of emerging pathogens.

    PubMed

    Bolker, Benjamin M; Nanda, Arjun; Shah, Dharmini

    2010-05-06

    Should emerging pathogens be unusually virulent? If so, why? Existing theories of virulence evolution based on a tradeoff between high transmission rates and long infectious periods imply that epidemic growth conditions will select for higher virulence, possibly leading to a transient peak in virulence near the beginning of an epidemic. This transient selection could lead to high virulence in emerging pathogens. Using a simple model of the epidemiological and evolutionary dynamics of emerging pathogens, along with rough estimates of parameters for pathogens such as severe acute respiratory syndrome, West Nile virus and myxomatosis, we estimated the potential magnitude and timing of such transient virulence peaks. Pathogens that are moderately evolvable, highly transmissible, and highly virulent at equilibrium could briefly double their virulence during an epidemic; thus, epidemic-phase selection could contribute significantly to the virulence of emerging pathogens. In order to further assess the potential significance of this mechanism, we bring together data from the literature for the shapes of tradeoff curves for several pathogens (myxomatosis, HIV, and a parasite of Daphnia) and the level of genetic variation for virulence for one (myxomatosis). We discuss the need for better data on tradeoff curves and genetic variance in order to evaluate the plausibility of various scenarios of virulence evolution.
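
    The tradeoff argument in this abstract can be made concrete with a small numerical sketch. It assumes a textbook-style setup rather than the authors' model: transmission rises with virulence as beta(alpha) = c * sqrt(alpha), long-run competition favours the virulence that maximises R0 = beta(alpha) / (alpha + mu), and early epidemic growth favours the virulence that maximises r = beta(alpha) * S0 - (alpha + mu). The functional form and all parameter values (c, mu, S0) are assumptions chosen only for illustration.

```python
import math


def beta(alpha: float, c: float = 4.0) -> float:
    """Transmission rate as a saturating function of virulence (assumed form)."""
    return c * math.sqrt(alpha)


def r0(alpha: float, mu: float = 1.0, c: float = 4.0) -> float:
    """Basic reproduction number: transmission rate times mean infectious period."""
    return beta(alpha, c) / (alpha + mu)


def growth_rate(alpha: float, mu: float = 1.0, c: float = 4.0, s0: float = 1.0) -> float:
    """Initial per-capita growth rate of infections in a susceptible population."""
    return beta(alpha, c) * s0 - (alpha + mu)


# Scan virulence values 0.01 .. 10.00 and pick the best under each criterion.
grid = [i / 100 for i in range(1, 1001)]
alpha_equilibrium = max(grid, key=r0)        # favoured near endemic equilibrium
alpha_epidemic = max(grid, key=growth_rate)  # favoured during epidemic growth

print(alpha_equilibrium, alpha_epidemic)
```

    With these assumed values the growth-rate optimum lies at a higher virulence than the R0 optimum, which is the qualitative pattern the abstract describes as a transient virulence peak early in an epidemic.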

  1. Transient virulence of emerging pathogens

    PubMed Central

    Bolker, Benjamin M.; Nanda, Arjun; Shah, Dharmini

    2010-01-01

    Should emerging pathogens be unusually virulent? If so, why? Existing theories of virulence evolution based on a tradeoff between high transmission rates and long infectious periods imply that epidemic growth conditions will select for higher virulence, possibly leading to a transient peak in virulence near the beginning of an epidemic. This transient selection could lead to high virulence in emerging pathogens. Using a simple model of the epidemiological and evolutionary dynamics of emerging pathogens, along with rough estimates of parameters for pathogens such as severe acute respiratory syndrome, West Nile virus and myxomatosis, we estimated the potential magnitude and timing of such transient virulence peaks. Pathogens that are moderately evolvable, highly transmissible, and highly virulent at equilibrium could briefly double their virulence during an epidemic; thus, epidemic-phase selection could contribute significantly to the virulence of emerging pathogens. In order to further assess the potential significance of this mechanism, we bring together data from the literature for the shapes of tradeoff curves for several pathogens (myxomatosis, HIV, and a parasite of Daphnia) and the level of genetic variation for virulence for one (myxomatosis). We discuss the need for better data on tradeoff curves and genetic variance in order to evaluate the plausibility of various scenarios of virulence evolution. PMID:19864267

  2. Genomic Data and Annotation from the SEED

    DOE Data Explorer

    Fonstein, Michael; Kogan, Yakov; Osterman, Andrei; Overbeek, Ross; Vonstein, Veronika; The Fellowship for Interpretation of Genomes (FIG)

    The SEED Project is a cooperative effort to annotate ever-expanding genomic data so researchers can conduct effective comparative analyses of genomes. Launched in 2003 by the Fellowship for Interpretation of Genomes (FIG), the project is one of several initiatives in ongoing development of data curation systems. SEED is designed to be used by scientists from numerous centers and with varied research objectives. As such, several institutions have since joined FIG in a consortium, including the University of Chicago, DOE’s Argonne National Laboratory (ANL), the University of Illinois at Urbana-Champaign, and others. As one example, ANL has used SEED to develop the National Microbial Pathogen Data Resource. Other agencies and institutions have used the project to discover genome components and clarify gene functions such as metabolism. SEED also has enabled researchers to conduct comparative analyses of closely related genomes and has supported derivation of stoichiometric models to understand metabolic processes. The SEED Project has been extended to support metagenomic samples and concomitant analytical tools. Moreover, the number of genomes being introduced into SEED is growing very rapidly. Building a framework to support this growth while providing highly accurate annotations is centrally important to SEED. The project’s subsystem-based annotation strategy has become the technological foundation for addressing these challenges. (Copied from Appendix 7 of Systems Biology Knowledgebase for a New Era in Biology, A Genomics:GTL Report from the May 2008 Workshop, DOE/SC-0113, Gregurick, S.; Fredrickson, J.K.; Stevens, R., Pub. March 1, 2009.)

  3. Mouse genome annotation by the RefSeq project.

    PubMed

    McGarvey, Kelly M; Goldfarb, Tamara; Cox, Eric; Farrell, Catherine M; Gupta, Tripti; Joardar, Vinita S; Kodali, Vamsi K; Murphy, Michael R; O'Leary, Nuala A; Pujar, Shashikant; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Webb, David; Wright, Mathew W; Murphy, Terence D; Pruitt, Kim D

    2015-10-01

    Complete and accurate annotation of the mouse genome is critical to the advancement of research conducted on this important model organism. The National Center for Biotechnology Information (NCBI) develops and maintains many useful resources to assist the mouse research community. In particular, the reference sequence (RefSeq) database provides high-quality annotation of multiple mouse genome assemblies using a combinatorial approach that leverages computation, manual curation, and collaboration. Implementation of this conservative and rigorous approach, which focuses on representation of only full-length and non-redundant data, produces high-quality annotation products. RefSeq records explicitly link sequences to current knowledge in a timely manner, updating public records regularly and rapidly in response to nomenclature updates, addition of new relevant publications, collaborator discussion, and user feedback. Whole genome re-annotation is also conducted at least every 12-18 months, and often more frequently in response to assembly updates or availability of informative data. This article highlights key features and advantages of RefSeq genome annotation products and presents an overview of NCBI processes to generate these data. Further discussion of NCBI's resources highlights useful features and the best methods for accessing our data.

  4. IMG ER: A System for Microbial Genome Annotation Expert Review and Curation

    SciTech Connect

    Markowitz, Victor M.; Mavromatis, Konstantinos; Ivanova, Natalia N.; Chen, I-Min A.; Chu, Ken; Kyrpides, Nikos C.

    2009-05-25

    A rapidly increasing number of microbial genomes are sequenced by organizations worldwide and are eventually included into various public genome data resources. The quality of the annotations depends largely on the original dataset providers, with erroneous or incomplete annotations often carried over into the public resources and difficult to correct. We have developed an Expert Review (ER) version of the Integrated Microbial Genomes (IMG) system, with the goal of supporting systematic and efficient revision of microbial genome annotations. IMG ER provides tools for the review and curation of annotations of both new and publicly available microbial genomes within IMG's rich integrated genome framework. New genome datasets are included into IMG ER prior to their public release either with their native annotations or with annotations generated by IMG ER's annotation pipeline. IMG ER tools allow addressing annotation problems detected with IMG's comparative analysis tools, such as genes missed by gene prediction pipelines or genes without an associated function. Over the past year, IMG ER was used for improving the annotations of about 150 microbial genomes.

  5. Virulence evolution at the front line of spreading epidemics.

    PubMed

    Griette, Quentin; Raoul, Gaël; Gandon, Sylvain

    2015-11-01

    Understanding and predicting the spatial spread of emerging pathogens is a major challenge for the public health management of infectious diseases. Theoretical epidemiology shows that the speed of an epidemic is governed by the life-history characteristics of the pathogen and its ability to disperse. Rapid evolution of these traits during the invasion may thus affect the speed of epidemics. Here we study the influence of virulence evolution on the spatial spread of an epidemic. At the edge of the invasion front, we show that more virulent and transmissible genotypes are expected to win the competition with other pathogens. Behind the front line, however, more prudent exploitation strategies outcompete virulent pathogens. Crucially, even when the presence of the virulent mutant is limited to the edge of the front, the invasion speed can be dramatically altered by pathogen evolution. We support our analysis with individual-based simulations and we discuss the additional effects of demographic stochasticity taking place at the front line on virulence evolution. We confirm that an increase of virulence can occur at the front, but only if the carrying capacity of the invading pathogen is large enough. These results are discussed in the light of recent empirical studies examining virulence evolution at the edge of spreading epidemics.

  6. Updating annotations with the distributed annotation system and the automated sequence annotation pipeline

    PubMed Central

    Speier, William; Ochs, Michael F.

    2012-01-01

    Summary: The integration between BioDAS ProServer and Automated Sequence Annotation Pipeline (ASAP) provides an interface for querying diverse annotation sources, chaining and linking results, and standardizing the output using the Distributed Annotation System (DAS) protocol. This interface allows pipeline plans in ASAP to be integrated into any system using HTTP and also allows the information returned by ASAP to be included in the DAS registry for use in any DAS-aware system. Three example implementations have been developed: the first accesses TRANSFAC information to automatically create gene sets for the Coordinated Gene Activity in Pattern Sets (CoGAPS) algorithm; the second integrates annotations from multiple array platforms and provides unified annotations in an R environment; and the third wraps the UniProt database for integration with the SPICE DAS client. Availability: Source code for ASAP 2.7 and the DAS 1.6 interface is available under the GNU public license. Proserver 2.20 is free software available from SourceForge. Scripts for installation and configuration on Linux are provided at our website: http://www.rits.onc.jhmi.edu/dbb/custom/A6/ Contact: Speier@mii.ucla.edu or mfo@jhu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22945787

  7. Automatic annotation of outdoor photographs

    NASA Astrophysics Data System (ADS)

    Cusano, Claudio; Schettini, Raimondo

    2011-01-01

    We propose here a strategy for the automatic annotation of outdoor photographs. Images are segmented into homogeneous regions, each of which may then be assigned to one of seven classes: sky, vegetation, snow, water, ground, street, and sand. These categories allow for content-aware image processing strategies. Our annotation strategy uses a normalized cut segmentation to identify the regions to be classified by a multi-class Support Vector Machine. The strategy has been evaluated on a set of images taken from the LabelMe dataset.

  8. Alignment-Annotator web server: rendering and annotating sequence alignments.

    PubMed

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-07-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed on the server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (from BioDAS servers, UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Alignment-Annotator web server: rendering and annotating sequence alignments

    PubMed Central

    Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas

    2014-01-01

    Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed on the server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (from BioDAS servers, UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. Availability: http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. PMID:24813445

  10. Sonoran Pronghorn Literature: An Annotated Bibliography

    USGS Publications Warehouse

    Krausman, Paul R.; Morgart, John R.; Harris, Lisa K.; O'Brien, Chantal S.; Cain, James W.; Rosenstock, Steve S.

    2005-01-01

    EXECUTIVE SUMMARY: The Sonoran pronghorn (Antilocapra americana sonoriensis) is 1 of 5 subspecies of pronghorn in North America. Sonoran pronghorn historically ranged from eastern California into southeastern Arizona and south to Sonora, Mexico. Sonoran pronghorn currently inhabit the Sonoran Desert in southwestern Arizona and northern Sonora, Mexico. Unfortunately, their future in North America is uncertain. In the United States, as of December 2004, there were <51 free-ranging individual Sonoran pronghorn. This subspecies has been listed as endangered by the United States Fish and Wildlife Service since 1967. Because of the rapid decline in population size, biologists and managers increased management efforts to reverse the downward spiral to extinction. To assist with enhanced management we have compiled an annotated bibliography of most of the works published on Sonoran pronghorn including peer-reviewed papers (n = 31, including submitted manuscripts), books (n = 26), theses and dissertations (n = 5), conferences, proceedings and symposiums (n = 31), reports (n = 84), abstracts (n = 14), popular articles (n = 41), and others (n = 4). These are the same categories under which we list annotations. Most of the articles involve A. a. sonoriensis. We present the scientific name of other pronghorn when clarification is needed.

  11. Interactive Display of Scenes with Annotations

    NASA Technical Reports Server (NTRS)

    Vona, Marsette; Powell, Mark; Backes, Paul; Norris, Jeffrey; Steinke, Robert

    2005-01-01

    ThreeDView is a computer program that enables high-performance interactive display of real-world scenes with annotations. ThreeDView was developed primarily as a component of the Science Activity Planner (SAP) software, wherein it is to be used to display annotated images of terrain acquired by exploratory robots on Mars and possibly other remote planets. The images can be generated from sets of multiple-texture image data in the Visible Scalable Terrain (ViSTa) format, which was described in "Format for Interchange and Display of 3D Terrain Data" (NPO-30600) NASA Tech Briefs, Vol. 28, No. 12 (December 2004), page 25. In ThreeDView, terrain data can be loaded rapidly, the geometric level of detail and texture resolution can be selected, false colors can be used to represent scientific data mapped onto terrain, and the user can select among navigation modes. ThreeDView consists largely of modular Java software components that can easily be reused and extended to produce new high-performance, application-specific software systems for displaying images of three-dimensional real-world scenes.

  12. Correlation between antimicrobial resistance and virulence in Klebsiella pneumoniae.

    PubMed

    Hennequin, C; Robin, F

    2016-03-01

    Klebsiella pneumoniae is responsible for a wide range of infections, including urinary tract infections, pneumonia, bacteremia, and liver abscesses. In addition to susceptible clinical isolates involved in nosocomial infections, multidrug-resistant (MDR) and hypervirulent (hvKP) strains have evolved separately in distinct clonal groups. The rapid geographic spread of these isolates is of particular concern. However, we still know little about the virulence of K. pneumoniae except for hvKP, whose secrets are beginning to be revealed. The treatment of K. pneumoniae infections is threatened by the emergence of antimicrobial resistance. The dissemination of resistance is associated with mobile genetic elements, such as plasmids, that may also carry virulence determinants. A proficient pathogen should be virulent, resistant to antibiotics, and epidemic. However, the interplay between resistance and virulence is poorly understood. Here, we review current knowledge on the topic.

  13. Optimizing high performance computing workflow for protein functional annotation.

    PubMed

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-09-10

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.

  14. Optimizing high performance computing workflow for protein functional annotation

    PubMed Central

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-01-01

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. PMID:25313296

  15. Riboregulators: Fine-Tuning Virulence in Shigella.

    PubMed

    Fris, Megan E; Murphy, Erin R

    2016-01-01

    Within the past several years, RNA-mediated regulation (riboregulation) has become increasingly recognized for its importance in controlling critical bacterial processes. Regulatory RNA molecules, or riboregulators, are perpetually responsive to changes within the micro-environment of a bacterium. Notably, several characterized riboregulators control virulence in pathogenic bacteria, as is the case for each riboregulator characterized to date in Shigella. The timing of virulence gene expression and the ability of the pathogen to adapt to rapidly changing environmental conditions are critical to the establishment and progression of infection by Shigella species; riboregulators mediate each of these important processes. This mini review will present the current state of knowledge regarding RNA-mediated regulation in Shigella by detailing the characterization and function of each identified riboregulator in these pathogens.

  16. Riboregulators: Fine-Tuning Virulence in Shigella

    PubMed Central

    Fris, Megan E.; Murphy, Erin R.

    2016-01-01

    Within the past several years, RNA-mediated regulation (riboregulation) has become increasingly recognized for its importance in controlling critical bacterial processes. Regulatory RNA molecules, or riboregulators, are perpetually responsive to changes within the micro-environment of a bacterium. Notably, several characterized riboregulators control virulence in pathogenic bacteria, as is the case for each riboregulator characterized to date in Shigella. The timing of virulence gene expression and the ability of the pathogen to adapt to rapidly changing environmental conditions are critical to the establishment and progression of infection by Shigella species; riboregulators mediate each of these important processes. This mini review will present the current state of knowledge regarding RNA-mediated regulation in Shigella by detailing the characterization and function of each identified riboregulator in these pathogens. PMID:26858941

  17. Preserving sequence annotations across reference sequences.

    PubMed

    Tatum, Zuotian; Roos, Marco; Gibson, Andrew P; Taschner, Peter Em; Thompson, Mark; Schultes, Erik A; Laros, Jeroen Fj

    2014-01-01

    Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone. As part of our effort to create linked data for interoperable sequence annotations, we present an RDF data model for sequence annotation using the ontological framework established by the OBO Foundry ontologies and the Basic Formal Ontology (BFO). We defined reference sequences as the common domain of integration for sequence annotations, and identified three semantic relationships between sequence annotations. In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences. Moreover, we present three integration models for sequence annotations using different reference assemblies. We demonstrated a working example of a sequence annotation instance, and how this instance can be linked to other annotations on different reference sequences. Sequence annotations in this format are semantically rich and can be integrated easily with different assemblies. We also identify other challenges of modeling reference sequences with the BFO.

  18. Preserving sequence annotations across reference sequences

    PubMed Central

    2014-01-01

    Background: Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used. This makes the integration of annotations from different sources difficult and error prone. Results: As part of our effort to create linked data for interoperable sequence annotations, we present an RDF data model for sequence annotation using the ontological framework established by the OBO Foundry ontologies and the Basic Formal Ontology (BFO). We defined reference sequences as the common domain of integration for sequence annotations, and identified three semantic relationships between sequence annotations. In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences. Moreover, we present three integration models for sequence annotations using different reference assemblies. Conclusions: We demonstrated a working example of a sequence annotation instance, and how this instance can be linked to other annotations on different reference sequences. Sequence annotations in this format are semantically rich and can be integrated easily with different assemblies. We also identify other challenges of modeling reference sequences with the BFO. PMID:25093075

  19. Genomic Correlates of Virulence Attenuation in the Deadly Amphibian Chytrid Fungus, Batrachochytrium dendrobatidis

    PubMed Central

    Refsnider, Jeanine M.; Poorten, Thomas J.; Langhammer, Penny F.; Burrowes, Patricia A.; Rosenblum, Erica Bree

    2015-01-01

    Emerging infectious diseases pose a significant threat to global health, but predicting disease outcomes for particular species can be complicated when pathogen virulence varies across space, time, or hosts. The pathogenic chytrid fungus Batrachochytrium dendrobatidis (Bd) has caused worldwide declines in frog populations. Not only do Bd isolates from wild populations vary in virulence, but virulence shifts can occur over short timescales when Bd is maintained in the laboratory. We leveraged changes in Bd virulence over multiple generations of passage to better understand mechanisms of pathogen virulence. We conducted whole-genome resequencing of two samples of the same Bd isolate, differing only in passage history, to identify genomic processes associated with virulence attenuation. The isolate with shorter passage history (and greater virulence) had greater chromosome copy numbers than the isolate maintained in culture for longer, suggesting that virulence attenuation may be associated with loss of chromosome copies. Our results suggest that genomic processes proposed as mechanisms for rapid evolution in Bd are correlated with virulence attenuation in laboratory culture within a single lineage of Bd. Moreover, these genomic processes can occur over extremely short timescales. On a practical level, our results underscore the importance of immediately cryo-archiving new Bd isolates and using fresh isolates, rather than samples cultured in the laboratory for long periods, for laboratory infection experiments. Finally, when attempting to predict disease outcomes for this ecologically important pathogen, it is critical to consider existing variation in virulence among isolates and the potential for shifts in virulence over short timescales. PMID:26333840

  20. A semi-automated 3-D annotation method for breast ultrasound imaging: system development and feasibility study on phantoms.

    PubMed

    Jiang, Wei-wei; Li, An-hua; Zheng, Yong-Ping

    2014-02-01

    Spatial annotation is an essential step in breast ultrasound imaging, because the follow-up diagnosis and treatment are based on this annotation. However, the current method for annotation is manual and highly dependent on the operator's experience. Moreover, important spatial information, such as the probe tilt angle, cannot be indicated in the clinical 2-D annotations. To solve these problems, we developed a semi-automated 3-D annotation method for breast ultrasound imaging. A spatial sensor was fixed on an ultrasound probe to obtain the image spatial data. Three-dimensional virtual models of breast and probe were used to annotate image locations. After the reference points were recorded, this system displayed the image annotations automatically. Compared with the conventional manual annotation method, this new annotation system has higher accuracy as indicated by the phantom test results. In addition, this new annotation method has good repeatability, with intra-class correlation coefficients of 0.907 (average variation: ≤3.45%) and 0.937 (average variation: ≤2.85%) for the intra-rater and inter-rater tests, respectively. Breast phantom experiments simulating clinical breast scanning further indicated the feasibility of this system for clinical applications. This new annotation method is expected to facilitate more accurate, intuitive and rapid breast ultrasound diagnosis.

  1. Annotated Bibliography of Professional Socialization.

    ERIC Educational Resources Information Center

    Rogers, John M.

    This bibliography contains annotations of 49 articles on the topic of professional socialization. The articles were identified using the Educational Resources Information Center (ERIC), Sociological Abstracts, Medline, and Cumulative Index of Nursing and Allied Health Literature data bases. A bias exists in the selection process towards items…

  2. MSDAC Resource Library Annotated Bibliography.

    ERIC Educational Resources Information Center

    Schlee, Phillip F., Comp.; And Others

    The Midwest Sex Discrimination Assistance Center presents an annotated bibliography of 56 monographs and 11 other media materials relating to women and sex discrimination for use in public schools. Media materials include slides, films, filmstrips, audio recordings, and posters. The bibliography is organized by subject and each annotation…

  3. Workforce Reductions. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Hickok, Thomas A.

    This report, which is based on a review of practitioner-oriented sources and scholarly journals, uses a three-part framework to organize annotated bibliographies that, together, list a total of 104 sources that provide the following three perspectives on work force reduction issues: organizational, organizational-individual relationship, and…

  4. Meaningful Assessment: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Thrond, Mary A.

    The annotated bibliography contains citations of nine references on alternative student assessment methods in second language programs, particularly at the secondary school level. The references include a critique of conventional reading comprehension assessment, a discussion of performance assessment, a proposal for a multi-trait, multi-method…

  5. Annotated Videography. Part 3. [Revised].

    ERIC Educational Resources Information Center

    United States Holocaust Memorial Museum, Washington, DC.

    This annotated videography has been designed to identify videotapes addressing Holocaust history that have been used effectively in classrooms and are available readily to most communities. The guide is divided into 15 topical categories, including: life before the Holocaust; perpetrators; propaganda; racism; antisemitism; mosaic of victims;…

  6. Hispanic Heritage. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Denver Univ., CO. School of Education.

    This annotated bibliography of a wide range of materials for the social studies teacher is concerned with the Hispano heritage. The sections are introduced by a brief description. The sections are: 1) general materials, 2) the land and the people, 3) the European background, 4) Spain's colonial system, 5) the Spanish borderlands, 6) the Anglo…

  7. Annotated Bibliography on Humanistic Education

    ERIC Educational Resources Information Center

    Ganung, Cynthia

    1975-01-01

    Part I of this annotated bibliography deals with books and articles on such topics as achievement motivation, process education, transactional analysis, discipline without punishment, role-playing, interpersonal skills, self-acceptance, moral education, self-awareness, values clarification, and non-verbal communication. Part II focuses on…

  8. English Language Learners: Annotated Bibliography

    ERIC Educational Resources Information Center

    Hector-Mason, Anestine; Bardack, Sarah

    2010-01-01

    This annotated bibliography represents a first step toward compiling a comprehensive overview of current research on issues related to English language learners (ELLs). It is intended to be a resource for researchers, policymakers, administrators, and educators who are engaged in efforts to bridge the divide between research, policy, and practice…

  9. Migrant Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Palmer, Barbara C., Comp.

    Materials selected for inclusion in the annotated bibliography of 139 publications from 1970 to 1980 give a general understanding of the lives of migrant children, their educational needs and problems, and various attempts made to meet those needs. The bibliography, a valuable tool for researchers and teachers in migrant education, includes books,…

  10. Nikos Kazantzakis: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Qiu, Kui

    This research paper consists of an annotated bibliography about Nikos Kazantzakis, one of the major modern Greek writers and author of "The Last Temptation of Christ," "Zorba the Greek," and many other works. Because of Kazantzakis' position in world literature there are many critical works about him; however, bibliographical…

  11. Radiocarbon Dating: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Fortine, Suellen

    This selective annotated bibliography covers various sources of information on the radiocarbon dating method, including journal articles, conference proceedings, and reports, reflecting the most important and useful sources of the last 25 years. The bibliography is divided into five parts--general background on radiocarbon, radiocarbon dating,…

  12. MSDAC Resource Library Annotated Bibliography.

    ERIC Educational Resources Information Center

    Watson, Cristel; And Others

    This annotated bibliography lists books, films, filmstrips, recordings, and booklets on sex equity. Entries are arranged according to the following topics: career resources, curriculum resources, management, sex equity, sex roles, women's studies, student activities, and sex-fair fiction. Included in each entry are name of author, editor or…

  13. Radiocarbon Dating: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Fortine, Suellen

    This selective annotated bibliography covers various sources of information on the radiocarbon dating method, including journal articles, conference proceedings, and reports, reflecting the most important and useful sources of the last 25 years. The bibliography is divided into five parts--general background on radiocarbon, radiocarbon dating,…

  14. Peaceful Peoples: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Bonta, Bruce D.

    This annotated bibliography includes 438 selected references to books, journal articles, essays within edited volumes, and dissertations that provide significant information about peaceful societies. Peaceful societies are groups that have developed harmonious social structures that allow them to get along with each other, and with outsiders,…

  15. Oral History: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Friedman, Paul G.

    Defining oral history as a method of inquiry by which the memories of individuals are elicited, preserved in interview transcripts or on tape recordings, and then used to enrich understanding of individuals' lives and the events in which they participated, this annotated bibliography provides a broad overview and a sampling of the resources…

  16. Music Analysis: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Fink, Michael

    One hundred and forty citations comprise this annotated bibliography of books, articles, and selected dissertations that encompass trends in music theory and K-16 music education since the late 19th century. Special emphasis is upon writings since the 1950s. During earlier development, music analysts concentrated upon the elements of music (i.e.,…

  17. Teacher Aides; An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Marin County Public Schools, Corte Madera, CA.

    This annotated bibliography lists 40 items, published between 1966 and 1971, that have to do with teacher aides. The listing is arranged alphabetically by author. In addition to the abstract and standard bibliographic information, addresses where the material can be purchased are often included. The items cited include handbooks, research studies,…

  18. Staff Differentiation. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Marin County Superintendent of Schools, Corte Madera, CA.

    This annotated bibliography reviews selected literature focusing on the concept of staff differentiation. Included are 62 items (dated 1966-1970), along with a list of mailing addresses where copies of individual items can be obtained. Also a list of 31 staff differentiation projects receiving financial assistance from the U.S. Office of Education…

  19. Rural Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Massey, Sara

    The 120-item annotated bibliography was compiled to facilitate the development of a recently approved course entitled "Topics in Rural Education" at the University of Maine at Machias. Although the dates range from 1964 to 1982, most of the materials were prepared in the 1970s and 1980s. The interrelatedness of the issues makes categorization…

  20. Annotated Selected Puerto Rican Bibliography.

    ERIC Educational Resources Information Center

    Bravo, Enrique R., Comp.

    This work represents an effort on the part of The Urban Center to come one step closer to the realization of its goal to further the growth of ethnic studies. After extensive consultation with educationists from within and without the Puerto Rican community, it was decided that an annotated bilingual bibliography should be published to assist and…

  1. Vietnamese Amerasians: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Johnson, Mark C.; And Others

    This annotated bibliography on Vietnamese Amerasians includes primary and secondary sources as well as reviews of three documentary films. Sources were selected in order to provide an overview of the historical and political context of Amerasian resettlement and a review of the scant available research on coping and adaptation with this…

  3. Workforce Reductions. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Hickok, Thomas A.

    This report, which is based on a review of practitioner-oriented sources and scholarly journals, uses a three-part framework to organize annotated bibliographies that, together, list a total of 104 sources that provide the following three perspectives on work force reduction issues: organizational, organizational-individual relationship, and…

  4. Aging Awareness: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Grant, Rugh; And Others

    This annotated bibliography cites books and articles on aging. The bibliography was compiled by a resource team who are helping teachers and elderly volunteers create classroom environments in which the strengths and uniqueness of these volunteers are recognized. The books in the first section "Aging in Society" describe the problems, aspirations,…

  6. Infant Feeding: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Crowhurst, Christine Marie, Comp.; Kumer, Bonnie Lee, Comp.

    Intended for parents, health professionals and allied health workers, and others involved in caring for infants and young children, this annotated bibliography brings together in one selective listing a review of over 700 current publications related to infant feeding. Reflecting current knowledge in infant feeding, the bibliography has as its…

  7. Appalachian Women. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Hamm, Mary Margo

    This bibliography compiles annotations of 178 books, journal articles, ERIC documents, and dissertations on Appalachian women and their social, cultural, and economic environment. Entries were published 1966-93 and are listed in the following categories: (1) authors and literary criticism; (2) bibliographies and resource guides; (3) economics,…

  8. Teacher Evaluation: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    McKenna, Bernard H.; And Others

    In his introduction to the 86-item annotated bibliography by Mueller and Poliakoff, McKenna discusses his views on teacher evaluation and his impressions of the documents cited. He observes, in part, that the current concern is with the process of evaluation and that most researchers continue to believe that student achievement is the most…

  9. Annotated Bibliography, Grades K-6.

    ERIC Educational Resources Information Center

    Massachusetts Dept. of Education, Boston. Bureau of Nutrition Education and School Food Services.

    This annotated bibliography on nutrition is for the use of teachers at the elementary grade level. It contains a list of books suitable for reading about nutrition and foods for pupils from kindergarten through the sixth grade. Films and audiovisual presentations for classroom use are also listed. The names and addresses from which these materials…

  10. ANNOTATED BIBLIOGRAPHY OF GEOLOGICAL EDUCATION.

    ERIC Educational Resources Information Center

    BERG, J. ROBERT; AND OTHERS

    ARTICLES ABOUT GEOLOGICAL EDUCATION WRITTEN DURING THE PERIOD 1919-62 ARE INCLUDED IN THIS ANNOTATED BIBLIOGRAPHY. RECOMMENDATIONS OF INDIVIDUAL EDUCATORS AND PROFESSIONAL GROUPS FOR THE UNDERGRADUATE AND GRADUATE PREPARATION OF GEOLOGISTS ARE CONTAINED IN MOST OF THE ITEMS. THE ARTICLES WERE ORIGINALLY PUBLISHED IN PROFESSIONAL JOURNALS OR…

  11. Systems Theory and Communication. Annotated Bibliography.

    ERIC Educational Resources Information Center

    Covington, William G., Jr.

    This annotated bibliography presents annotations of 31 books and journal articles dealing with systems theory and its relation to organizational communication, marketing, information theory, and cybernetics. Materials were published between 1963 and 1992 and are listed alphabetically by author. (RS)

  12. Virulence of two strains of Mycobacterium bovis in cattle following aerosol infection

    USDA-ARS's Scientific Manuscript database

    Background: Over the past two decades, highly virulent strains of Mycobacterium tuberculosis have emerged and spread rapidly in humans, suggesting a selective advantage based upon virulence. A similar scenario has not been described for Mycobacterium bovis infection in cattle (i.e., Bovine Tuberculos...

  13. Smoke Engulfs Singapore [annotated]

    NASA Image and Video Library

    2017-09-28

    On June 19, 2013, NASA’s Aqua satellite captured a striking image of smoke billowing from illegal wildfires on the Indonesian island of Sumatra. The smoke blew east toward southern Malaysia and Singapore, and news media reported that thick clouds of haze had descended on Singapore, pushing pollution to record levels. Singapore’s primary measure of pollution, the Pollutant Standards Index (PSI), a uniform measure of key pollutants similar to the Air Quality Index (AQI) used by the U.S. Environmental Protection Agency, spiked to 371 on the afternoon of June 20, 2013, the highest level ever recorded. The previous record occurred in 1997, when the index hit 226. Health experts consider any level above 300 to be “hazardous” to human health. Levels above 200 are considered “very unhealthy.” The image above was captured by the Moderate Resolution Imaging Spectroradiometer (MODIS), an instrument that observes the entire surface of Earth every 1 to 2 days. The image was captured during the afternoon at 6:30 UTC (2:30 p.m. local time). Though local laws prohibit it, farmers in Sumatra often burn forests during the dry season to prepare soil for new crops. The BBC reported that Singapore’s Prime Minister Lee Hsien Loong warned that the haze could “easily last for several weeks and quite possibly longer until the dry season ends in Sumatra.” NASA image by Jeff Schmaltz, LANCE/EOSDIS Rapid Response. Caption by Adam Voiland. Credit: NASA Earth Observatory. Instrument: Aqua - MODIS. NASA Goddard Space Flight Center enables NASA’s mission through four scientific endeavors: Earth Science, Heliophysics, Solar System Exploration, and Astrophysics. Goddard plays a leading role in NASA’s accomplishments by contributing compelling scientific knowledge to advance the Agency’s mission.

  14. Annotation and Classification of Argumentative Writing Revisions

    ERIC Educational Resources Information Center

    Zhang, Fan; Litman, Diane

    2015-01-01

    This paper explores the annotation and classification of students' revision behaviors in argumentative writing. A sentence-level revision schema is proposed to capture why and how students make revisions. Based on the proposed schema, a small corpus of student essays and revisions was annotated. Studies show that manual annotation is reliable with…

  16. Genome re-annotation: a wiki solution?

    PubMed Central

    Salzberg, Steven L

    2007-01-01

    The annotation of most genomes becomes outdated over time, owing in part to our ever-improving knowledge of genomes and in part to improvements in bioinformatics software. Unfortunately, annotation is rarely if ever updated and resources to support routine reannotation are scarce. Wiki software, which would allow many scientists to edit each genome's annotation, offers one possible solution. PMID:17274839

  17. 3D annotation and manipulation of medical anatomical structures

    NASA Astrophysics Data System (ADS)

    Vitanovski, Dime; Schaller, Christian; Hahn, Dieter; Daum, Volker; Hornegger, Joachim

    2009-02-01

    Although medical scanners are rapidly moving towards a three-dimensional paradigm, the manipulation and annotation/labeling of the acquired data is still performed in a standard 2D environment. Editing and annotation of three-dimensional medical structures is currently a complex and rather time-consuming task, as it is carried out in 2D projections of the original object. A major problem in 2D annotation is depth ambiguity, which requires 3D landmarks to be identified and localized in at least two of the cutting planes. Operating directly in a three-dimensional space enables the implicit consideration of the full 3D local context, which significantly increases accuracy and speed. A three-dimensional environment is also more natural, improving the user's comfort and acceptance. The 3D annotation environment requires a three-dimensional manipulation device and display. By means of two novel and advanced technologies, the Nintendo Wii controller and the Philips 3D WoWvx display, we define an appropriate 3D annotation tool and a suitable 3D visualization monitor. We define a non-coplanar arrangement of four infrared LEDs with known, exact positions, which are tracked by the Wii and from which we compute the pose of the device by applying a standard pose estimation algorithm. The novel 3D renderer developed by Philips uses either the Z-value of a 3D volume, or depth information computed from a 2D image, to provide a real 3D experience without special glasses. In this paper we present a new framework for the manipulation and annotation of medical landmarks directly in a three-dimensional volume.

  18. MAC: identifying and correcting annotation for multi-nucleotide variations.

    PubMed

    Wei, Lei; Liu, Lu T; Conroy, Jacob R; Hu, Qiang; Conroy, Jeffrey M; Morrison, Carl D; Johnson, Candace S; Wang, Jianmin; Liu, Song

    2015-08-01

    Next-Generation Sequencing (NGS) technologies have rapidly advanced our understanding of human variation in cancer. To accurately translate the raw sequencing data into practical knowledge, annotation tools, algorithms and pipelines must be developed that keep pace with the rapidly evolving technology. Currently, a challenge exists in accurately annotating multi-nucleotide variants (MNVs). These tandem substitutions, when affecting multiple nucleotides within a single protein codon of a gene, result in a translated amino acid involving all nucleotides in that codon. Most existing variant callers report an MNV as individual single-nucleotide variants (SNVs), often resulting in multiple triplet codon sequences and incorrect amino acid predictions. To correct potentially misannotated MNVs among reported SNVs, a primary challenge resides in haplotype phasing, which is to determine whether the neighboring SNVs are co-located on the same chromosome. Here we describe MAC (Multi-Nucleotide Variant Annotation Corrector), an integrative pipeline developed to correct potentially mis-annotated MNVs. MAC was designed as an application that only requires an SNV file and the matching BAM file as data inputs. Using an example data set containing 3024 SNVs and the corresponding whole-genome sequencing BAM files, we show that MAC identified eight potentially mis-annotated SNVs, and accurately updated the amino acid predictions for seven of the variant calls. MAC can identify and correct amino acid predictions that result from MNVs affecting multiple nucleotides within a single protein codon, which cannot be handled by most existing SNV-based variant pipelines. The MAC software is freely available and represents a useful tool for the accurate translation of genomic sequence to protein function.
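
    The misannotation described in this abstract can be illustrated with a small Python sketch. This is not MAC's implementation; it only shows why two SNVs in one codon give a different amino acid when translated together rather than one at a time. The function names, the example codon, and the assumptions that both SNVs are already known to lie on the same haplotype and on the forward strand are all hypothetical choices made for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_snvs(codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def annotate_separately(codon, snvs):
    """Translate each SNV on its own, as an SNV-only annotator would."""
    return [CODON_TABLE[apply_snvs(codon, [snv])] for snv in snvs]


def annotate_as_mnv(codon, snvs):
    """Translate all SNVs together (assumes they share one haplotype)."""
    return CODON_TABLE[apply_snvs(codon, snvs)]


# Hypothetical example: reference codon GGA (Gly) with G>A at offsets 0 and 1.
reference = "GGA"
snvs = [(0, "A"), (1, "A")]
print(CODON_TABLE[reference])                # reference amino acid
print(annotate_separately(reference, snvs))  # per-SNV predictions
print(annotate_as_mnv(reference, snvs))      # combined MNV prediction
```

    In this made-up case the per-SNV predictions (Arg and Glu) both differ from the combined prediction (Lys), which is the kind of discrepancy the abstract says SNV-based pipelines cannot handle. Deciding whether the SNVs really are on the same chromosome is the phasing step, which this sketch takes as given.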

  19. The automatic annotation of bacterial genomes

    PubMed Central

    Richardson, Emily J.

    2013-01-01

    With the development of ultra-high-throughput technologies, the cost of sequencing bacterial genomes has been vastly reduced. As more genomes are sequenced, less time can be spent manually annotating those genomes, resulting in an increased reliance on automatic annotation pipelines. However, automatic pipelines can produce inaccurate genome annotation and their results often require manual curation. Here, we discuss the automatic and manual annotation of bacterial genomes, identify common problems introduced by the current genome annotation process and suggest potential solutions. PMID:22408191

  20. Automatic annotation of organellar genomes with DOGMA

    SciTech Connect

    Wyman, Stacia; Jansen, Robert K.; Boore, Jeffrey L.

    2004-06-01

    Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of extra-nuclear organellar (chloroplast and animal mitochondrial) genomes. It is a web-based package that allows the use of comparative BLAST searches to identify and annotate genes in a genome. DOGMA presents a list of putative genes to the user in a graphical format for viewing and editing. Annotations are stored on our password-protected server. Complete annotations can be extracted for direct submission to GenBank. Furthermore, intergenic regions of specified length can be extracted, as well as the nucleotide sequences and amino acid sequences of the genes.
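
    The idea of extracting intergenic regions of a specified length can be sketched in a few lines of Python. This is not DOGMA's code or interface; the function name, the 0-based half-open coordinate convention, and the treatment of the genome as linear (organellar genomes are typically circular) are assumptions made only to keep the example short.

```python
def intergenic_regions(genes, genome_length, min_length=1):
    """Return gaps between annotated genes that are at least min_length long.

    genes: iterable of (start, end) pairs, 0-based and half-open (assumption).
    The genome is treated as linear, so a gap spanning the origin of a
    circular genome is reported as two separate pieces.
    """
    regions = []
    cursor = 0  # end of the furthest-reaching gene seen so far
    for start, end in sorted(genes):
        if start - cursor >= min_length:
            regions.append((cursor, start))
        cursor = max(cursor, end)  # handles overlapping or nested genes
    if genome_length - cursor >= min_length:
        regions.append((cursor, genome_length))
    return regions


# Hypothetical gene coordinates on a 150-bp toy genome.
genes = [(10, 50), (40, 80), (100, 120)]
print(intergenic_regions(genes, genome_length=150, min_length=15))
```

    With these toy coordinates the 10-bp gap before the first gene is dropped by the length threshold, and the two overlapping genes are treated as one occupied stretch.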

  1. FunctionAnnotator, a versatile and efficient web tool for non-model organism annotation.

    PubMed

    Chen, Ting-Wen; Gan, Ruei-Chi; Fang, Yi-Kai; Chien, Kun-Yi; Liao, Wei-Chao; Chen, Chia-Chun; Wu, Timothy H; Chang, Ian Yi-Feng; Yang, Chi; Huang, Po-Jung; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Tzu-Wen; Tang, Petrus

    2017-09-05

    Along with the constant improvement in high-throughput sequencing technology, an increasing number of transcriptome sequencing projects are carried out in organisms without decoded genome information and even on environmental biological samples. To study the biological functions of novel transcripts, the very first task is to identify their potential functions. We present a web-based annotation tool, FunctionAnnotator, which offers comprehensive annotations, including GO term assignment, enzyme annotation, domain/motif identification and predictions for subcellular localization. To accelerate the annotation process, we have optimized the computation processes and used parallel computing for all annotation steps. Moreover, FunctionAnnotator is designed to be versatile, and it generates a variety of useful outputs for facilitating other analyses. Here, we demonstrate how FunctionAnnotator can be helpful in annotating non-model organisms. We further illustrate that FunctionAnnotator can estimate the taxonomic composition of environmental samples and assist in the identification of novel proteins by combining RNA-Seq data with proteomics technology. In summary, FunctionAnnotator can efficiently annotate transcriptomes and greatly benefits studies focusing on non-model organisms or metatranscriptomes. FunctionAnnotator, a comprehensive annotation web-service tool, is freely available online at: http://fa.cgu.edu.tw/ . This new web-based annotator will shed light on field studies involving organisms without a reference genome.

  2. VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

    SciTech Connect

    Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.; Jensen, Jeffrey L.; Walker, Julia; Kobold, Mark A.; Webb, Samantha R.; Payne, Samuel H.; Ansong, Charles; Adkins, Joshua N.; Cannon, William R.; Webb-Robertson, Bobbie-Jo M.

    2012-04-25

    Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.

  3. Bacterial proteases and virulence.

    PubMed

    Frees, Dorte; Brøndsted, Lone; Ingmer, Hanne

    2013-01-01

    Bacterial pathogens rely on proteolysis for a variety of purposes during the infection process. In the cytosol, the main proteolytic players are the conserved Clp and Lon proteases that directly contribute to virulence through the timely degradation of virulence regulators and indirectly by providing tolerance to adverse conditions such as those experienced in the host. In the membrane, HtrA performs similar functions whereas the extracellular proteases, in close contact with host components, pave the way for spreading infections by degrading host matrix components or interfering with host cell signalling to short-circuit host cell processes. Common to both intra- and extracellular proteases is the tight control of their proteolytic activities. In general, substrate recognition by the intracellular proteases is highly selective which is, in part, attributed to the chaperone activity associated with the proteases either encoded within the same polypeptide or on separate subunits. In contrast, substrate recognition by extracellular proteases is less selective and therefore these enzymes are generally expressed as zymogens to prevent premature proteolytic activity that would be detrimental to the cell. These extracellular proteases are activated in complex cascades involving auto-processing and proteolytic maturation. Thus, proteolysis has been adopted by bacterial pathogens at multiple levels to ensure the success of the pathogen in contact with the human host.

  4. Brucella, nitrogen and virulence.

    PubMed

    Ronneau, Severin; Moussa, Simon; Barbier, Thibault; Conde-Álvarez, Raquel; Zuniga-Ripa, Amaia; Moriyon, Ignacio; Letesson, Jean-Jacques

    2016-08-01

    The brucellae are α-Proteobacteria causing brucellosis, an important zoonosis. Although multiplying in endoplasmic reticulum-derived vacuoles, they cause no cell death, suggesting subtle but efficient use of host resources. Brucellae are amino-acid prototrophs able to grow with ammonium or use glutamate as the sole carbon-nitrogen source in vitro. They contain more than twice as many amino acid/peptide/polyamine uptake genes as the amino-acid auxotroph Legionella pneumophila, which multiplies in a similar vacuole, suggesting a different nutritional strategy. During the last two decades, many mutants of key actors in nitrogen metabolism (transporters, enzymes, regulators, etc.) have been described as essential for full virulence of brucellae. Here, we review the genomic and experimental data on Brucella nitrogen metabolism and its connection with virulence. An analysis of various aspects of this metabolism (transport, assimilation, biosynthesis, catabolism, respiration and regulation) has highlighted differences and similarities in nitrogen metabolism with other α-Proteobacteria. Together, these data suggest that, during their intracellular life cycle, the brucellae use various nitrogen sources for biosynthesis, catabolism and respiration following a strategy that requires prototrophy and a tight regulation of nitrogen use.

  5. Virulence of enterococci.

    PubMed Central

    Jett, B D; Huycke, M M; Gilmore, M S

    1994-01-01

    Enterococci are commensal organisms well suited to survival in intestinal and vaginal tracts and the oral cavity. However, as for most bacteria described as causing human disease, enterococci also possess properties that can be ascribed roles in pathogenesis. The natural ability of enterococci to readily acquire, accumulate, and share extrachromosomal elements encoding virulence traits or antibiotic resistance genes lends advantages to their survival under unusual environmental stresses and in part explains their increasing importance as nosocomial pathogens. This review discusses the current understanding of enterococcal virulence relating to (i) adherence to host tissues, (ii) invasion and abscess formation, (iii) factors potentially relevant to modulation of host inflammatory responses, and (iv) potentially toxic secreted products. Aggregation substance, surface carbohydrates, or fibronectin-binding moieties may facilitate adherence to host tissues. Enterococcus faecalis appears to have the capacity to translocate across intact intestinal mucosa in models of antibiotic-induced superinfection. Extracellular toxins such as cytolysin can induce tissue damage as shown in an endophthalmitis model, increase mortality in combination with aggregation substance in an endocarditis model, and cause systemic toxicity in a murine peritonitis model. Finally, lipoteichoic acid, superoxide production, or pheromones and corresponding peptide inhibitors each may modulate local inflammatory reactions. PMID:7834601

  6. Oncotator: cancer variant annotation tool.

    PubMed

    Ramos, Alex H; Lichtenstein, Lee; Gupta, Manaswi; Lawrence, Michael S; Pugh, Trevor J; Saksena, Gordon; Meyerson, Matthew; Getz, Gad

    2015-04-01

    Oncotator is a tool for annotating genomic point mutations and short nucleotide insertions/deletions (indels) with variant- and gene-centric information relevant to cancer researchers. This information is drawn from 14 different publicly available resources that have been pooled and indexed, and we provide an extensible framework to add additional data sources. Annotations linked to variants range from basic information, such as gene names and functional classification (e.g. missense), to cancer-specific data from resources such as the Catalogue of Somatic Mutations in Cancer (COSMIC), the Cancer Gene Census, and The Cancer Genome Atlas (TCGA). For local use, Oncotator is freely available as a Python module hosted on GitHub (https://github.com/broadinstitute/oncotator). Furthermore, Oncotator is also available as a web service and web application at http://www.broadinstitute.org/oncotator/.

  7. Evaluating Hierarchical Structure in Music Annotations

    PubMed Central

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M.; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement. PMID:28824514

  8. Evaluating Hierarchical Structure in Music Annotations.

    PubMed

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  9. The causes and consequences of changes in virulence following pathogen host shifts.

    PubMed

    Longdon, Ben; Hadfield, Jarrod D; Day, Jonathan P; Smith, Sophia C L; McGonigle, John E; Cogni, Rodrigo; Cao, Chuan; Jiggins, Francis M

    2015-03-01

    Emerging infectious diseases are often the result of a host shift, where the pathogen originates from a different host species. Virulence, the harm a pathogen does to its host, can be extremely high following a host shift (for example Ebola, HIV, and SARS), while other host shifts may go undetected as they cause few symptoms in the new host. Here we examine how virulence varies across host species by carrying out a large cross infection experiment using 48 species of Drosophilidae and an RNA virus. Host shifts resulted in dramatic variation in virulence, with benign infections in some species and rapid death in others. The change in virulence was highly predictable from the host phylogeny, with hosts clustering together in distinct clades displaying high or low virulence. High levels of virulence are associated with high viral loads, and this may determine the transmission rate of the virus.

  10. Targeting virulence not viability in the search for future antibacterials

    PubMed Central

    Heras, Begoña; Scanlon, Martin J; Martin, Jennifer L

    2015-01-01

    New antibacterials need new approaches to overcome the problem of rapid antibiotic resistance. Here we review the development of potential new antibacterial drugs that do not kill bacteria or inhibit their growth, but combat disease instead by targeting bacterial virulence. PMID:24552512

  11. Toward an Improved Laboratory Definition of Listeria monocytogenes Virulence

    USDA-ARS's Scientific Manuscript database

    Listeria monocytogenes is an opportunistic foodborne pathogen that encompasses a diversity of strains with varied virulence. The ability to rapidly determine the pathogenic potential of L. monocytogenes strains is integral to the control and prevention campaign against listeriosis. Early methods for...

  12. Marky: a tool supporting annotation consistency in multi-user and iterative document annotation projects.

    PubMed

    Pérez-Pérez, Martín; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Lourenço, Anália

    2015-02-01

    Document annotation is a key task in the development of Text Mining methods and applications. High quality annotated corpora are invaluable, but their preparation requires a considerable amount of resources and time. Although the existing annotation tools offer good user interaction interfaces to domain experts, project management and quality control abilities are still limited. Therefore, the current work introduces Marky, a new Web-based document annotation tool equipped to manage multi-user and iterative projects, and to evaluate annotation quality throughout the project life cycle. At the core, Marky is a Web application based on the open source CakePHP framework. User interface relies on HTML5 and CSS3 technologies. Rangy library assists in browser-independent implementation of common DOM range and selection tasks, and Ajax and JQuery technologies are used to enhance user-system interaction. Marky grants solid management of inter- and intra-annotator work. Most notably, its annotation tracking system supports systematic and on-demand agreement analysis and annotation amendment. Each annotator may work over documents as usual, but all the annotations made are saved by the tracking system and may be further compared. So, the project administrator is able to evaluate annotation consistency among annotators and across rounds of annotation, while annotators are able to reject or amend subsets of annotations made in previous rounds. As a side effect, the tracking system minimises resource and time consumption. Marky is a novel environment for managing multi-user and iterative document annotation projects. Compared to other tools, Marky offers a similar visually intuitive annotation experience while providing unique means to minimise annotation effort and enforce annotation quality, and therefore corpus consistency. Marky is freely available for non-commercial use at http://sing.ei.uvigo.es/marky. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  13. Feeling Expression Using Avatars and Its Consistency for Subjective Annotation

    NASA Astrophysics Data System (ADS)

    Ito, Fuyuko; Sasaki, Yasunari; Hiroyasu, Tomoyuki; Miki, Mitsunori

    Consumer Generated Media (CGM) is growing rapidly and the amount of content is increasing. However, it is often difficult for users to extract important contents and the existence of contents recording their experiences can easily be forgotten. As there are no methods or systems to indicate the subjective value of the contents or ways to reuse them, subjective annotation appending subjectivity, such as feelings and intentions, to contents is needed. Representation of subjectivity depends on not only verbal expression, but also nonverbal expression. Linguistically expressed annotation, typified by collaborative tagging in social bookmarking systems, has come into widespread use, but there is no system of nonverbally expressed annotation on the web. We propose the utilization of controllable avatars as a means of nonverbal expression of subjectivity, and confirm the consistency of feelings elicited by avatars over time for an individual and in a group. In addition, we compared the expressiveness and ease of subjective annotation between collaborative tagging and controllable avatars. The result indicates that the feelings evoked by avatars are consistent in both cases, and using controllable avatars is easier than collaborative tagging for representing feelings elicited by contents that do not express meaning, such as photos.

  14. Precision annotation of digital samples in NCBI's gene expression omnibus.

    PubMed

    Hadley, Dexter; Pan, James; El-Sayed, Osama; Aljabban, Jihad; Aljabban, Imad; Azad, Tej D; Hadied, Mohamad O; Raza, Shuaib; Rayikanti, Benjamin Abhishek; Chen, Bin; Paik, Hyojung; Aran, Dvir; Spatz, Jordan; Himmelstein, Daniel; Panahiazar, Maryam; Bhattacharya, Sanchita; Sirota, Marina; Musen, Mark A; Butte, Atul J

    2017-09-19

    The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional genomics experiments amassed over almost two decades. However, individual sample meta-data remains poorly described by unstructured free-text attributes, preventing its large-scale reanalysis. We introduce the Search Tag Analyze Resource for GEO as a web application (http://STARGEO.org) to curate better annotations of sample phenotypes uniformly across different studies, and to use these sample annotations to define robust genomic signatures of disease pathology by meta-analysis. In this paper, we target a small group of biomedical graduate students to show rapid crowd-curation of precise sample annotations across all phenotypes, and we demonstrate the biological validity of these crowd-curated annotations for breast cancer. STARGEO.org makes GEO data findable, accessible, interoperable and reusable (i.e., FAIR) to ultimately facilitate knowledge discovery. Our work demonstrates the utility of crowd-curation and interpretation of open 'big data' under FAIR principles as a first step towards realizing an ideal paradigm of precision medicine.

  15. BioDEAL: community generation of biological annotations

    PubMed Central

    Breimyer, Paul; Green, Nathan; Kumar, Vinay; Samatova, Nagiza F

    2009-01-01

    Background: Publication databases in biomedicine (e.g., PubMed, MEDLINE) are growing rapidly in size every year, as are public databases of experimental biological data and annotations derived from the data. Publications often contain evidence that confirms or disproves annotations, such as putative protein functions; however, it is increasingly difficult for biologists to identify and process published evidence due to the volume of papers and the lack of a systematic approach to associate published evidence with experimental data and annotations. Natural Language Processing (NLP) tools can help address the growing divide by providing automatic high-throughput detection of simple terms in publication text. However, NLP tools are not mature enough to identify complex terms, relationships, or events. Results: In this paper we present and extend BioDEAL, a community evidence annotation system that introduces a feedback loop into the database-publication cycle to allow scientists to connect data-driven biological concepts to publications. Conclusion: BioDEAL may change the way biologists relate published evidence with experimental data. Instead of biologists or research groups searching and managing evidence independently, the community can collectively build and share this knowledge. PMID:19891799

  16. Virulence Factors of Erwinia amylovora: A Review

    PubMed Central

    Piqué, Núria; Miñana-Galbis, David; Merino, Susana; Tomás, Juan M.

    2015-01-01

    Erwinia amylovora, a Gram-negative bacterium of the Enterobacteriaceae family, is the causal agent of fire blight, a devastating plant disease affecting a wide range of host species within Rosaceae and a major global threat to commercial apple and pear production. Among the limited number of control options currently available, prophylactic application of antibiotics during the bloom period appears the most effective. Pathogen cells enter plants through the nectarthodes of flowers and other natural openings, such as wounds, and are capable of rapid movement within plants and the establishment of systemic infections. Many virulence determinants of E. amylovora have been characterized, including the Type III secretion system (T3SS), the exopolysaccharide (EPS) amylovoran, biofilm formation, and motility. To successfully establish an infection, E. amylovora uses a complex regulatory network to sense the relevant environmental signals and coordinate the expression of early and late stage virulence factors involving two-component signal transduction systems, bis-(3′-5′)-cyclic di-GMP (c-di-GMP) and quorum sensing. The LPS biosynthetic gene cluster is one of the relatively few genetic differences observed between Rubus- and Spiraeoideae-infecting genotypes of E. amylovora. Other differential factors, such as the presence and composition of an integrative conjugative element associated with the Hrp T3SS (hrp genes encoding the T3SS apparatus), have been recently described. In the present review, we present the recent findings on virulence factors research, focusing on their role in bacterial pathogenesis and indicating other virulence factors that deserve future research to characterize them. PMID:26057748

  17. Virulence Factors of Erwinia amylovora: A Review.

    PubMed

    Piqué, Núria; Miñana-Galbis, David; Merino, Susana; Tomás, Juan M

    2015-06-05

    Erwinia amylovora, a Gram-negative bacterium of the Enterobacteriaceae family, is the causal agent of fire blight, a devastating plant disease affecting a wide range of host species within Rosaceae and a major global threat to commercial apple and pear production. Among the limited number of control options currently available, prophylactic application of antibiotics during the bloom period appears the most effective. Pathogen cells enter plants through the nectarthodes of flowers and other natural openings, such as wounds, and are capable of rapid movement within plants and the establishment of systemic infections. Many virulence determinants of E. amylovora have been characterized, including the Type III secretion system (T3SS), the exopolysaccharide (EPS) amylovoran, biofilm formation, and motility. To successfully establish an infection, E. amylovora uses a complex regulatory network to sense the relevant environmental signals and coordinate the expression of early and late stage virulence factors involving two-component signal transduction systems, bis-(3'-5')-cyclic di-GMP (c-di-GMP) and quorum sensing. The LPS biosynthetic gene cluster is one of the relatively few genetic differences observed between Rubus- and Spiraeoideae-infecting genotypes of E. amylovora. Other differential factors, such as the presence and composition of an integrative conjugative element associated with the Hrp T3SS (hrp genes encoding the T3SS apparatus), have been recently described. In the present review, we present the recent findings on virulence factors research, focusing on their role in bacterial pathogenesis and indicating other virulence factors that deserve future research to characterize them.

  18. Investigating the "Trojan Horse" Mechanism of Yersinia pestis Virulence

    SciTech Connect

    McCutchen-Maloney, S L; Fitch, J P

    2005-02-08

    Yersinia pestis, the etiological agent of plague, is a Gram-negative, highly communicable, enteric bacterium that has been responsible for three historic plague pandemics. Currently, several thousand cases of plague are reported worldwide annually, and Y. pestis remains a considerable threat from a biodefense perspective. Y. pestis infection can manifest in three forms: bubonic, septicemic, and pneumonic plague. Of these three forms, pneumonic plague has the highest fatality rate (~100% if left untreated), the shortest intervention time (~24 hours), and is highly contagious. Currently, there are no rapid, widely available vaccines for plague and though plague may be treated with antibiotics, the emergence of both naturally occurring and potentially engineered antibiotic resistant strains makes the search for more effective therapies and vaccines for plague of pressing concern. The virulence mechanism of this deadly bacterium involves induction of a Type III secretion system, a syringe-like apparatus that facilitates the injection of virulence factors, termed Yersinia outer membrane proteins (Yops), into the host cell. These virulence factors inhibit phagocytosis and cytokine secretion, and trigger apoptosis of the host cell. Y. pestis virulence factors and the Type III secretion system are induced thermally, when the bacterium enters the mammalian host from the flea vector, and through host cell contact (or conditions of low Ca²⁺ in vitro). Apart from the temperature increase from 26 °C to 37 °C and host cell contact (or low Ca²⁺ conditions), other molecular mechanisms that influence virulence induction in Y. pestis are largely uncharacterized. This project focused on characterizing two novel mechanisms that regulate virulence factor induction in Y. pestis, immunoglobulin G (IgG) binding and quorum sensing, using a real-time reporter system to monitor induction of virulence. Incorporating a better understanding of the mechanisms of virulence...

  19. Sequencing and annotation of the Ophiostoma ulmi genome

    PubMed Central

    2013-01-01

    Background: The ascomycete fungus Ophiostoma ulmi was responsible for the initial pandemic of the massively destructive Dutch elm disease in Europe and North America in early 1910. Dutch elm disease has ravaged the elm tree population globally and is a major threat to the remaining elm population. O. ulmi is also associated with valuable biomaterials applications. It was recently discovered that proteins from O. ulmi can be used for efficient transformation of amylose in the production of bioplastics. Results: We have sequenced the 31.5 Mb genome of O. ulmi using Illumina next generation sequencing. Applying both de novo and comparative genome annotation methods, we predict a total of 8639 gene models. The quality of the predicted genes was validated using a variety of data sources consisting of EST data, mRNA-seq data and orthologs from related fungal species. Sequence-based computational methods were used to identify candidate virulence-related genes. Metabolic pathways were reconstructed and highlight specific enzymes that may play a role in virulence. Conclusions: This genome sequence will be a useful resource for further research aimed at understanding the molecular mechanisms of pathogenicity by O. ulmi. It will also facilitate the identification of enzymes necessary for industrial biotransformation applications. PMID:23496816

  20. Virulence and Pathogen Multiplication: A Serial Passage Experiment in the Hypervirulent Bacterial Insect-Pathogen Xenorhabdus nematophila

    PubMed Central

    Chapuis, Élodie; Pagès, Sylvie; Emelianoff, Vanya; Givaudan, Alain; Ferdy, Jean-Baptiste

    2011-01-01

    The trade-off hypothesis proposes that the evolution of pathogens' virulence is shaped by a link between virulence and contagiousness. This link is often assumed to come from the fact that pathogens are contagious only if they can reach high parasitic load in the infected host. In this paper we present an experimental test of the hypothesis that selection on fast replication can affect virulence. In a serial passage experiment, we selected 80 lines of the bacterial insect-pathogen Xenorhabdus nematophila to multiply fast in an artificial culture medium. This selection resulted in shortened lag phase in our selected bacteria. We then injected these bacteria into insects and observed an increase in virulence. This could be taken as a sign that virulence in Xenorhabdus is linked to fast multiplication. But we found, among the selected lineages, either no link or a positive correlation between lag duration and virulence: the most virulent bacteria were the last to start multiplying. We then surveyed phenotypes that are under the control of the flhDC super regulon, which has been shown to be involved in Xenorhabdus virulence. We found that, in one treatment, the flhDC regulon has evolved rapidly, but that the changes we observed were not connected to virulence. All together, these results indicate that virulence is, in Xenorhabdus as in many other pathogens, a multifactorial trait. Being able to grow fast is one way to be virulent. But other ways exist which renders the evolution of virulence hard to predict. PMID:21305003

  1. Cancer Survivorship for Primary Care Annotated Bibliography.

    PubMed

    Westfall, Matthew Y; Overholser, Linda; Zittleman, Linda; Westfall, John M

    2015-06-01

    Long-term cancer survivorship care is a relatively new and rapidly advancing field of research. Increasing cancer survivorship rates have created a huge population of long-term cancer survivors whose cancer-specific needs challenge healthcare infrastructure and highlight a significant deficit of knowledge and guidelines in transitional care from treatment to normalcy/prolonged survivorship. As the paradigm of cancer care has changed from a fixation on the curative to the maintenance of long-term overall quality of life, so too has the delineation of responsibility between oncologists and primary care physicians (PCPs). As more patients enjoy long-term survival, PCPs play a more comprehensive role in cancer care following acute treatment. To this end, this annotated bibliography was written to provide PCPs and other readers with an up-to-date and robust base of knowledge on long-term cancer survivorship, including definitions and epidemiological information as well as specific considerations and recommendations on physical, psychosocial, sexual, and comorbidity needs of survivors. Additionally, significant information is included on survivorship care, specifically Survivorship Care Plans (SCPs) and their evolution, utilization by oncologists and PCPs, and current gaps, as well as an introduction to patient navigation programs. Given rapid advancements in cancer research, this bibliography is meant to serve as current baseline reference outlining the state of the science.

  2. Semantic annotation in biomedicine: the current landscape.

    PubMed

    Jovanović, Jelena; Bagheri, Ebrahim

    2017-09-22

    The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of information and knowledge stored in such texts. Annotation of biomedical documents with machine intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant efforts in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state of the art biomedical semantic annotators, focusing particularly on general purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements of today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.

  3. Quality of Computationally Inferred Gene Ontology Annotations

    PubMed Central

    Škunca, Nives; Altenhoff, Adrian; Dessimoz, Christophe

    2012-01-01

    Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon—an important outcome given that >98% of all annotations are inferred without direct curation. PMID:22693439

  4. Collaborative Design of an Image Annotation Tool for Oceanographic Imaging Systems

    NASA Astrophysics Data System (ADS)

    Futrelle, J.; York, A.

    2012-12-01

    We present a design for a web-based image annotation interface developed to assist in supervised classification of organisms and substrate for habitat assessment from multiple, heterogeneous oceanographic imaging systems. The interface enables human image annotators to count, identify, and measure targets and classify substrate in a variety of kinds of imagery including benthic surveys and imaging flow cytometry. These annotations are then used to build training sets for supervised classification algorithms for purposes of characterizing community structure and habitat assessment. The Ocean Imaging Informatics team at WHOI used the Tetherless World Constellation's collaborative design methodology to develop a shared formal information model and system design that applies to a variety of image annotation use cases. Because the information model represents consensus between researchers with differing instrumentation and science needs, it assists with rapid prototyping and establishes a baseline against which existing and forthcoming image annotation tools can be evaluated. A technology review suggested that there are few general-purpose image annotation tools suitable for annotation of high-volume oceanographic imagery. Most tools require too many steps for operations that must be repeated thousands of times, and/or lack critical features such as display of instrument metadata, QA/QC, and management of annotator tasks. While some of these problems are user interface limitations, others suggest that existing tools are missing critically important concepts. For example, QA/QC appears in our information model as an "activity stream" associated with each image annotation, consisting of events indicating review status, specific image quality issues, etc. The model also includes "identification modes" that contextualize annotations according to the annotator's assigned task, assisting both with interpreting annotations and with providing contextual user interface shortcuts.

  5. Virulence of the zoonotic agent of leptospirosis: still terra incognita?

    PubMed

    Picardeau, Mathieu

    2017-05-01

    Pathogenic leptospires are the bacterial agents of leptospirosis, which is an emerging zoonotic disease that affects animals and humans worldwide. The success of leptospires as pathogens is explained by their spiral shape and endoflagellar motility (which enable these spirochetes to rapidly cross connective tissues and barriers), as well as by their ability to escape or hijack the host immune system. However, the basic biology and virulence factors of leptospires remain poorly characterized. In this Review, we discuss the recent advances in our understanding of the epidemiology, taxonomy, genomics and the molecular basis of virulence in leptospires, and how these properties contribute to the mechanism of pathogenesis of leptospirosis.

  6. The Staphylococcus aureus RNome and Its Commitment to Virulence

    PubMed Central

    Felden, Brice; Vandenesch, François; Bouloc, Philippe; Romby, Pascale

    2011-01-01

    Staphylococcus aureus is a major human pathogen causing a wide spectrum of nosocomial and community-associated infections with high morbidity and mortality. S. aureus generates a large number of virulence factors whose timing and expression levels are precisely tuned by regulatory proteins and RNAs. The aptitude of bacteria to use RNAs to rapidly modify gene expression, including virulence factors in response to stress or environmental changes, and to survive in a host is an evolving concept. Here, we focus on the recently inventoried S. aureus regulatory RNAs, with emphasis on those with identified functions, two of which are directly involved in pathogenicity. PMID:21423670

  7. Curation, integration and visualization of bacterial virulence factors in PATRIC

    PubMed Central

    Mao, Chunhong; Abraham, David; Wattam, Alice R.; Wilson, Meredith J.C.; Shukla, Maulik; Yoo, Hyun Seung; Sobral, Bruno W.

    2015-01-01

    Motivation: We’ve developed a highly curated bacterial virulence factor (VF) library in PATRIC (Pathosystems Resource Integration Center, www.patricbrc.org) to support infectious disease research. Although several VF databases are available, there is still a need to incorporate new knowledge found in published experimental evidence and integrate these data with other information known for these specific VF genes, including genomic and other omics data. This integration supports the identification of VFs, comparative studies and hypothesis generation, which facilitates the understanding of virulence and pathogenicity. Results: We have manually curated VFs from six prioritized NIAID (National Institute of Allergy and Infectious Diseases) category A–C bacterial pathogen genera, Mycobacterium, Salmonella, Escherichia, Shigella, Listeria and Bartonella, using published literature. This curated information on virulence has been integrated with data from genomic functional annotations, transcriptomic experiments, protein–protein interactions and disease information already present in PATRIC. Such integration gives researchers access to a broad array of information about these individual genes, and also to a suite of tools to perform comparative genomic and transcriptomics analyses that are available at PATRIC. Availability and implementation: All tools and data are freely available at PATRIC (http://patricbrc.org). Contact: cmao@vbi.vt.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25273106

  8. Prevalence and significance of plasmid maintenance functions in the virulence plasmids of pathogenic bacteria.

    PubMed

    Sengupta, Manjistha; Austin, Stuart

    2011-07-01

    Virulence functions of pathogenic bacteria are often encoded on large extrachromosomal plasmids. These plasmids are maintained at low copy number to reduce the metabolic burden on their host. Low-copy-number plasmids risk loss during cell division. This is countered by plasmid-encoded systems that ensure that each cell receives at least one plasmid copy. Plasmid replication and recombination can produce plasmid multimers that hinder plasmid segregation. These are removed by multimer resolution systems. Equitable distribution of the resulting monomers to daughter cells is ensured by plasmid partition systems that actively segregate plasmid copies to daughter cells in a process akin to mitosis in higher organisms. Any plasmid-free cells that still arise due to occasional failures of replication, multimer resolution, or partition are eliminated by plasmid-encoded postsegregational killing systems. Here we argue that all of these three systems are essential for the stable maintenance of large low-copy-number plasmids. Thus, they should be found on all large virulence plasmids. Where available, well-annotated sequences of virulence plasmids confirm this. Indeed, virulence plasmids often appear to contain more than one example conforming to each of the three system classes. Since these systems are essential for virulence, they can be regarded as ubiquitous virulence factors. As such, they should be informative in the search for new antibacterial agents and drug targets.

  9. Virulence Markers of Dengue Viruses

    DTIC Science & Technology

    1988-06-10

    Virulence Markers of Dengue Viruses. Annual report. James L. Hardy and Srisakul C. Kliks. June 10, 1988. Supported by U.S. Army Medical Research... Subject terms: dengue viruses, dengue hemorrhagic fever, virulence...

  10. TriAnnot: A Versatile and High Performance Pipeline for the Automated Annotation of Plant Genomes

    PubMed Central

    Leroy, Philippe; Guilhot, Nicolas; Sakai, Hiroaki; Bernard, Aurélien; Choulet, Frédéric; Theil, Sébastien; Reboux, Sébastien; Amano, Naoki; Flutre, Timothée; Pelegrin, Céline; Ohyanagi, Hajime; Seidel, Michael; Giacomoni, Franck; Reichstadt, Mathieu; Alaux, Michael; Gicquello, Emmanuelle; Legeai, Fabrice; Cerutti, Lorenzo; Numa, Hisataka; Tanaka, Tsuyoshi; Mayer, Klaus; Itoh, Takeshi; Quesneville, Hadi; Feuillet, Catherine

    2012-01-01

    In support of the international effort to obtain a reference sequence of the bread wheat genome and to provide plant communities dealing with large and complex genomes with a versatile, easy-to-use online automated tool for annotation, we have developed the TriAnnot pipeline. Its modular architecture allows for the annotation and masking of transposable elements, the structural, and functional annotation of protein-coding genes with an evidence-based quality indexing, and the identification of conserved non-coding sequences and molecular markers. The TriAnnot pipeline is parallelized on a 712 CPU computing cluster that can run a 1-Gb sequence annotation in less than 5 days. It is accessible through a web interface for small scale analyses or through a server for large scale annotations. The performance of TriAnnot was evaluated in terms of sensitivity, specificity, and general fitness using curated reference sequence sets from rice and wheat. In less than 8 h, TriAnnot was able to predict more than 83% of the 3,748 CDS from rice chromosome 1 with a fitness of 67.4%. On a set of 12 reference Mb-sized contigs from wheat chromosome 3B, TriAnnot predicted and annotated 93.3% of the genes among which 54% were perfectly identified in accordance with the reference annotation. It also allowed the curation of 12 genes based on new biological evidences, increasing the percentage of perfect gene prediction to 63%. TriAnnot systematically showed a higher fitness than other annotation pipelines that are not improved for wheat. As it is easily adaptable to the annotation of other plant genomes, TriAnnot should become a useful resource for the annotation of large and complex genomes in the future. PMID:22645565

  11. Identification of Virulence Determinants in Influenza Viruses

    PubMed Central

    2015-01-01

    To date there is no rapid method to screen for highly pathogenic avian influenza strains that may be indicators of future pandemics. We report here the first development of an oligonucleotide-based spectroscopic assay to rapidly and sensitively detect an N66S mutation in the gene coding for the PB1-F2 protein associated with increased virulence in highly pathogenic pandemic influenza viruses. 5′-Thiolated ssDNA oligonucleotides were employed as probes to capture RNA isolated from six influenza viruses, three having N66S mutations, two without the N66S mutation, and one deletion mutant not encoding the PB1-F2 protein. Hybridization was detected without amplification or labeling using the intrinsic surface-enhanced Raman spectrum of the DNA-RNA complex. Multivariate analysis identified target RNA binding from noncomplementary sequences with 100% sensitivity, 100% selectivity, and 100% correct classification in the test data set. These results establish that optical-based diagnostic methods are able to directly identify diagnostic indicators of virulence linked to highly pathogenic pandemic influenza viruses without amplification or labeling. PMID:24937567

  12. Omics data management and annotation.

    PubMed

    Harel, Arye; Dalah, Irina; Pietrokovski, Shmuel; Safran, Marilyn; Lancet, Doron

    2011-01-01

    Technological Omics breakthroughs, including next generation sequencing, bring avalanches of data which need to undergo effective data management to ensure integrity, security, and maximal knowledge-gleaning. Data management system requirements include flexible input formats, diverse data entry mechanisms and views, user friendliness, attention to standards, hardware and software platform definition, as well as robustness. Relevant solutions elaborated by the scientific community include Laboratory Information Management Systems (LIMS) and standardization protocols facilitating data sharing and managing. In project planning, special consideration has to be made when choosing relevant Omics annotation sources, since many of them overlap and require sophisticated integration heuristics. The data modeling step defines and categorizes the data into objects (e.g., genes, articles, disorders) and creates an application flow. A data storage/warehouse mechanism must be selected, such as file-based systems and relational databases, the latter typically used for larger projects. Omics project life cycle considerations must include the definition and deployment of new versions, incorporating either full or partial updates. Finally, quality assurance (QA) procedures must validate data and feature integrity, as well as system performance expectations. We illustrate these data management principles with examples from the life cycle of the GeneCards Omics project (http://www.genecards.org), a comprehensive, widely used compendium of annotative information about human genes. For example, the GeneCards infrastructure has recently been changed from text files to a relational database, enabling better organization and views of the growing data. Omics data handling benefits from the wealth of Web-based information, the vast amount of public domain software, increasingly affordable hardware, and effective use of data management and annotation principles as outlined in this chapter.

  13. Identification of two substrates of FTS_1067 protein - An essential virulence factor of Francisella tularensis.

    PubMed

    Spidlova, Petra; Senitkova, Iva; Link, Marek; Stulik, Jiri

    2016-11-15

    Francisella tularensis is a highly virulent intracellular pathogen with the capacity to infect a variety of hosts including humans. One of the most important proteins involved in F. tularensis virulence and pathogenesis is the protein DsbA. This protein is annotated as a lipoprotein with disulfide oxidoreductase/isomerase activity. Therefore, its interactions with different substrates, including probable virulence factors, to assist in their proper folding are anticipated. We aimed to use the immunopurification approach to find DsbA (gene locus FTS_1067) interacting partners in F. tularensis subsp. holarctica strain FSC200 and compare the identified substrates with proteins which were found in our previous comparative proteome analysis. As a result of our work two FTS_1067 substrates, D-alanyl-D-alanine carboxypeptidase family protein and HlyD family secretion protein, were identified. Bacterial two-hybrid systems were further used to test their relevance in confirming FTS_1067 protein interactions.

  14. Annotating Socio-Cultural Structures in Text

    DTIC Science & Technology

    2012-10-31

    from the traditional k-Nearest Neighbor (kNN) algorithm. Using experiments on three different multi-label learning problems, i.e. Yeast gene ...annotated NP/ VP Pane: Shows the sentence parsed using the Parts of Speech tagger Document View Pane: Specifies the document (being annotated) in three...used to annotate the document. In the current application we use the Level 1, Level 2 taxonomy. New concepts may be added to or deleted from the

  15. Crowd Control: Effectively Utilizing Unscreened Crowd Workers for Biomedical Data Annotation.

    PubMed

    Cocos, Anne; Qian, Ting; Callison-Burch, Chris; Masino, Aaron J

    2017-04-04

    Annotating unstructured texts in Electronic Health Records data is usually a necessary step for conducting machine learning research on such datasets. Manual annotation by domain experts provides data of the best quality, but has become increasingly impractical given the rapid increase in the volume of EHR data. In this article, we examine the effectiveness of crowdsourcing with unscreened online workers as an alternative for transforming unstructured texts in EHRs into annotated data that are directly usable in supervised learning models. We find the crowdsourced annotation data to be just as effective as expert data in training a sentence classification model to detect the mentioning of abnormal ear anatomy in radiology reports of audiology. Furthermore, we have discovered that enabling workers to self-report a confidence level associated with each annotation can help researchers pinpoint less-accurate annotations requiring expert scrutiny. Our findings suggest that even crowd workers without specific domain knowledge can contribute effectively to the task of annotating unstructured EHR datasets.

  16. Alga-PrAS (Algal Protein Annotation Suite): A Database of Comprehensive Annotation in Algal Proteomes

    PubMed Central

    Kurotani, Atsushi; Yamada, Yutaka

    2017-01-01

    Algae are smaller organisms than land plants and offer clear advantages in research over terrestrial species in terms of rapid production, short generation time and varied commercial applications. Thus, studies investigating the practical development of effective algal production are important and will improve our understanding of both aquatic and terrestrial plants. In this study we estimated multiple physicochemical and secondary structural properties of protein sequences, the predicted presence of post-translational modification (PTM) sites, and subcellular localization using a total of 510,123 protein sequences from the proteomes of 31 algal and three plant species. Algal species were broadly selected from green and red algae, glaucophytes, oomycetes, diatoms and other microalgal groups. The results were deposited in the Algal Protein Annotation Suite database (Alga-PrAS; http://alga-pras.riken.jp/), which can be freely accessed online. PMID:28069893

  17. Alga-PrAS (Algal Protein Annotation Suite): A Database of Comprehensive Annotation in Algal Proteomes.

    PubMed

    Kurotani, Atsushi; Yamada, Yutaka; Sakurai, Tetsuya

    2017-01-01

    Algae are smaller organisms than land plants and offer clear advantages in research over terrestrial species in terms of rapid production, short generation time and varied commercial applications. Thus, studies investigating the practical development of effective algal production are important and will improve our understanding of both aquatic and terrestrial plants. In this study we estimated multiple physicochemical and secondary structural properties of protein sequences, the predicted presence of post-translational modification (PTM) sites, and subcellular localization using a total of 510,123 protein sequences from the proteomes of 31 algal and three plant species. Algal species were broadly selected from green and red algae, glaucophytes, oomycetes, diatoms and other microalgal groups. The results were deposited in the Algal Protein Annotation Suite database (Alga-PrAS; http://alga-pras.riken.jp/), which can be freely accessed online. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  18. A beginner's guide to eukaryotic genome annotation.

    PubMed

    Yandell, Mark; Ence, Daniel

    2012-04-18

    The falling cost of genome sequencing is having a marked impact on the research community with respect to which genomes are sequenced and how and where they are annotated. Genome annotation projects have generally become small-scale affairs that are often carried out by an individual laboratory. Although annotating a eukaryotic genome assembly is now within the reach of non-experts, it remains a challenging task. Here we provide an overview of the genome annotation process and the available tools and describe some best-practice approaches.

  19. Variation in virulence of Beauveria bassiana and B. pseudobassiana to the pine weevil Pissodes nemorensis in relation to mycelium characteristics and virulence genes.

    PubMed

    Romón, Pedro; Hatting, Hardus; Goldarazena, Arturo; Iturrondobeitia, Juan Carlos

    2017-02-01

    Entomopathogenic fungi such as Beauveria spp. have potential applications in the biocontrol of insect pests but little is known regarding their infectivity to the pine weevil Pissodes nemorensis. In this study, five isolates of Beauveria pseudobassiana and five isolates of Beauveria bassiana were tested for characteristics correlating with virulence on P. nemorensis. Isolate UAMH301 had the lowest mean lethal concentration value whereas the highest value was obtained with isolate LRC137. Growth rate was negatively correlated with virulence in B. bassiana, because isolate LRC137, the least virulent isolate, grew much more rapidly than the other B. bassiana isolates on SDYA. In contrast, its growth on a hyperosmotic medium was the slowest. Sporulation rate and conidial area were not correlated with virulence. Mycelial cell density was positively correlated with virulence in both species, and the four tested genes appear to be single-copy genes. The induced relative expression levels of Bbchit1 and Bbhog1, genes encoding a chitinase and a protein kinase, respectively, were positively correlated with virulence in B. pseudobassiana. We discuss these results in terms of previously reported morphological, physiological and genetic parameters related to virulence in Beauveria, and the importance of testing the expression of putative virulence genes in comparison with their basal transcript levels.

  20. Parallel Patterns of Increased Virulence in a Recently Emerged Wildlife Pathogen

    PubMed Central

    Hawley, Dana M.; Osnas, Erik E.; Dobson, Andrew P.; Hochachka, Wesley M.; Ley, David H.; Dhondt, André A.

    2013-01-01

    The evolution of higher virulence during disease emergence has been predicted by theoretical models, but empirical studies of short-term virulence evolution following pathogen emergence remain rare. Here we examine patterns of short-term virulence evolution using archived isolates of the bacterium Mycoplasma gallisepticum collected during sequential emergence events in two geographically distinct populations of the host, the North American house finch (Haemorhous [formerly Carpodacus] mexicanus). We present results from two complementary experiments, one that examines the trend in pathogen virulence in eastern North American isolates over the course of the eastern epidemic (1994–2008), and the other a parallel experiment on Pacific coast isolates of the pathogen collected after M. gallisepticum established itself in western North American house finch populations (2006–2010). Consistent with theoretical expectations regarding short-term or dynamic evolution of virulence, we show rapid increases in pathogen virulence on both coasts following the pathogen's establishment in each host population. We also find evidence for positive genetic covariation between virulence and pathogen load, a proxy for transmission potential, among isolates of M. gallisepticum. As predicted by theory, indirect selection for increased transmission likely drove the evolutionary increase in virulence in both geographic locations. Our results provide one of the first empirical examples of rapid changes in virulence following pathogen emergence, and both the detected pattern and mechanism of positive genetic covariation between virulence and pathogen load are consistent with theoretical expectations. Our study provides unique empirical insight into the dynamics of short-term virulence evolution that are likely to operate in other emerging pathogens of wildlife and humans. PMID:23723736

  1. Evolution of virulence in heterogeneous host communities under multiple trade-offs.

    PubMed

    Osnas, Erik E; Dobson, Andrew P

    2012-02-01

    Many pathogens and parasites are transmitted through hosts that differ in species, sex, genotype, or immune status. In addition, virulence (here defined as disease-induced mortality) and transmission can vary during the infectious period within hosts of different state. Most models of virulence evolution assume that transmission and virulence are constant over the infectious period and that the host population is homogeneous. Here, we examine a multispecies susceptible-infected-recovered (SIR) model where transmission occurs within and between species, and transmission and virulence vary during the infectious period. This allows us to understand virulence evolution in a broader range of situations that characterize many emerging diseases. Because emerging pathogens are by definition new to their host populations, they should be expected to rapidly adapt after emergence. We illustrate these evolutionary effects using the framework of adaptive dynamics to examine how virulence evolves after emergence in response to the relative strength of selection on pathogen fitness and mutational variance for virulence. We illustrate the role of evolution by simulating adaptive walks to an evolutionarily stable virulence. We found that the magnitude of between-species transmission and the relative timing of transmission and mortality across species were of primary importance for determining the evolutionarily stable virulence. © 2011 The Author(s). Evolution © 2011 The Society for the Study of Evolution.
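    The multispecies SIR structure described in this abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' model: it assumes two host species, holds transmission and virulence constant over the infectious period (the paper lets them vary), and uses hypothetical parameter names and values throughout.

```python
# Two-host SIR sketch with within- and between-species transmission.
# Simplification: rates are constant over the infectious period.
# All names and parameter values are hypothetical, for illustration only.

def simulate(beta, gamma, alpha, s0, i0, dt=0.01, steps=20000):
    """Euler-integrate a two-species SIR model.

    beta[k][j]: transmission rate from infected hosts of species j
                to susceptible hosts of species k
    gamma[k]:   recovery rate in species k
    alpha[k]:   virulence (disease-induced mortality rate) in species k
    """
    s, i, r = list(s0), list(i0), [0.0, 0.0]
    for _ in range(steps):
        new_s, new_i, new_r = [], [], []
        for k in range(2):
            # Force of infection on species k from both species
            force = sum(beta[k][j] * i[j] for j in range(2))
            infections = force * s[k]
            removal = (gamma[k] + alpha[k]) * i[k]
            new_s.append(s[k] - dt * infections)
            new_i.append(i[k] + dt * (infections - removal))
            new_r.append(r[k] + dt * gamma[k] * i[k])
        s, i, r = new_s, new_i, new_r
    return s, i, r


# Example run: the pathogen starts in species 0 only and reaches
# species 1 through the off-diagonal (between-species) terms of beta.
beta = [[0.5, 0.1], [0.1, 0.3]]
s, i, r = simulate(beta, gamma=[0.1, 0.1], alpha=[0.05, 0.2],
                   s0=[0.99, 0.99], i0=[0.01, 0.0])
```

    Setting the off-diagonal entries of beta to zero decouples the two species, which is one simple way to see how much between-species transmission matters in such a model.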

  2. The quick and the deadly: growth vs virulence in a seed bank pathogen.

    PubMed

    Meyer, Susan E; Stewart, Thomas E; Clement, Suzette

    2010-07-01

    We studied the relationship between virulence (ability to kill nondormant Bromus tectorum seeds) and mycelial growth index in the necrotrophic seed pathogen Pyrenophora semeniperda. Seed pathosystems involving necrotrophs differ from those commonly treated in traditional evolution-of-virulence models in that host death increases pathogen fitness by preventing germination, thereby increasing available resources. Because fast-germinating, nondormant B. tectorum seeds commonly escape mortality, we expected virulence to be positively correlated with mycelial growth index. We performed seed inoculations using conidia from 78 pathogen isolates and scored subsequent mortality. For a subset of 40 of these isolates, representing a range of virulence phenotypes, we measured mycelial growth index. Virulence varied over a wide range (3-43% seed mortality) and was significantly negatively correlated with mycelial growth index (R² = 0.632). More virulent isolates grew more slowly than less virulent isolates. We concluded that there is an apparent tradeoff between virulence and growth in this pathogen, probably because the production of toxins necessary for necrotrophic pathogenesis competes with metabolic processes associated with growth. Variation in both virulence and growth rate in this pathosystem may be maintained in part by seasonal variation in the relative abundance of rapidly germinating vs dormant host seeds available to the pathogen.

  3. Quorum sensing in bacterial virulence.

    PubMed

    Antunes, L Caetano M; Ferreira, Rosana B R; Buckner, Michelle M C; Finlay, B Brett

    2010-08-01

    Bacteria communicate through the production of diffusible signal molecules termed autoinducers. The molecules are produced at basal levels and accumulate during growth. Once a critical concentration has been reached, autoinducers can activate or repress a number of target genes. Because the control of gene expression by autoinducers is cell-density-dependent, this phenomenon has been called quorum sensing. Quorum sensing controls virulence gene expression in numerous micro-organisms. In some cases, this phenomenon has proven relevant for bacterial virulence in vivo. In this article, we provide a few examples to illustrate how quorum sensing can act to control bacterial virulence in a multitude of ways. Several classes of autoinducers have been described to date and we present examples of how each of the major types of autoinducer can be involved in bacterial virulence. As quorum sensing controls virulence, it has been considered an attractive target for the development of new therapeutic strategies. We discuss some of the new strategies to combat bacterial virulence based on the inhibition of bacterial quorum sensing systems.

  4. Genetic Regulation of Virulence and Antibiotic Resistance in Acinetobacter baumannii.

    PubMed

    Kröger, Carsten; Kary, Stefani C; Schauer, Kristina; Cameron, Andrew D S

    2016-12-28

    Multidrug resistant microorganisms are forecast to become the single biggest challenge to medical care in the 21st century. Over the last decades, members of the genus Acinetobacter have emerged as bacterial opportunistic pathogens, in particular as challenging nosocomial pathogens because of the rapid evolution of antimicrobial resistances. Although we lack fundamental biological insight into virulence mechanisms, an increasing number of researchers are working to identify virulence factors and to study antibiotic resistance. Here, we review current knowledge regarding the regulation of virulence genes and antibiotic resistance in Acinetobacter baumannii. A survey of the two-component systems AdeRS, BaeSR, GacSA and PmrAB explains how each contributes to antibiotic resistance and virulence gene expression, while BfmRS regulates cell envelope structures important for pathogen persistence. A. baumannii uses the transcription factors Fur and Zur to sense iron or zinc depletion and upregulate genes for metal scavenging as a critical survival tool in an animal host. Quorum sensing, nucleoid-associated proteins, and non-classical transcription factors such as AtfA and small regulatory RNAs are discussed in the context of virulence and antibiotic resistance.

  5. Genes involved in virulence of the entomopathogenic fungus Beauveria bassiana.

    PubMed

    Valero-Jiménez, Claudio A; Wiegers, Harm; Zwaan, Bas J; Koenraadt, Constantianus J M; van Kan, Jan A L

    2016-01-01

    Pest insects cause severe damage to global crop production and pose a threat to human health by transmitting diseases. Traditionally, chemical pesticides (insecticides) have been used to control such pests and have proven to be effective only for a limited amount of time because of the rapid spread of genetic insecticide resistance. The basis of this resistance is mostly caused by (co)dominant mutations in single genes, which explains why insecticide use alone is an unsustainable solution. Therefore, robust solutions for insect pest control need to be sought in alternative methods such as biological control agents for which single-gene resistance is less likely to evolve. The entomopathogenic fungus Beauveria bassiana has shown potential as a biological control agent of insects, and insight into the mechanisms of virulence is essential to show the robustness of its use. With the recent availability of the whole genome sequence of B. bassiana, progress in understanding the genetics that constitute virulence toward insects can be made more quickly. In this review we divide the infection process into distinct steps and provide an overview of what is currently known about genes and mechanisms influencing virulence in B. bassiana. We also discuss the need for novel strategies and experimental methods to better understand the infection mechanisms deployed by entomopathogenic fungi. Such knowledge can help improve biocontrol agents, not only by selecting the most virulent genotypes, but also by selecting the genotypes that use combinations of virulence mechanisms for which resistance in the insect host is least likely to develop.

  6. Genetic Regulation of Virulence and Antibiotic Resistance in Acinetobacter baumannii

    PubMed Central

    Kröger, Carsten; Kary, Stefani C.; Schauer, Kristina; Cameron, Andrew D. S.

    2016-01-01

    Multidrug resistant microorganisms are forecast to become the single biggest challenge to medical care in the 21st century. Over the last decades, members of the genus Acinetobacter have emerged as bacterial opportunistic pathogens, in particular as challenging nosocomial pathogens because of the rapid evolution of antimicrobial resistances. Although we lack fundamental biological insight into virulence mechanisms, an increasing number of researchers are working to identify virulence factors and to study antibiotic resistance. Here, we review current knowledge regarding the regulation of virulence genes and antibiotic resistance in Acinetobacter baumannii. A survey of the two-component systems AdeRS, BaeSR, GacSA and PmrAB explains how each contributes to antibiotic resistance and virulence gene expression, while BfmRS regulates cell envelope structures important for pathogen persistence. A. baumannii uses the transcription factors Fur and Zur to sense iron or zinc depletion and upregulate genes for metal scavenging as a critical survival tool in an animal host. Quorum sensing, nucleoid-associated proteins, and non-classical transcription factors such as AtfA and small regulatory RNAs are discussed in the context of virulence and antibiotic resistance. PMID:28036056

  7. BBP: Brucella genome annotation with literature mining and curation.

    PubMed

    Xiang, Zuoshuang; Zheng, Wenjie; He, Yongqun

    2006-07-16

    Brucella species are Gram-negative, facultative intracellular bacteria that cause brucellosis in humans and animals. Sequences of four Brucella genomes have been published, and various Brucella gene and genome data and analysis resources exist. A web gateway to integrate these resources will greatly facilitate Brucella research. Brucella genome data in current databases is largely derived from computational analysis without experimental validation typically found in peer-reviewed publications. It is partially due to the lack of a literature mining and curation system able to efficiently incorporate the large amount of literature data into genome annotation. It is further hypothesized that literature-based Brucella gene annotation would increase understanding of complicated Brucella pathogenesis mechanisms. The Brucella Bioinformatics Portal (BBP) is developed to integrate existing Brucella genome data and analysis tools with literature mining and curation. The BBP InterBru database and Brucella Genome Browser allow users to search and analyze genes of 4 currently available Brucella genomes and link to more than 20 existing databases and analysis programs. Brucella literature publications in PubMed are extracted and can be searched by a TextPresso-powered natural language processing method, a MeSH browser, a keywords search, and an automatic literature update service. To efficiently annotate Brucella genes using the large amount of literature publications, a literature mining and curation system coined Limix is developed to integrate computational literature mining methods with a PubSearch-powered manual curation and management system. The Limix system is used to quickly find and confirm 107 Brucella gene mutations including 75 genes shown to be essential for Brucella virulence. The 75 genes are further clustered using COG. In addition, 62 Brucella genetic interactions are extracted from literature publications. These results make possible more comprehensive

  8. Pooling annotated corpora for clinical concept extraction

    PubMed Central

    2013-01-01

    Background: The availability of annotated corpora has facilitated the application of machine learning algorithms to concept extraction from clinical notes. However, high expenditure and labor are required for creating the annotations. A potential alternative is to reuse existing corpora from other institutions by pooling them with local corpora for training machine taggers. In this paper we have investigated the latter approach by pooling corpora from the 2010 i2b2/VA NLP challenge and Mayo Clinic Rochester to evaluate taggers for recognition of medical problems. The corpora were annotated for medical problems, but with different guidelines. The taggers were constructed using an existing tagging system, MedTagger, that consisted of dictionary lookup, part of speech (POS) tagging and machine learning for named entity prediction and concept extraction. We hope that our current work will be a useful case study for facilitating reuse of annotated corpora across institutions. Results: We found that pooling was effective when the size of the local corpus was small and after some of the guideline differences were reconciled. The benefits of pooling, however, diminished as more locally annotated documents were included in the training data. We examined the annotation guidelines to identify factors that determine the effect of pooling. Conclusions: The effectiveness of pooling corpora is dependent on several factors, which include compatibility of annotation guidelines, distribution of report types and size of local and foreign corpora. Simple methods to rectify some of the guideline differences can facilitate pooling. Our findings need to be confirmed with further studies on different corpora. To facilitate the pooling and reuse of annotated corpora, we suggest that – i) the NLP community should develop a standard annotation guideline that addresses the potential areas of guideline differences that are partly identified in this paper; ii) corpora should be annotated with a two

  9. New in protein structure and function annotation: hotspots, single nucleotide polymorphisms and the 'Deep Web'.

    PubMed

    Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard

    2009-05-01

    The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.

  10. Sophia: A Expedient UMLS Concept Extraction Annotator

    PubMed Central

    Divita, Guy; Zeng, Qing T; Gundlapalli, Adi V.; Duvall, Scott; Nebeker, Jonathan; Samore, Matthew H.

    2014-01-01

    An opportunity exists for meaningful concept extraction and indexing from large corpora of clinical notes in the Veterans Affairs (VA) electronic medical record. Currently available tools such as MetaMap, cTAKES and HITex do not scale up to address this big data need. Sophia, a rapid UMLS concept extraction annotator, was developed to fulfill a mandate and address extraction where high throughput is needed while preserving performance. We report on the development, testing and benchmarking of Sophia against MetaMap and cTAKES. Sophia demonstrated improved performance on recall as compared to cTAKES and MetaMap (0.71 vs 0.66 and 0.38). The overall f-score was similar to cTAKES and an improvement over MetaMap (0.53 vs 0.57 and 0.43). With regard to speed of processing records, we noted Sophia to be several fold faster than cTAKES and the scaled-out MetaMap service. Sophia offers a viable alternative for high-throughput information extraction tasks. PMID:25954351

  11. Harnessing Collaborative Annotations on Online Formative Assessments

    ERIC Educational Resources Information Center

    Lin, Jian-Wei; Lai, Yuan-Cheng

    2013-01-01

    This paper harnesses collaborative annotations by students as learning feedback on online formative assessments to improve the learning achievements of students. Through the developed Web platform, students can conduct formative assessments, collaboratively annotate, and review historical records in a convenient way, while teachers can generate…

  12. Annotated Catalog of Bilingual Vocational Training Materials.

    ERIC Educational Resources Information Center

    Miranda (L.) and Associates, Bethesda, MD.

    This catalog contains annotations for 170 bilingual vocational training materials. Most of the materials are written in English, but materials written in 13 source languages and directed toward speakers of 17 target languages are provided. Annotations are provided for the following different types of documents: administrative, assessment and…

  13. Interactive Electronic Technical Manuals (IETMs) Annotated Bibliography

    DTIC Science & Technology

    2002-10-22

    Copyright 2002, Carnegie Mellon University, October 2002. Interactive Electronic Technical Manuals (IETMs) Annotated Bibliography... Interactive Electronic Technical Manuals (IETMs). It focuses especially on natural language dialog and speech recognition for use in tutoring, training...

  14. Elementary Social Studies. Authorized Resources Annotated List.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton. Curriculum Standards Branch.

    This comprehensive, annotated resource list is designed to assist in selecting resources authorized by the Alberta (Canada) Education Department for the elementary social studies classroom. Within each grade and topic, annotated entries for basic learning resources are listed, followed by support learning resources and authorized teaching…

  15. Elementary Health: Authorized Resources Annotated List.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton. Curriculum Standards Branch.

    This comprehensive, annotated resource list is designed to assist in selecting resources authorized by the Alberta (Canada) Education Department for the elementary health classroom (Grades 1-6). Within each grade and topic, annotated entries for basic learning resources are listed, followed by support learning resources and authorized teaching…

  16. Annotation as an Index to Critical Writing

    ERIC Educational Resources Information Center

    Liu, Keming

    2006-01-01

    The differences in the ability to write critical and analytical essays among students with individual annotation styles were investigated. Critical and analytical writing was determined by the writer's ability to respond to a text with logical and critical analysis and attention to its thematic argument. Annotation styles were determined by ways…

  17. Language Intensity: A Comprehensive, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Preiss, Raymond W.

    Noting that message variables offer communication scholars a conceptually rich body of information, this 30-item annotated bibliography reflects the diversity of research conducted in the area of language intensity. The journal articles, conference papers, and chapters of books in the annotated bibliography are divided into sections on general…

  18. Black English Annotations for Elementary Reading Programs.

    ERIC Educational Resources Information Center

    Prasad, Sandre

    This report describes a program that uses annotations in the teacher's editions of existing reading programs to indicate the characteristics of black English that may interfere with the reading process of black children. The first part of the report provides a rationale for the annotation approach, explaining that the discrepancy between written…

  20. Assisted annotation of medical free text using RapTAT.

    PubMed

    Gobbel, Glenn T; Garvin, Jennifer; Reeves, Ruth; Cronin, Robert M; Heavirland, Julia; Williams, Jenifer; Weaver, Allison; Jayaramaraja, Shrimalini; Giuse, Dario; Speroff, Theodore; Brown, Steven H; Xu, Hua; Matheny, Michael E

    2014-01-01

    To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias. A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19-21 documents for iterative annotation and training. The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ~50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85). The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias. Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%.

  1. Assisted annotation of medical free text using RapTAT

    PubMed Central

    Gobbel, Glenn T; Garvin, Jennifer; Reeves, Ruth; Cronin, Robert M; Heavirland, Julia; Williams, Jenifer; Weaver, Allison; Jayaramaraja, Shrimalini; Giuse, Dario; Speroff, Theodore; Brown, Steven H; Xu, Hua; Matheny, Michael E

    2014-01-01

    Objective To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias. Materials and methods A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19–21 documents for iterative annotation and training. Results The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ∼50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85). Discussion The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias. Conclusions Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%. PMID:24431336
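
    The abstract reports pre-annotation quality as an F-measure but does not say how it was computed. As a minimal sketch, assuming the usual definition (the weighted harmonic mean of precision and recall, balanced when beta = 1), the score can be derived from raw match counts. The function name and the counts in the usage line are illustrative and are not taken from the RapTAT study.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int, beta: float = 1.0) -> float:
    """Return the F-beta score from raw counts; beta=1 gives the balanced F1."""
    if true_pos == 0:
        # No correct pre-annotations: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # share of pre-annotations that were correct
    recall = true_pos / (true_pos + false_neg)     # share of reference annotations that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical batch: 8 correct pre-annotations, 2 spurious, 2 missed.
print(f_measure(8, 2, 2))
```

    Because the harmonic mean is pulled towards the lower of the two values, a tool cannot reach a high F-measure by over-annotating (high recall, low precision) or by annotating only the easiest phrases (high precision, low recall).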

  2. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop.

    PubMed

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-10-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world's biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.

  3. Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop

    PubMed Central

    Brister, James Rodney; Bao, Yiming; Kuiken, Carla; Lefkowitz, Elliot J.; Le Mercier, Philippe; Leplae, Raphael; Madupu, Ramana; Scheuermann, Richard H.; Schobel, Seth; Seto, Donald; Shrivastava, Susmita; Sterk, Peter; Zeng, Qiandong; Klimke, William; Tatusova, Tatiana

    2010-01-01

    Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop. PMID:21994619

  4. Controlled annotations for systems biology.

    PubMed

    Juty, Nick; Laibe, Camille; Le Novère, Nicolas

    2013-01-01

    The aim of this chapter is to provide sufficient information to enable a reader, new to the subject of Systems Biology, to create and use effectively controlled annotations, using resolvable Identifiers.org Uniform Resource Identifiers (URIs). The text details the underlying requirements that have led to the development of such an identification scheme and infrastructure, the principles that underpin its syntax and the benefits derived through its use. It also places into context the relationship with other standardization efforts, how it differs from other pre-existing identification schemes, recent improvements to the system, as well as those that are planned in the future. Throughout, the reader is provided with explicit examples of use and directed to supplementary information where necessary.
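
    To make the idea of a resolvable annotation URI concrete, the sketch below builds an Identifiers.org URI from a data-collection namespace and a record identifier. It assumes the path-style form `http://identifiers.org/<namespace>/<identifier>`; the helper name is hypothetical, and the resolver's currently preferred URI form may differ, so treat this as an illustration rather than a reference for the service.

```python
from urllib.parse import quote


def identifiers_org_uri(namespace: str, identifier: str) -> str:
    """Build a path-style Identifiers.org URI (assumed form, see lead-in)."""
    if not namespace or not identifier:
        raise ValueError("namespace and identifier must be non-empty")
    if any(ch.isspace() for ch in namespace + identifier):
        raise ValueError("namespace and identifier must not contain whitespace")
    # Percent-encode anything unusual in the identifier, but keep ':' readable.
    return f"http://identifiers.org/{namespace}/{quote(identifier, safe=':')}"


# Example: a UniProt accession used as a controlled annotation.
print(identifiers_org_uri("uniprot", "P12345"))
```

    Keeping the namespace and the identifier as separate arguments mirrors the point of the scheme: the annotation names the data collection and the record, not the address of any one database mirror.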

  5. Francisella tularensis novicida proteomic and transcriptomic data integration and annotation based on semantic web technologies

    PubMed Central

    Anwar, Nadia; Hunt, Ela

    2009-01-01

    Background This paper summarises the lessons and experiences gained from a case study of the application of semantic web technologies to the integration of data from the bacterial species Francisella tularensis novicida (Fn). Fn data sources are disparate and heterogeneous, as multiple laboratories across the world, using multiple technologies, perform experiments to understand the mechanism of virulence. It is hard to integrate these data sources in a flexible manner that allows new experimental data to be added and compared when required. Results Public domain data sources were combined in RDF. Using this connected graph of database cross references, we extended the annotations of an experimental data set by superimposing onto it the annotation graph. Identifiers used in the experimental data automatically resolved and the data acquired annotations in the rest of the RDF graph. This happened without the expensive manual annotation that would normally be required to produce these links. This graph of resolved identifiers was then used to combine two experimental data sets, a proteomics experiment and a transcriptomic experiment studying the mechanism of virulence through the comparison of wildtype Fn with an avirulent mutant strain. Conclusion We produced a graph of Fn cross references which enabled the combination of two experimental datasets. Through combination of these data we are able to perform queries that compare the results of the two experiments. We found that data are easily combined in RDF and that experimental results are easily compared when the data are integrated. We conclude that semantic data integration offers a convenient, simple and flexible solution to the integration of published and unpublished experimental data. PMID:19796400
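
    The integration step described above, in which two experiments are combined through shared identifiers in one graph, can be sketched without any RDF library by treating each statement as a (subject, predicate, object) triple. All identifiers, predicate names, and values below are invented for illustration; they are not data from the study, and a real implementation would use an RDF store and a query language such as SPARQL.

```python
# Hypothetical triples from two experiments, keyed by a shared gene identifier.
proteomics = {
    ("gene:A", "proteinFoldChange", 2.1),
    ("gene:B", "proteinFoldChange", 0.4),
}
transcriptomics = {
    ("gene:A", "transcriptFoldChange", 1.8),
    ("gene:C", "transcriptFoldChange", 3.0),
}

# Merging RDF-style graphs is a set union: statements about the same
# subject line up automatically because they share the identifier.
graph = proteomics | transcriptomics


def measured_in_both(triples):
    """Return subjects that carry both a protein and a transcript measurement."""
    by_subject = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, {})[predicate] = obj
    wanted = {"proteinFoldChange", "transcriptFoldChange"}
    return {s: props for s, props in by_subject.items() if wanted <= props.keys()}


print(measured_in_both(graph))
```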

  6. Evaluation of mycobacterial virulence using rabbit skin liquefaction model

    PubMed Central

    Zhang, Guoping; Shi, Wanliang; Wang, Mingzhu; Da, Zejiao

    2010-01-01

    Liquefaction is an important pathological process that can subsequently lead to cavitation where large numbers of bacilli can be coughed up which in turn causes spread of tuberculosis in humans. Current animal models to study the liquefaction process and to evaluate virulence of mycobacteria are tedious. In this study, we evaluated a rabbit skin model as a rapid model for liquefaction and virulence assessment using M. bovis BCG, M. tuberculosis avirulent strain H37Ra, M. smegmatis, and the H37Ra strains complemented with selected genes from virulent M. tuberculosis strain H37Rv. We found that with prime and/or boosting immunization, all of these live bacteria in sufficiently high numbers could induce liquefaction, and the boosting induced stronger liquefaction and more severe lesions in shorter time compared with the prime injection. The skin lesions caused by high dose live BCG (5 × 10^6 CFU) were the most severe followed by live M. tuberculosis H37Ra with M. smegmatis being the least pathogenic. It is of interest to note that none of the above heat-killed mycobacteria induced liquefaction. When H37Ra was complemented with certain wild type genes of H37Rv, some of the complemented H37Ra strains produced more severe skin lesions than H37Ra. These results suggest that the rabbit skin liquefaction model can be a more visual, convenient, rapid and useful model to evaluate virulence of different mycobacteria and to study the mechanisms of liquefaction. PMID:21178434

  7. Evaluation of mycobacterial virulence using rabbit skin liquefaction model.

    PubMed

    Zhang, Guoping; Zhu, Bingdong; Shi, Wanliang; Wang, Mingzhu; Da, Zejiao; Zhang, Ying

    2010-01-01

    Liquefaction is an important pathological process that can subsequently lead to cavitation where large numbers of bacilli can be coughed up which in turn causes spread of tuberculosis in humans. Current animal models to study the liquefaction process and to evaluate virulence of mycobacteria are tedious. In this study, we evaluated a rabbit skin model as a rapid model for liquefaction and virulence assessment using M. bovis BCG, M. tuberculosis avirulent strain H37Ra, M. smegmatis, and the H37Ra strains complemented with selected genes from virulent M. tuberculosis strain H37Rv. We found that with prime and/or boosting immunization, all of these live bacteria in sufficiently high numbers could induce liquefaction, and the boosting induced stronger liquefaction and more severe lesions in shorter time compared with the prime injection. The skin lesions caused by high dose live BCG (5 × 10^6) were the most severe followed by live M. tuberculosis H37Ra with M. smegmatis being the least pathogenic. It is of interest to note that none of the above heat-killed mycobacteria induced liquefaction. When H37Ra was complemented with certain wild type genes of H37Rv, some of the complemented H37Ra strains produced more severe skin lesions than H37Ra. These results suggest that the rabbit skin liquefaction model can be a more visual, convenient, rapid and useful model to evaluate virulence of different mycobacteria and to study the mechanisms of liquefaction.

  8. A transmission-virulence evolutionary trade-off explains attenuation of HIV-1 in Uganda.

    PubMed

    Blanquart, François; Grabowski, Mary Kate; Herbeck, Joshua; Nalugoda, Fred; Serwadda, David; Eller, Michael A; Robb, Merlin L; Gray, Ronald; Kigozi, Godfrey; Laeyendecker, Oliver; Lythgoe, Katrina A; Nakigozi, Gertrude; Quinn, Thomas C; Reynolds, Steven J; Wawer, Maria J; Fraser, Christophe

    2016-11-05

    Evolutionary theory hypothesizes that intermediate virulence maximizes pathogen fitness as a result of a trade-off between virulence and transmission, but empirical evidence remains scarce. We bridge this gap using data from a large and long-standing HIV-1 prospective cohort, in Uganda. We use an epidemiological-evolutionary model parameterised with this data to derive evolutionary predictions based on analysis and detailed individual-based simulations. We robustly predict stabilising selection towards a low level of virulence, and rapid attenuation of the virus. Accordingly, set-point viral load, the most common measure of virulence, has declined in the last 20 years. Our model also predicts that subtype A is slowly outcompeting subtype D, with both subtypes becoming less virulent, as observed in the data. Reduction of set-point viral loads should have resulted in a 20% reduction in incidence, and a three years extension of untreated asymptomatic infection, increasing opportunities for timely treatment of infected individuals.

  9. Concept annotation in the CRAFT corpus

    PubMed Central

    2012-01-01

    Background Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. Results This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. Conclusions As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. 
The corpus, annotation guidelines, and other associated resources are freely available at http

  10. Teaching and Learning Communities through Online Annotation

    NASA Astrophysics Data System (ADS)

    van der Pluijm, B.

    2016-12-01

    What do colleagues do with your assigned textbook? What do they say or think about the material? Want students to be more engaged in their learning experience? If so, online materials that complement standard lecture format provide new opportunity through managed, online group annotation that leverages the ubiquity of internet access, while personalizing learning. The concept is illustrated with the new online textbook "Processes in Structural Geology and Tectonics", by Ben van der Pluijm and Stephen Marshak, which offers a platform for sharing of experiences, supplementary materials and approaches, including readings, mathematical applications, exercises, challenge questions, quizzes, alternative explanations, and more. The annotation framework used is Hypothes.is, which offers a free, open platform markup environment for annotation of websites and PDF postings. The annotations can be public, grouped or individualized, as desired, including export access and download of annotations. A teacher group, hosted by a moderator/owner, limits access to members of a user group of teachers, so that its members can use, copy or transcribe annotations for their own lesson material. Likewise, an instructor can host a student group that encourages sharing of observations, questions and answers among students and instructor. Also, the instructor can create one or more closed groups that offer study help and hints to students. Options galore, all of which aim to engage students and to promote greater responsibility for their learning experience. Beyond new capacity, the ability to analyze student annotation supports individual learners and their needs. For example, student notes can be analyzed for key phrases and concepts, and identify misunderstandings, omissions and problems. Also, example annotations can be shared to enhance notetaking skills and to help with studying.
Lastly, online annotation allows active application to lecture posted slides, supporting real-time notetaking

  11. Virulence determinants of equine infectious anemia virus.

    PubMed

    Payne, Susan L; Fuller, Frederick J

    2010-01-01

    Equine infectious anemia virus (EIAV) is a macrophage-tropic lentivirus that rapidly induces disease in experimentally infected horses. Because EIAV infection and replication are centered on the monocyte/macrophage and the virus has a pronounced acute disease stage, it is a useful model system for understanding the contribution of monocyte/macrophages to other lentivirus-induced diseases. Genetic mapping studies utilizing chimeric proviruses in which parental viruses are acutely virulent or avirulent have allowed the identification of important regions that influence acute virulence. U3 regions in the viral LTR, surface envelope (SU) protein and the accessory S2 gene strongly influence acute disease expression. While the chimeric proviruses provide insight into genes or genome regions that affect viral pathogenesis, it is then necessary to further dissect those regions to focus on specific virus-host mechanisms that lead to disease expression. The V6 region of the viral env protein is an example of one identified region that may interact with the ELR-1 receptor in an important way and we are currently identifying S2 protein motifs required for disease expression.

  12. GRAIL and GenQuest Sequence Annotation Tools

    SciTech Connect

    Xu, Ying; Shah, Manesh B.; Einstein, J. Ralph; Parang, Morey; Snoddy, Jay; Petrov, Sergey; Olman, Victor; Zhang, Ge; Mural, Richard J.; Uberbacher, Edward C.

    1997-12-31

    Our goal is to develop and implement an integrated intelligent system which can recognize biologically significant features in DNA sequence and provide insight into the organization and function of regions of genomic DNA. GRAIL is a modular expert system which facilitates the recognition of gene features and provides an environment for the construction of sequence annotation. The last several years have seen a rapid evolution of the technology for analyzing genomic DNA sequences. The current GRAIL systems (including the e-mail, XGRAIL, JAVA-GRAIL and genQuest systems) are perhaps the most widely used, comprehensive, and user friendly systems available for computational characterization of genomic DNA sequence.

  13. Genomic Correlates of Virulence Attenuation in the Deadly Amphibian Chytrid Fungus, Batrachochytrium dendrobatidis.

    PubMed

    Refsnider, Jeanine M; Poorten, Thomas J; Langhammer, Penny F; Burrowes, Patricia A; Rosenblum, Erica Bree

    2015-09-01

    Emerging infectious diseases pose a significant threat to global health, but predicting disease outcomes for particular species can be complicated when pathogen virulence varies across space, time, or hosts. The pathogenic chytrid fungus Batrachochytrium dendrobatidis (Bd) has caused worldwide declines in frog populations. Not only do Bd isolates from wild populations vary in virulence, but virulence shifts can occur over short timescales when Bd is maintained in the laboratory. We leveraged changes in Bd virulence over multiple generations of passage to better understand mechanisms of pathogen virulence. We conducted whole-genome resequencing of two samples of the same Bd isolate, differing only in passage history, to identify genomic processes associated with virulence attenuation. The isolate with shorter passage history (and greater virulence) had greater chromosome copy numbers than the isolate maintained in culture for longer, suggesting that virulence attenuation may be associated with loss of chromosome copies. Our results suggest that genomic processes proposed as mechanisms for rapid evolution in Bd are correlated with virulence attenuation in laboratory culture within a single lineage of Bd. Moreover, these genomic processes can occur over extremely short timescales. On a practical level, our results underscore the importance of immediately cryo-archiving new Bd isolates and using fresh isolates, rather than samples cultured in the laboratory for long periods, for laboratory infection experiments. Finally, when attempting to predict disease outcomes for this ecologically important pathogen, it is critical to consider existing variation in virulence among isolates and the potential for shifts in virulence over short timescales. Copyright © 2015 Refsnider et al.

  14. Corpus annotation for mining biomedical events from literature.

    PubMed

    Kim, Jin-Dong; Ohta, Tomoko; Tsujii, Jun'ichi

    2008-01-08

    Advanced Text Mining (TM) such as semantic enrichment of papers, event or relation extraction, and intelligent Question Answering have increasingly attracted attention in the bio-medical domain. For such attempts to succeed, text annotation from the biological point of view is indispensable. However, due to the complexity of the task, semantic annotation has never been tried on a large scale, apart from relatively simple term annotation. We have completed a new type of semantic annotation, event annotation, which is an addition to the existing annotations in the GENIA corpus. The corpus has already been annotated with POS (Parts of Speech), syntactic trees, terms, etc. The new annotation was made on half of the GENIA corpus, consisting of 1,000 Medline abstracts. It contains 9,372 sentences in which 36,114 events are identified. The major challenges during event annotation were (1) to design a scheme of annotation which meets specific requirements of text annotation, (2) to achieve biology-oriented annotation which reflect biologists' interpretation of text, and (3) to ensure the homogeneity of annotation quality across annotators. To meet these challenges, we introduced new concepts such as Single-facet Annotation and Semantic Typing, which have collectively contributed to successful completion of a large scale annotation. The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain.

  15. The sensor kinase MprB is required for Rhodococcus equi virulence.

    PubMed

    MacArthur, Iain; Parreira, Valeria R; Lepp, Dion; Mutharia, Lucy M; Vazquez-Boland, José A; Prescott, John F

    2011-01-10

    Rhodococcus equi is a soil bacterium and, like Mycobacterium tuberculosis, a member of the mycolata. Through possession of a virulence plasmid, it has the ability to infect the alveolar macrophages of foals, resulting in pyogranulomatous bronchopneumonia. The virulence plasmid has an orphan two-component system (TCS) regulatory gene, orf8, mutation of which completely attenuates virulence. This study attempted to find the cognate sensor kinase (SK) of orf8. Annotation of the R. equi strain 103 genome identified 23 TCSs encoded on the chromosome, which were used in a DNA microarray to compare TCS gene transcription in murine macrophage-like cells to growth in vitro. This identified six SKs as significantly up-regulated during growth in macrophages. Mutants of these SKs were constructed and their ability to persist in macrophages was determined with one SK, MprB, found to be required for intracellular survival. The attenuation of the mprB- mutant, and its complementation, was confirmed in a mouse virulence assay. In silico analysis of the R. equi genome sequence identified an MprA binding box motif homologous to that of M. tuberculosis, on mprA, pepD, sigB and sigE. The results of this study also show that R. equi responds to the macrophage environment differently from M. tuberculosis. MprB is the first SK identified as required for R. equi virulence and intracellular survival. Copyright © 2010 Elsevier B.V. All rights reserved.

  16. Comparative genomic analysis of three white spot syndrome virus isolates of different virulence.

    PubMed

    Li, Fang; Gao, Meiling; Xu, Limei; Yang, Feng

    2017-04-01

    Three white spot syndrome virus (WSSV) isolates of different virulence were identified in our previous study, the high-virulent strain WSSV-CN01, the moderate-virulent strain WSSV-CN02 and the low-virulent strain WSSV-CN03. In this study, the genomes of these three WSSV isolates were sequenced, annotated and compared. The genome sizes for WSSV-CN01, WSSV-CN02, and WSSV-CN03 are 309,286, 294,261, and 284,148 bp, bearing 177, 164, and 154 putative protein-coding genes, respectively. The genomic variations including insertions, deletions, and substitutions were investigated. Thirty-four genes show >20% variation in their sequences in WSSV-CN02 or WSSV-CN03, in comparison with WSSV-CN01, including six envelope protein genes (wsv237/vp41A, wsv238/vp52A, wsv338/vp62, wsv339/vp39, wsv077/vp36A, and wsv242/vp41B), and two immediate-early genes (wsv108 and wsv178). The genomic variations among WSSV isolates of different virulence, especially those in the coding regions, certainly provide new insight into the understanding of the molecular basis of WSSV pathogenesis.
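
    The genome sizes and gene counts quoted above allow a quick worked comparison against the high-virulence isolate. The sketch below uses only the figures from the abstract; the helper name and the genes-per-kilobase measure are illustrative choices, not quantities reported by the authors.

```python
# (genome size in bp, putative protein-coding genes), as quoted in the abstract.
genomes = {
    "WSSV-CN01": (309_286, 177),
    "WSSV-CN02": (294_261, 164),
    "WSSV-CN03": (284_148, 154),
}

ref_size, ref_genes = genomes["WSSV-CN01"]


def compare_to_reference(name: str) -> dict:
    """Summarise how an isolate differs from WSSV-CN01 in size and gene count."""
    size, genes = genomes[name]
    return {
        "bp_shorter": ref_size - size,
        "fewer_genes": ref_genes - genes,
        "genes_per_kb": genes / (size / 1000),
    }


for name in genomes:
    print(name, compare_to_reference(name))
```

    On these numbers, WSSV-CN02 is 15,025 bp shorter than WSSV-CN01 with 13 fewer putative genes, and WSSV-CN03 is 25,138 bp shorter with 23 fewer.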

  17. AIGO: Towards a unified framework for the Analysis and the Inter-comparison of GO functional annotations

    PubMed Central

    2011-01-01

    Background In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given there is often no existing gold-standard against which to evaluate precision and recall. Results In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo) for the Analysis and the Inter-comparison of the products of Gene Ontology (GO) annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations and also to plot and export the results of the analysis in various formats. Conclusions This work is a step toward developing a unified framework for the systematic study of GO functional annotations. This framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way. PMID:22054122

  18. AIGO: towards a unified framework for the analysis and the inter-comparison of GO functional annotations.

    PubMed

    Defoin-Platel, Michael; Hindle, Matthew M; Lysenko, Artem; Powers, Stephen J; Habash, Dimah Z; Rawlings, Christopher J; Saqi, Mansoor

    2011-11-03

    In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning. This heterogeneity makes a comparison of the relative merits of each approach extremely complex. The evaluation of the quality of the resultant annotations is also challenging given there is often no existing gold-standard against which to evaluate precision and recall. In this paper, we present a pragmatic approach to the study of functional annotations. An ensemble of 12 metrics, describing various aspects of functional annotations, is defined and implemented in a unified framework, which facilitates their systematic analysis and inter-comparison. The use of this framework is demonstrated on three illustrative examples: analysing the outputs of state-of-the-art inference pipelines, comparing electronic versus manual annotation methods, and monitoring the evolution of publicly available functional annotations. The framework is part of the AIGO library (http://code.google.com/p/aigo) for the Analysis and the Inter-comparison of the products of Gene Ontology (GO) annotation pipelines. The AIGO library also provides functionalities to easily load, analyse, manipulate and compare functional annotations and also to plot and export the results of the analysis in various formats. This work is a step toward developing a unified framework for the systematic study of GO functional annotations. This framework has been designed so that new metrics on GO functional annotations can be added in a very straightforward way.

  19. Making web annotations persistent over time

    SciTech Connect

    Sanderson, Robert; Van De Sompel, Herbert

    2010-01-01

    As Digital Libraries (DL) become more aligned with the web architecture, their functional components need to be fundamentally rethought in terms of URIs and HTTP. Annotation, a core scholarly activity enabled by many DL solutions, exhibits a clearly unacceptable characteristic when existing models are applied to the web: due to the representations of web resources changing over time, an annotation made about a web resource today may no longer be relevant to the representation that is served from that same resource tomorrow. We assume the existence of archived versions of resources, and combine the temporal features of the emerging Open Annotation data model with the capability offered by the Memento framework that allows seamless navigation from the URI of a resource to archived versions of that resource, and arrive at a solution that provides guarantees regarding the persistence of web annotations over time. More specifically, we provide theoretical solutions and proof-of-concept experimental evaluations for two problems: reconstructing an existing annotation so that the correct archived version is displayed for all resources involved in the annotation, and retrieving all annotations that involve a given archived version of a web resource.
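
    The navigation from a resource URI to an archived version relies on Memento's datetime negotiation, in which a client sends an `Accept-Datetime` request header carrying an HTTP-date in GMT. The abstract does not show this step, so the sketch below is an illustration under that assumption: it only formats the header and makes no network request. The helper name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(moment: datetime) -> dict:
    """Build the Accept-Datetime header for the time an annotation was made."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # HTTP-dates are expressed in GMT, so normalise to UTC first.
    utc_moment = moment.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc_moment, usegmt=True)}


# Ask for the representation that was current when the annotation was created.
created = datetime(2010, 1, 5, 12, 0, tzinfo=timezone.utc)
print(accept_datetime_header(created))
```

    Storing the annotation's creation time alongside the target URI is what makes this request possible later: the pair identifies which archived representation the annotation was actually about.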

  20. Genotator: A Workbench for Sequence Annotation

    SciTech Connect

    Harris, N.L.

    1997-05-01

    Sequencing centers such as the Human Genome Center at LBNL are producing an ever-increasing flood of genetic data. Annotation can greatly enhance the biological value of these sequences. Useful annotations include possible gene locations, homologies to known genes, and gene signals such as promoters and splice sites. Genotator is a workbench for automated sequence annotation and annotation browsing. The back end runs a series of sequence analysis tools on a DNA sequence, handling the various input and output formats required by the tools. Genotator currently runs five different gene finding programs, three homology searches, and searches for promoters, splice sites, and ORFs. The results of the analyses run by Genotator can be viewed with the interactive graphical browser. The browser displays color-coded sequence annotations on a canvas that can be scrolled and zoomed, allowing the annotated sequence to be explored at multiple levels of detail. The user can view the actual DNA sequence in a separate window; when a region is selected in the map display, it is automatically highlighted in the sequence display, and vice-versa. By displaying the output of all of the sequence analyses, Genotator provides an intuitive way to identify the significant regions (for example, probable exons) in a sequence. Users can interactively add personal annotations to label regions of interest. Additional capabilities of Genotator include primer design and pattern searching.

  1. Genotator: A Workbench for Sequence Annotation

    PubMed Central

    Harris, Nomi L.

    1997-01-01

    Sequencing centers such as the Human Genome Center at LBNL are producing an ever-increasing flood of genetic data. Annotation can greatly enhance the biological value of these sequences. Useful annotations include possible gene locations, homologies to known genes, and gene signals such as promoters and splice sites. Genotator is a workbench for automated sequence annotation and annotation browsing. The back end runs a series of sequence analysis tools on a DNA sequence, handling the various input and output formats required by the tools. Genotator currently runs five different gene-finding programs, three homology searches, and searches for promoters, splice sites, and ORFs. The results of the analyses run by Genotator can be viewed with the interactive graphical browser. The browser displays color-coded sequence annotations on a canvas that can be scrolled and zoomed, allowing the annotated sequence to be explored at multiple levels of detail. The user can view the actual DNA sequence in a separate window; when a region is selected in the map display, it is highlighted automatically in the sequence display, and vice versa. By displaying the output of all of the sequence analyses, Genotator provides an intuitive way to identify the significant regions (for example, probable exons) in a sequence. Users can interactively add personal annotations to label regions of interest. Additional capabilities of Genotator include primer design and pattern searching. [Further details for obtaining Genotator are available at http://www.cshl.org/gr.] PMID:9253604

  2. Do pathogens become more virulent as they spread? Evidence from the amphibian declines in Central America.

    PubMed

    Phillips, Ben L; Puschendorf, Robert

    2013-09-07

    The virulence of a pathogen can vary strongly through time. While cyclical variation in virulence is regularly observed, directional shifts in virulence are less commonly observed and are typically associated with decreasing virulence of biological control agents through coevolution. It is increasingly appreciated, however, that spatial effects can lead to evolutionary trajectories that differ from standard expectations. One such possibility is that, as a pathogen spreads through a naive host population, its virulence increases on the invasion front. In Central America, there is compelling evidence for the recent spread of pathogenic Batrachochytrium dendrobatidis (Bd) and for its strong impact on amphibian populations. Here, we re-examine data on Bd prevalence and amphibian population decline across 13 sites from southern Mexico through Central America, and show that, in the initial phases of the Bd invasion, amphibian population decline lagged approximately 9 years behind the arrival of the pathogen, but that this lag diminished markedly over time. In total, our analysis suggests an increase in Bd virulence as it spread southwards, a pattern consistent with rapid evolution of increased virulence on Bd's invading front. The impact of Bd on amphibians might therefore be driven by rapid evolution in addition to more proximate environmental drivers.

  3. Do pathogens become more virulent as they spread? Evidence from the amphibian declines in Central America

    PubMed Central

    Phillips, Ben L.; Puschendorf, Robert

    2013-01-01

    The virulence of a pathogen can vary strongly through time. While cyclical variation in virulence is regularly observed, directional shifts in virulence are less commonly observed and are typically associated with decreasing virulence of biological control agents through coevolution. It is increasingly appreciated, however, that spatial effects can lead to evolutionary trajectories that differ from standard expectations. One such possibility is that, as a pathogen spreads through a naive host population, its virulence increases on the invasion front. In Central America, there is compelling evidence for the recent spread of pathogenic Batrachochytrium dendrobatidis (Bd) and for its strong impact on amphibian populations. Here, we re-examine data on Bd prevalence and amphibian population decline across 13 sites from southern Mexico through Central America, and show that, in the initial phases of the Bd invasion, amphibian population decline lagged approximately 9 years behind the arrival of the pathogen, but that this lag diminished markedly over time. In total, our analysis suggests an increase in Bd virulence as it spread southwards, a pattern consistent with rapid evolution of increased virulence on Bd's invading front. The impact of Bd on amphibians might therefore be driven by rapid evolution in addition to more proximate environmental drivers. PMID:23843393

  4. BioBuilder as a database development and functional annotation platform for proteins

    PubMed Central

    Navarro, J Daniel; Talreja, Naveen; Peri, Suraj; Vrushabendra, BM; Rashmi, BP; Padma, N; Surendranath, Vineeth; Jonnalagadda, Chandra Kiran; Kousthub, PS; Deshpande, Nandan; Shanker, K; Pandey, Akhilesh

    2004-01-01

    Background: The explosion in biological information creates the need for databases that are easy to develop, easy to maintain and can be easily manipulated by annotators who are most likely to be biologists. However, deployment of scalable and extensible databases is not an easy task and generally requires substantial expertise in database development. Results: BioBuilder is a Zope-based software tool that was developed to facilitate intuitive creation of protein databases. Protein data can be entered and annotated through web forms along with the flexibility to add customized annotation features to protein entries. A built-in review system permits a global team of scientists to coordinate their annotation efforts. We have already used BioBuilder to develop the Human Protein Reference Database, a comprehensive annotated repository of the human proteome. The data can be exported in the extensible markup language (XML) format, which is rapidly becoming the standard format for data exchange. Conclusions: As the proteomic data for several organisms begin to accumulate, BioBuilder will prove to be an invaluable platform for functional annotation and development of customizable protein-centric databases. BioBuilder is open source and is available under the terms of the LGPL. PMID:15099404

  5. COGNATE: comparative gene annotation characterizer.

    PubMed

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https

  6. FACTORS RELATING TO THE VIRULENCE OF STAPHYLOCOCCI

    PubMed Central

    Koenig, M. Glenn; Melly, Marian Ann; Rogers, David E.

    1962-01-01

    Four clumping factor-negative strains of Staphylococcus aureus were found to closely resemble the diffuse colonial variant of the Smith strain. All produced fatal intraperitoneal infections in mice, all grew in diffuse, streaming colonies in plasma or serum soft agar, and all behaved like encapsulated microorganisms in in vitro opsonic systems. These staphylococci were resistant to phagocytosis in the peritoneal cavities of normal mice. When mice were immunized with heat-killed vaccines prepared from the Smith diffuse variant these strains were rapidly ingested by peritoneal leukocytes and the animals survived. This observation suggests that these strains share the same or a similar phagocytosis-retarding antigen. While most pathogenic staphylococci isolated from human material do not behave like these unusual mouse-virulent strains, indirect evidence is cited to support the suggestion that other staphylococci may acquire similar phagocytosis-resisting characteristics during in vivo multiplication. Studies to support or refute this thesis are in progress. PMID:14034137

  7. Automated Knowledge Annotation for Dynamic Collaborative Environments

    SciTech Connect

    Cowell, Andrew J.; Gregory, Michelle L.; Marshall, Eric J.; McGrath, Liam R.

    2009-05-19

    This paper describes the Knowledge Encapsulation Framework (KEF), a suite of tools to enable automated knowledge annotation for modeling and simulation projects. This framework can be used to capture evidence (e.g., facts extracted from journal articles and government reports), discover new evidence (from similar peer-reviewed material as well as social media), enable discussions surrounding domain-specific topics and provide automatically generated semantic annotations for improved corpus investigation. The current KEF implementation is presented within a wiki environment, providing a simple but powerful collaborative space for team members to review, annotate, discuss and align evidence with their modeling frameworks.

  8. Annotating user-defined abstractions for optimization

    SciTech Connect

    Quinlan, D; Schordan, M; Vuduc, R; Yi, Q

    2005-12-05

    This paper discusses the features of an annotation language that we believe to be essential for optimizing user-defined abstractions. These features should capture semantics of function, data, and object-oriented abstractions, express abstraction equivalence (e.g., a class represents an array abstraction), and permit extension of traditional compiler optimizations to user-defined abstractions. Our future work will include developing a comprehensive annotation language for describing the semantics of general object-oriented abstractions, as well as automatically verifying and inferring the annotated semantics.

  9. Transcriptome analysis of fat bodies from two brown planthopper (Nilaparvata lugens) populations with different virulence levels in rice.

    PubMed

    Yu, Haixin; Ji, Rui; Ye, Wenfeng; Chen, Hongdan; Lai, Wenxiang; Fu, Qiang; Lou, Yonggen

    2014-01-01

    The brown planthopper (BPH), Nilaparvata lugens (Stål), one of the most serious rice insect pests in Asia, can quickly overcome rice resistance by evolving new virulent populations. The insect fat body plays essential roles in the life cycles of insects and in plant-insect interactions. However, whether differences in fat body transcriptomes exist between insect populations with different virulence levels and whether the transcriptomic differences are related to insect virulence remain largely unknown. In this study, we performed transcriptome-wide analyses on the fat bodies of two BPH populations with different virulence levels in rice. The populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 33,776 and 32,332 unigenes from the fat bodies of TN1 and M populations, respectively, were generated using Illumina technology. Gene ontology annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology classifications indicated that genes related to metabolism and immunity were significantly active in the fat bodies. In addition, a total of 339 unigenes showed homology to genes of yeast-like symbionts (YLSs) from 12 genera and endosymbiotic bacteria Wolbachia. A comparative analysis of the two transcriptomes generated 7,860 differentially expressed genes. GO annotations and enrichment analysis of KEGG pathways indicated these differentially expressed transcripts might be involved in metabolism and immunity. Finally, 105 differentially expressed genes from YLSs and Wolbachia were identified, genes which might be associated with the formation of different virulent populations. This study was the first to compare the fat-body transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. Our findings provide a molecular resource for future investigations of fat bodies and will be useful in examining the interactions between the fat body and virulence

  10. Differences in virulence of Naegleria fowleri.

    PubMed

    De Jonckheere, J

    1979-10-01

    All pathogenic Naegleria fowleri isolated from the environment were highly virulent to mice when instilled intranasally. Axenic cultivation gradually decreased virulence of highly virulent strains. This decrease was most pronounced in environmental isolates and of minor importance in N. fowleri isolated from human cerebrospinal fluid. The low-virulence strains obtained by continuous axenic cultivation appeared, after cloning, to consist of individuals with different virulence. Virulence could be enhanced in low-virulence strains by brain passage and passages in Vero cell cultures, but could not be induced by these methods in nonvirulent strains isolated from the environment. Different mouse strains showed different sensitivities to infection with pathogenic Naegleria. In addition, older mice were less sensitive than younger animals to low-virulence strains.

  11. Bioinformatics for Diagnostics, Forensics, and Virulence Characterization and Detection

    SciTech Connect

    Gardner, S; Slezak, T

    2005-04-05

    We summarize four of our group's high-risk/high-payoff research projects funded by the Intelligence Technology Innovation Center (ITIC) in conjunction with our DHS-funded pathogen informatics activities. These are (1) quantitative assessment of genomic sequencing needs to predict high quality DNA and protein signatures for detection, and comparison of draft versus finished sequences for diagnostic signature prediction; (2) development of forensic software to identify SNP and PCR-RFLP variations from a large number of viral pathogen sequences and optimization of the selection of markers for maximum discrimination of those sequences; (3) prediction of signatures for the detection of virulence, antibiotic resistance, and toxin genes and genetic engineering markers in bacteria; (4) bioinformatic characterization of virulence factors to rapidly screen genomic data for potential genes with similar functions and to elucidate potential health threats in novel organisms. The results of (1) are being used by policy makers to set national sequencing priorities. Analyses from (2) are being used in collaborations with the CDC to genotype and characterize many variola strains, and reports from these collaborations have been made to the President. We also determined SNPs for serotype and strain discrimination of 126 foot-and-mouth disease virus (FMDV) genomes. For (3), currently >1000 probes have been predicted for the specific detection of >4000 virulence, antibiotic resistance, and genetic engineering vector sequences, and we expect to complete the bioinformatic design of a comprehensive "virulence detection chip" by August 2005. Results of (4) will be a system to rapidly predict potential virulence pathways and phenotypes in organisms based on their genomic sequences.

  12. Improving genome annotation of enterotoxigenic Escherichia coli TW10598 by a label-free quantitative MS/MS approach.

    PubMed

    Pettersen, Veronika Kuchařová; Steinsland, Hans; Wiker, Harald G

    2015-11-01

    The most commonly used genome annotation processes are to a great extent based on computational methods. However, those can only predict genes that have been described earlier or that have sequence signatures indicative of a gene function. Here, we report a synonymous proteogenomic approach for experimentally improving microbial genome annotation based on label-free quantitative MS/MS. The approach is exemplified by analysis of cell extracts from in vitro cultured enterotoxigenic Escherichia coli (ETEC) strain TW10598, as part of an effort to create a new reference ETEC genome sequence. The proteomic analysis identified 2060 proteins, of which 312 were originally described as hypothetical. For 84% of the identified proteins we have provided a description of their relative quantitative levels, including, among others, 20 abundantly expressed ETEC virulence factors. Proteogenomic mapping supported the existence of four protein-coding genes that had not been annotated, and led to correction of the translation start positions of another nine. The addition of the proteomic analysis to the TW10598 genome re-annotation project improved the quality of the annotation and provided experimental evidence for a significant portion of the ETEC expressed proteome. Data are available via ProteomeXchange with identifier PXD002473 (http://proteomecentral.proteomexchange.org/dataset/PXD002473).

  13. Insights into Entamoeba histolytica virulence modulation.

    PubMed

    Padilla-Vaca, F; Anaya-Velázquez, F

    2010-08-01

    Entamoeba histolytica is able to invade human tissues by means of several molecules and biological properties related to virulence. Pathogenic amebas use three major virulence factors, Gal/GalNAc lectin, amebapore and proteases, to lyse, phagocytose, kill and destroy a variety of cells and tissues in the host. Responses of the parasite to host components such as mucins and bacterial flora influence the behavior of pathogenic amebas, altering their expression of virulence factors. The relative virulence of different strains of E. histolytica has been shown to vary as a consequence of changes in conditions of in vitro cultivation, which implies substantial changes in basic metabolic aspects and in factors directly and indirectly related to amebic virulence. Comparison of E. histolytica strains with different virulence phenotypes and under different conditions of growth will help to identify new virulence factor candidates and define the interplay between virulence factors and invasive phenotype. Virulence-attenuated mutants of E. histolytica are also useful for uncovering novel virulence determinants. The comparison of biological properties and virulence factors between E. histolytica and E. dispar, a non-pathogenic species, has been a useful approach to investigate the key factors involved in the experimental presentation of amebiasis and its complex regulation. The molecular mechanisms that regulate these variations in virulence are not yet known. Their elucidation will help us to better understand the gene expression plasticity that enables the effective adaptation of the ameba to changes in growth culture conditions and host factors.

  14. WormBase: Annotating many nematode genomes.

    PubMed

    Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

    2012-01-01

    WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.

  15. Annotation and retrieval in protein interaction databases

    NASA Astrophysics Data System (ADS)

    Cannataro, Mario; Hiram Guzzi, Pietro; Veltri, Pierangelo

    2014-06-01

    Biological databases have been developed with a special focus on the efficient retrieval of single records or the efficient computation of specialized bioinformatics algorithms against the overall database, such as in sequence alignment. The continuous production of biological knowledge spread across several biological databases and ontologies, such as Gene Ontology, and the availability of efficient techniques to handle such knowledge, such as annotation and semantic similarity measures, enable the development of novel bioinformatics applications that explicitly use and integrate such knowledge. After introducing the annotation process and the main semantic similarity measures, this paper shows how annotations and semantic similarity can be exploited to improve the extraction and analysis of biologically relevant data from protein interaction databases. As case studies, the paper presents two novel software tools, OntoPIN and CytoSeVis, both based on the use of Gene Ontology annotations, for the advanced querying of protein interaction databases and for the enhanced visualization of protein interaction networks.

  16. Communication and Gender: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Shermis, Michael

    1990-01-01

    Presents an 18-item annotated bibliography of recent research reports and conference papers concerning the role gender plays in communication. Includes aspects of organizational communication, interpersonal communication, and communication in the media. (SR)

  17. An Informally Annotated Bibliography of Sociolinguistics.

    ERIC Educational Resources Information Center

    Tannen, Deborah

    This annotated bibliography of sociolinguistics is divided into the following sections: speech events, ethnography of speaking and anthropological approaches to analysis of conversation; discourse analysis (including analysis of conversation and narrative), ethnomethodology and nonverbal communication; sociolinguistics; pragmatics (including…

  18. SASL: A Semantic Annotation System for Literature

    NASA Astrophysics Data System (ADS)

    Yuan, Pingpeng; Wang, Guoyin; Zhang, Qin; Jin, Hai

    Due to ambiguity, search engines for scientific literature may not return the right search results. One efficient solution to this problem is to automatically annotate the literature and attach semantic information to it. Generally, semantic annotation requires identifying entities before attaching semantic information to them. However, due to abbreviation and other reasons, it is very difficult to identify entities correctly. The paper presents a Semantic Annotation System for Literature (SASL), which utilizes Wikipedia as a knowledge base to annotate literature. SASL mainly attaches semantics to terminology, academic institutions, conferences, journals, etc. Many of these usually appear as abbreviations, which induces ambiguity. Here, SASL uses regular expressions to extract the mapping between the full names of entities and their abbreviations. Since the full names of several entities may map to a single abbreviation, SASL introduces a Hidden Markov Model to implement name disambiguation. Finally, the paper presents the experimental results, which confirm that SASL performs well.
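
    The regular-expression step described in this abstract can be illustrated with a minimal sketch. It is not SASL's implementation: the pattern, the stop-word list and the function name extract_abbreviations are assumptions made for illustration. The sketch finds each parenthesized upper-case abbreviation and walks backwards over the preceding words, matching initials. An abbreviation that maps to several full names is left ambiguous here; that is the case the abstract says is resolved with a Hidden Markov Model, which this sketch does not attempt.

```python
import re
from collections import defaultdict

# Hypothetical pattern: an all-caps abbreviation in parentheses, e.g. "(SASL)".
ABBR = re.compile(r"\(([A-Z]{2,})\)")
# Hypothetical list of short words that may sit inside a full name.
STOPWORDS = {"of", "for", "and", "the", "in", "on"}


def extract_abbreviations(text):
    """Map each abbreviation to the set of full names found before it."""
    mapping = defaultdict(set)
    for match in ABBR.finditer(text):
        abbr = match.group(1)
        words = re.findall(r"[A-Za-z][A-Za-z-]*", text[:match.start()])
        remaining = list(abbr)  # letters still to be matched, right to left
        picked = []
        for word in reversed(words):
            if not remaining:
                break
            if word[0].upper() == remaining[-1]:
                remaining.pop()
                picked.append(word)
            elif word.lower() in STOPWORDS and picked:
                picked.append(word)  # keep connecting words such as "for"
            else:
                break
        if not remaining:  # every letter of the abbreviation was matched
            mapping[abbr].add(" ".join(reversed(picked)))
    return dict(mapping)


if __name__ == "__main__":
    sample = ("We present a Semantic Annotation System for Literature (SASL). "
              "It uses a Hidden Markov Model (HMM).")
    for abbr, names in extract_abbreviations(sample).items():
        print(abbr, sorted(names))
```

    Returning a set of candidate full names per abbreviation keeps the ambiguity visible, so a later disambiguation step can choose among them.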

  19. THERMOCHROMISM AND PHOTOTROPISM: AN ANNOTATED BIBLIOGRAPHY,

    DTIC Science & Technology

    This annotated bibliography contains 151 selected references to materials that exhibit the phenomena of thermochromism (reversible color change...removed. Thermochromic coatings whose color is dependent on temperature can be used for thermal control of space satellites. (Author)

  20. GRADUATE AND PROFESSIONAL EDUCATION, AN ANNOTATED BIBLIOGRAPHY.

    ERIC Educational Resources Information Center

    HEISS, ANN M.; AND OTHERS

    THIS ANNOTATED BIBLIOGRAPHY CONTAINS REFERENCES TO GENERAL GRADUATE EDUCATION AND TO EDUCATION FOR THE FOLLOWING PROFESSIONAL FIELDS--ARCHITECTURE, BUSINESS, CLINICAL PSYCHOLOGY, DENTISTRY, ENGINEERING, LAW, LIBRARY SCIENCE, MEDICINE, NURSING, SOCIAL WORK, TEACHING, AND THEOLOGY. (HW)

  1. Virulence Mechanisms of Enteroinvasive Pathogens

    DTIC Science & Technology

    1988-01-01

    of Plasmid Gene Products otactic response facilitates the establishment of in Virulence a Salmonella infection (47). The action of pan- Anucleate...1967. Electron microscope studies of 36:615-620. experimental salmonella infection . 1. Penetration into 35. Rout, W. R., S. B. Formal, G. J. Dammin

  2. Citrate uptake into Pectobacterium atrosepticum is critical for bacterial virulence.

    PubMed

    Urbany, Claude; Neuhaus, H Ekkehard

    2008-05-01

    To analyze whether metabolite import into Pectobacterium atrosepticum cells affects bacterial virulence, we investigated the function of a carrier which exhibits significant structural homology to characterized carboxylic-acid transport proteins. The corresponding gene, ECA3984, previously annotated as coding for a Na(+)/sulphate carrier, in fact encodes a highly specific citrate transporter (Cit1) which is energized by the proton-motive force. Expression of the cit1 gene is stimulated by the presence of citrate in the growth medium and is substantial during growth of P. atrosepticum on potato tuber tissue. Infection of tuber tissue with P. atrosepticum leads to reduced citrate levels. P. atrosepticum insertion mutants, lacking the functional Cit1 protein, did not grow in medium containing citrate as the sole carbon source, showed a substantially reduced ability to macerate potato tuber tissue, and did not provoke reduced citrate levels in the plant tissue upon infection. We propose that citrate uptake into P. atrosepticum is critical for full bacterial virulence.

  3. Genepi: a blackboard framework for genome annotation.

    PubMed

    Descorps-Declère, Stéphane; Ziébelin, Danielle; Rechenmann, François; Viari, Alain

    2006-10-12

    Genome annotation can be viewed as an incremental, cooperative, data-driven, knowledge-based process that involves multiple methods to predict gene locations and structures. This process might have to be executed more than once and might be subjected to several revisions as the biological (new data) or methodological (new methods) knowledge evolves. In this context, although a lot of annotation platforms already exist, there is still a strong need for computer systems that take charge not only of the primary annotation, but also of the update and advancement of the associated knowledge. In this paper, we propose to adopt a blackboard architecture for designing such a system. We have implemented a blackboard framework (called Genepi) for developing automatic annotation systems. The system is not bound to any specific annotation strategy. Instead, the user will specify a blackboard structure in a configuration file and the system will instantiate and run this particular annotation strategy. The characteristics of this framework are presented and discussed. Specific adaptations to the classical blackboard architecture have been required, such as the description of the activation patterns of the knowledge sources by using an extended set of Allen's temporal relations. Although the system is robust enough to be used on real-size applications, it is of primary use to bioinformatics researchers who want to experiment with blackboard architectures. In the context of genome annotation, blackboards have several interesting features related to the way methodological and biological knowledge can be updated. They can readily handle the cooperative (several methods are involved) and opportunistic (the flow of execution depends on the state of our knowledge) aspects of the annotation process.

  4. Genepi: a blackboard framework for genome annotation

    PubMed Central

    Descorps-Declère, Stéphane; Ziébelin, Danielle; Rechenmann, François; Viari, Alain

    2006-01-01

    Background: Genome annotation can be viewed as an incremental, cooperative, data-driven, knowledge-based process that involves multiple methods to predict gene locations and structures. This process might have to be executed more than once and might be subjected to several revisions as the biological (new data) or methodological (new methods) knowledge evolves. In this context, although a lot of annotation platforms already exist, there is still a strong need for computer systems that take charge not only of the primary annotation, but also of the update and advancement of the associated knowledge. In this paper, we propose to adopt a blackboard architecture for designing such a system. Results: We have implemented a blackboard framework (called Genepi) for developing automatic annotation systems. The system is not bound to any specific annotation strategy. Instead, the user will specify a blackboard structure in a configuration file and the system will instantiate and run this particular annotation strategy. The characteristics of this framework are presented and discussed. Specific adaptations to the classical blackboard architecture have been required, such as the description of the activation patterns of the knowledge sources by using an extended set of Allen's temporal relations. Although the system is robust enough to be used on real-size applications, it is of primary use to bioinformatics researchers who want to experiment with blackboard architectures. Conclusion: In the context of genome annotation, blackboards have several interesting features related to the way methodological and biological knowledge can be updated. They can readily handle the cooperative (several methods are involved) and opportunistic (the flow of execution depends on the state of our knowledge) aspects of the annotation process. PMID:17038181

  5. Development and Evaluation of an Automated Annotation Pipeline and cDNA Annotation System

    PubMed Central

    Kasukawa, Takeya; Furuno, Masaaki; Nikaido, Itoshi; Bono, Hidemasa; Hume, David A.; Bult, Carol; Hill, David P.; Baldarelli, Richard; Gough, Julian; Kanapin, Alexander; Matsuda, Hideo; Schriml, Lynn M.; Hayashizaki, Yoshihide; Okazaki, Yasushi; Quackenbush, John

    2003-01-01

    Manual curation has long been held to be the “gold standard” for functional annotation of DNA sequence. Our experience with the annotation of more than 20,000 full-length cDNA sequences revealed problems with this approach, including inaccurate and inconsistent assignment of gene names, as well as many good assignments that were difficult to reproduce using only computational methods. For the FANTOM2 annotation of more than 60,000 cDNA clones, we developed a number of methods and tools to circumvent some of these problems, including an automated annotation pipeline that provides high-quality preliminary annotation for each sequence by introducing an “uninformative filter” that eliminates uninformative annotations, controlled vocabularies to accurately reflect both the functional assignments and the evidence supporting them, and a highly refined, Web-based manual annotation tool that allows users to view a wide array of sequence analyses and to assign gene names and putative functions using a consistent nomenclature. The ultimate utility of our approach is reflected in the low rate of reassignment of automated assignments by manual curation. Based on these results, we propose a new standard for large-scale annotation, in which the initial automated annotations are manually investigated and then computational methods are iteratively modified and improved based on the results of manual curation. PMID:12819153

  6. AutoAnnotate: A Cytoscape app for summarizing networks with semantic annotations

    PubMed Central

    Kucera, Mike; Isserlin, Ruth; Arkhangorodsky, Arkady; Bader, Gary D.

    2016-01-01

    Networks often contain regions of tightly connected nodes, or clusters, that highlight their shared relationships. An effective way to create a visual summary of a network is to identify clusters and annotate them with an enclosing shape and a summarizing label. Cytoscape provides the ability to annotate a network with shapes and labels, however these annotations must be created manually one at a time, which can be a laborious process. AutoAnnotate is a Cytoscape 3 App that automates the process of identifying clusters and visually annotating them. It greatly reduces the time and effort required to fully annotate clusters in a network, and provides freedom to experiment with different strategies for identifying and labelling clusters. Many customization options are available that enable the user to refine the generated annotations as required. Annotated clusters may be collapsed into single nodes using the Cytoscape groups feature, which helps simplify a network by making its overall structure more visible. AutoAnnotate is applicable to any type of network, including enrichment maps, protein-protein interactions, pathways, or social networks. PMID:27830058

  7. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    PubMed

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  8. AutoAnnotate: A Cytoscape app for summarizing networks with semantic annotations.

    PubMed

    Kucera, Mike; Isserlin, Ruth; Arkhangorodsky, Arkady; Bader, Gary D

    2016-01-01

    Networks often contain regions of tightly connected nodes, or clusters, that highlight their shared relationships. An effective way to create a visual summary of a network is to identify clusters and annotate them with an enclosing shape and a summarizing label. Cytoscape provides the ability to annotate a network with shapes and labels, however these annotations must be created manually one at a time, which can be a laborious process. AutoAnnotate is a Cytoscape 3 App that automates the process of identifying clusters and visually annotating them. It greatly reduces the time and effort required to fully annotate clusters in a network, and provides freedom to experiment with different strategies for identifying and labelling clusters. Many customization options are available that enable the user to refine the generated annotations as required. Annotated clusters may be collapsed into single nodes using the Cytoscape groups feature, which helps simplify a network by making its overall structure more visible. AutoAnnotate is applicable to any type of network, including enrichment maps, protein-protein interactions, pathways, or social networks.

  9. Within-host evolution decreases virulence in an opportunistic bacterial pathogen.

    PubMed

    Mikonranta, Lauri; Mappes, Johanna; Laakso, Jouni; Ketola, Tarmo

    2015-08-19

    Pathogens evolve in a close antagonistic relationship with their hosts. The conventional theory proposes that evolution of virulence is highly dependent on the efficiency of direct host-to-host transmission. Many opportunistic pathogens, however, are not strictly dependent on the hosts due to their ability to reproduce in the free-living environment. Therefore it is likely that conflicting selection pressures for growth and survival outside versus within the host, rather than transmission potential, shape the evolution of virulence in opportunists. We tested the role of within-host selection in evolution of virulence by letting a pathogen Serratia marcescens db11 sequentially infect Drosophila melanogaster hosts and then compared the virulence to strains that evolved only in the outside-host environment. We found that the pathogen adapted to both Drosophila melanogaster host and novel outside-host environment, leading to rapid evolutionary changes in the bacterial life-history traits including motility, in vitro growth rate, biomass yield, and secretion of extracellular proteases. Most significantly, selection within the host led to decreased virulence without decreased bacterial load while the selection lines in the outside-host environment maintained the same level of virulence with ancestral bacteria. This experimental evidence supports the idea that increased virulence is not an inevitable consequence of within-host adaptation even when the epidemiological restrictions are removed. Evolution of attenuated virulence could occur because of immune evasion within the host. Alternatively, rapid fluctuation between outside-host and within-host environments, which is typical for the life cycle of opportunistic bacterial pathogens, could lead to trade-offs that lower pathogen virulence.

  10. Microbial communication and virulence: lessons from evolutionary theory.

    PubMed

    Diggle, Stephen P

    2010-12-01

    At the heart of tackling the huge challenge posed by infectious micro-organisms is the overwhelming need to understand their nature. A major question is, why do some species of bacteria rapidly kill their host whilst others are relatively benign? For example, Yersinia pestis, the causative organism of plague, is a highly virulent human pathogen whilst the closely related Yersinia pseudotuberculosis causes a much less severe disease. Using molecular techniques such as mutating certain genes, microbiologists have made significant advances over recent decades in elucidating the mechanisms that govern the production of virulence factors involved in causing disease in many bacterial species. There are also evolutionary and ecological factors which will influence virulence. Many of these ideas have arisen through the development of evolutionary theory and yet there is strikingly little empirical evidence testing them. By applying both mechanistic and adaptive approaches to microbial behaviours we can begin to address questions such as, what factors influence cooperation and the evolution of virulence in microbes and can we exploit these factors to develop new antimicrobial strategies?

  11. Engineering Attenuated Virulence of a Theileria annulata–Infected Macrophage

    PubMed Central

    Echebli, Nadia; Mhadhbi, Moez; Chaussepied, Marie; Vayssettes, Catherine; Di Santo, James P.; Darghouth, Mohamed Aziz; Langsley, Gordon

    2014-01-01

    Live attenuated vaccines are used to combat tropical theileriosis in North Africa, the Middle East, India, and China. The attenuation process is empirical and occurs only after many months, sometimes years, of in vitro culture of virulent clinical isolates. During this extensive culturing, attenuated lines lose their vaccine potential. To circumvent this we engineered the rapid ablation of the host cell transcription factor c-Jun, and within only 3 weeks the line engineered for loss of c-Jun activation displayed in vitro correlates of attenuation such as loss of adhesion, reduced MMP9 gelatinase activity, and diminished capacity to traverse Matrigel. Specific ablation of a single infected host cell virulence trait (c-Jun) induced a complete failure of Theileria annulata–transformed macrophages to disseminate, whereas virulent macrophages disseminated to the kidneys, spleen, and lungs of Rag2/γC mice. Thus, in this heterologous mouse model loss of c-Jun expression led to ablation of dissemination of T. annulata–infected and transformed macrophages. The generation of Theileria-infected macrophages genetically engineered for ablation of a specific host cell virulence trait now makes possible experimental vaccination of calves to address how loss of macrophage dissemination impacts the disease pathology of tropical theileriosis. PMID:25375322

  12. Carbohydrate Availability Regulates Virulence Gene Expression in Streptococcus suis

    PubMed Central

    Ferrando, M. Laura; van Baarlen, Peter; Orrù, Germano; Piga, Rosaria; Bongers, Roger S.; Wels, Michiel; De Greeff, Astrid; Smith, Hilde E.; Wells, Jerry M.

    2014-01-01

    Streptococcus suis is a major bacterial pathogen of young pigs causing worldwide economic problems for the pig industry. S. suis is also an emerging pathogen of humans. Colonization of porcine oropharynx by S. suis is considered to be a high risk factor for invasive disease. In the oropharyngeal cavity, where glucose is rapidly absorbed but dietary α-glucans persist, there is a profound effect of carbohydrate availability on the expression of virulence genes. Nineteen predicted or confirmed S. suis virulence genes that promote adhesion to and invasion of epithelial cells were expressed at higher levels when S. suis was supplied with the α-glucan starch/pullulan compared to glucose as the single carbon source. Additionally the production of suilysin, a toxin that damages epithelial cells, was increased more than ten-fold when glucose levels were low and S. suis was growing on pullulan. Based on biochemical, bioinformatics and in vitro and in vivo gene expression studies, we developed a biological model that postulates the effect of carbon catabolite repression on expression of virulence genes in the mucosa, organs and blood. This research increases our understanding of S. suis virulence mechanisms and has important implications for the design of future control strategies including the development of anti-infective strategies by modulating animal feed composition. PMID:24642967

  13. JGI Plant Genomics Gene Annotation Pipeline

    SciTech Connect

    Shu, Shengqiang; Rokhsar, Dan; Goodstein, David; Hayes, David; Mitros, Therese

    2014-07-14

    Plant genomes vary in size and are highly complex, with many repeats, genome duplications, and tandem duplications. Genes encode a wealth of information useful in studying an organism, and it is critical to have high-quality, stable gene annotation. Thanks to advances in sequencing technology, the genomes and transcriptomes of many plant species have been sequenced. To use these very large amounts of sequence data for gene annotation or re-annotation in a timely fashion, an automatic pipeline is needed. The JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim, aided by an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homologous peptides and transcript ORFs. See Methods for detail. Here we present genome annotations of JGI flagship green plants produced by this pipeline, plus Arabidopsis and rice, except for chlamy, which is done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and are accessible via the JGI Phytozome portal, whose URL and front page snapshot are shown below.

  14. Annotating the human genome with Disease Ontology

    PubMed Central

    Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L

    2009-01-01

    Background The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of human genome. PMID:19594883
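
    The recall and precision figures quoted above follow the standard definitions: precision is the fraction of predicted gene-disease pairs that appear in the reference set, and recall is the fraction of reference pairs that were predicted. A minimal sketch of that arithmetic follows; the function name and the gene-disease pairs are made up for illustration and are not taken from the study's data.

```python
def precision_recall(predicted: set, reference: set) -> tuple:
    """Return (precision, recall) of predicted pairs against a reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical (gene, disease) pairs, for illustration only.
predicted = {("GENE_A", "disease_1"), ("GENE_B", "disease_2"), ("GENE_C", "disease_3")}
reference = {("GENE_A", "disease_1"), ("GENE_B", "disease_2"),
             ("GENE_D", "disease_4"), ("GENE_E", "disease_5")}

p, r = precision_recall(predicted, reference)
print(f"precision={p:.2f} recall={r:.2f}")
```

    A method can score high on precision while missing most of the reference set, which is the pattern the abstract reports for OMIM (98% precision, 22% recall).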

  15. Crowdsourcing image annotation for nucleus detection and segmentation in computational pathology: evaluating experts, automated methods, and the crowd.

    PubMed

    Irshad, H; Montaser-Kouhsari, L; Waltz, G; Bucur, O; Nowak, J A; Dong, F; Knoblauch, N W; Beck, A H

    2015-01-01

    The development of tools in computational pathology to assist physicians and biomedical scientists in the diagnosis of disease requires access to high-quality annotated images for algorithm learning and evaluation. Generating high-quality expert-derived annotations is time-consuming and expensive. We explore the use of crowdsourcing for rapidly obtaining annotations for two core tasks in computational pathology: nucleus detection and nucleus segmentation. We designed and implemented crowdsourcing experiments using the CrowdFlower platform, which provides access to a large set of labor channel partners that accesses and manages millions of contributors worldwide. We obtained annotations from four types of annotators and compared concordance across these groups. We obtained: crowdsourced annotations for nucleus detection and segmentation on a total of 810 images; annotations using automated methods on 810 images; annotations from research fellows for detection and segmentation on 477 and 455 images, respectively; and expert pathologist-derived annotations for detection and segmentation on 80 and 63 images, respectively. For the crowdsourced annotations, we evaluated performance across a range of contributor skill levels (1, 2, or 3). The crowdsourced annotations (4,860 images in total) were completed in only a fraction of the time and cost required for obtaining annotations using traditional methods. For the nucleus detection task, the research fellow-derived annotations showed the strongest concordance with the expert pathologist-derived annotations (F-M = 93.68%), followed by the crowd-sourced contributor levels 1, 2, and 3 and the automated method, which showed relatively similar performance (F-M = 87.84%, 88.49%, 87.26%, and 86.99%, respectively). For the nucleus segmentation task, the crowdsourced contributor level 3-derived annotations, research fellow-derived annotations, and automated method showed the strongest concordance with the expert pathologist

  16. CROWDSOURCING IMAGE ANNOTATION FOR NUCLEUS DETECTION AND SEGMENTATION IN COMPUTATIONAL PATHOLOGY: EVALUATING EXPERTS, AUTOMATED METHODS, AND THE CROWD

    PubMed Central

    Irshad, H.; Montaser-Kouhsari, L.; Waltz, G.; Bucur, O.; Nowak, J.A.; Dong, F.; Knoblauch, N.W.; Beck, A. H.

    2014-01-01

    The development of tools in computational pathology to assist physicians and biomedical scientists in the diagnosis of disease requires access to high-quality annotated images for algorithm learning and evaluation. Generating high-quality expert-derived annotations is time-consuming and expensive. We explore the use of crowdsourcing for rapidly obtaining annotations for two core tasks in computational pathology: nucleus detection and nucleus segmentation. We designed and implemented crowdsourcing experiments using the CrowdFlower platform, which provides access to a large set of labor channel partners that accesses and manages millions of contributors worldwide. We obtained annotations from four types of annotators and compared concordance across these groups. We obtained: crowdsourced annotations for nucleus detection and segmentation on a total of 810 images; annotations using automated methods on 810 images; annotations from research fellows for detection and segmentation on 477 and 455 images, respectively; and expert pathologist-derived annotations for detection and segmentation on 80 and 63 images, respectively. For the crowdsourced annotations, we evaluated performance across a range of contributor skill levels (1, 2, or 3). The crowdsourced annotations (4,860 images in total) were completed in only a fraction of the time and cost required for obtaining annotations using traditional methods. For the nucleus detection task, the research fellow-derived annotations showed the strongest concordance with the expert pathologist-derived annotations (F−M =93.68%), followed by the crowd-sourced contributor levels 1,2, and 3 and the automated method, which showed relatively similar performance (F−M = 87.84%, 88.49%, 87.26%, and 86.99%, respectively). For the nucleus segmentation task, the crowdsourced contributor level 3-derived annotations, research fellow-derived annotations, and automated method showed the strongest concordance with the expert pathologist

  17. IFA - INTELLIGENT FRONT ANNOTATION PROGRAM

    NASA Technical Reports Server (NTRS)

    Burke, G. R.

    1994-01-01

    An important aspect of an ASIC (Application Specific Integrated Circuit) design process is verification. The design must not only be functionally accurate, but it must also maintain the correct timing. After a circuit has been laid out, one can utilize the Back Annotation (BA) method to simulate the design and obtain an accurate estimate of performance. However, this can lead to major design changes. It is therefore preferable to eliminate potential problems early in this process. IFA, the Intelligent Front Annotation program, assists in verifying the timing of the ASIC early in the design process. Many difficulties can arise during ASIC design. In a synchronous design, both long path and short path problems can be present. In modern ASIC technologies, the delay through a gate is very dependent on loading. This loading has two main components, the capacitance of the gates being driven and the capacitance of the metal tracks (wires). When using GaAs gate arrays, the metal line capacitance is often the dominating factor. Additionally, the RC delay through the wire itself is significant in sub-micron technologies. Since the wire lengths are unknown before place and route of the entire chip, this would seem to postpone any realistic timing verification until towards the end of the design process, obviously an undesirable situation. The IFA program estimates the delays in an ASIC before layout. Currently the program is designed for Vitesse GaAs gate arrays and, for input, requires the expansion file which is output by the program GED; however, the algorithm is appropriate for many different ASIC types and CAE platforms. IFA is especially useful for devices whose delay is extremely dependent on the interconnection wiring. It estimates the length of the interconnects using information supplied by the user and information in the netlist. The resulting wire lengths are also used to constrain the Place and Route program, ensuring reasonable results. 
IFA takes locality into

  18. Annotated checklist of Georgia birds

    USGS Publications Warehouse

    Beaton, G.; Sykes, P.W.; Parrish, J.W.

    2003-01-01

    This edition of the checklist includes 446 species, of which 407 are on the Regular Species List, 8 on the Provisional, and 31 on the Hypothetical. This new publication has been greatly expanded and much revised over the previous checklist (GOS Occasional Publ. No. 10, 1986, 48 pp., 6x9 inches) to a 7x10-inch format with an extensive Literature Cited section added, 22 species added to the Regular List, 2 to the Provisional List, and 9 to the Hypothetical List. Each species account is much more comprehensive than in any previous edition of the checklist. New features include citations for the sources of most information used, high counts of individuals for each species on the Regular List, extreme dates of occurrence within physiographic regions, a list of abbreviations and acronyms, and, for each species, the highest form of verifiable documentation, given with its repository institution and catalog number. This checklist is helpful for anyone working with birds in the Southeastern United States or birding in that region. Sykes' contribution to this fifth edition of the Annotated Checklist of Georgia Birds includes: suggestion of the large format and spiral binding, use of Richard A. Parks' painting of the Barn Owl on the front cover, use of literature citations throughout, and inclusion of high counts for each species. Sykes helped plan all phases of the publication, wrote about 90% of the Introduction and 84 species accounts (Osprey through Red Phalarope), designed the four maps in the introduction section and the format for the Literature Cited, and with Giff Beaton designed the layout of the title page.

  19. MvirDB: Microbial Database of Protein Toxins, Virulence Factors and Antibiotic Resistance Genes for Bio-Defense Applications

    DOE Data Explorer

    Zhou, C. E.; Smith, J.; Lam, M.; Zemla, M. D.; Slezak, T.

    MvirDB is a centralized resource (data warehouse) comprising all publicly accessible, organized sequence data for protein toxins, virulence factors, and antibiotic resistance genes. Protein entries in MvirDB are annotated using a high-throughput, fully automated computational annotation system; annotations are updated periodically to ensure that results are derived using current public database and open-source tool releases. Tools provided for using MvirDB include a web-based browser tool and BLAST interfaces. MvirDB serves researchers in the bio-defense and medical fields. (Taken from page 3 of the PI's paper of the same title, published in Nucleic Acids Research, 2007, Vol. 35, Database Issue (Open Source).)

  20. [Virulence determinant of Chromobacterium violaceum].

    PubMed

    Miki, Tsuyoshi

    2014-01-01

    Chromobacterium violaceum is a Gram-negative bacterium that infects humans and animals, causing fatal sepsis. Infection with C. violaceum is rare in healthy individuals, but once established, it causes severe disease accompanied by abscess formation in the lungs, liver and spleen. Furthermore, C. violaceum is resistant to a broad range of antibiotics, which in some cases makes antimicrobial therapy for this infection difficult. Thus, infection with C. violaceum has a high mortality rate unless appropriate antimicrobial therapy is started early. The infection mechanism, however, had remained completely unknown. We therefore set out to identify virulence factors associated with C. violaceum infection. Two distinct type III secretion systems (TTSSs), encoded by Chromobacterium pathogenicity islands 1/1a and 2 (Cpi-1/-1a and -2), respectively, were thought to be among the most important virulence factors. Our results have shown that the Cpi-1/-1a-encoded TTSS, but not Cpi-2, is indispensable for virulence in a mouse infection model. C. violaceum caused fulminant hepatitis in a Cpi-1/-1a-encoded TTSS-dependent manner. We next identified 16 novel effectors secreted from the Cpi-1/-1a-encoded TTS machinery. Among these effectors, we found that CopE (Chromobacterium outer protein E) has similarities to a guanine nucleotide exchange factor (GEF) for Rho GTPases. CopE acts as a GEF for Rac1 and Cdc42, leading to induction of actin cytoskeletal rearrangement. Interestingly, C. violaceum invades cultured human epithelial cells in a CopE-dependent manner. Finally, inactivation of CopE, either by disruption of the copE gene or by an amino acid point mutation leading to loss of GEF activity, significantly attenuates the mouse virulence of C. violaceum. These results suggest that the Cpi-1/-1a-encoded TTSS is a major virulence determinant for C. violaceum infection, and that CopE contributes in part to the virulence of this pathogen.

  1. Annotated chemical patent corpus: a gold standard for text mining.

    PubMed

    Akhondi, Saber A; Klenner, Alexander G; Tyrchan, Christian; Manchala, Anil K; Boppana, Kiran; Lowe, Daniel; Zimmermann, Marc; Jagarlapudi, Sarma A R P; Sayle, Roger; Kors, Jan A; Muresan, Sorel

    2014-01-01

    Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take a substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups, each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line breaks due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org.

  2. Annotated Chemical Patent Corpus: A Gold Standard for Text Mining

    PubMed Central

    Akhondi, Saber A.; Klenner, Alexander G.; Tyrchan, Christian; Manchala, Anil K.; Boppana, Kiran; Lowe, Daniel; Zimmermann, Marc; Jagarlapudi, Sarma A. R. P.; Sayle, Roger; Kors, Jan A.; Muresan, Sorel

    2014-01-01

    Exploring the chemical and biological space covered by patent applications is crucial in early-stage medicinal chemistry activities. Patent analysis can provide understanding of compound prior art, novelty checking, validation of biological assays, and identification of new starting points for chemical exploration. Extracting chemical and biological entities from patents through manual extraction by expert curators can take a substantial amount of time and resources. Text mining methods can help to ease this process. To validate the performance of such methods, a manually annotated patent corpus is essential. In this study we have produced a large gold standard chemical patent corpus. We developed annotation guidelines and selected 200 full patents from the World Intellectual Property Organization, United States Patent and Trademark Office, and European Patent Office. The patents were pre-annotated automatically and made available to four independent annotator groups, each consisting of two to ten annotators. The annotators marked chemicals in different subclasses, diseases, targets, and modes of action. Spelling mistakes and spurious line breaks due to optical character recognition errors were also annotated. A subset of 47 patents was annotated by at least three annotator groups, from which harmonized annotations and inter-annotator agreement scores were derived. One group annotated the full set. The patent corpus includes 400,125 annotations for the full set and 36,537 annotations for the harmonized set. All patents and annotated entities are publicly available at www.biosemantics.org. PMID:25268232

  3. Antimicrobial Resistance and Virulence: a Successful or Deleterious Association in the Bacterial World?

    PubMed Central

    Beceiro, Alejandro; Tomás, María

    2013-01-01

    SUMMARY Hosts and bacteria have coevolved over millions of years, during which pathogenic bacteria have modified their virulence mechanisms to adapt to host defense systems. Although the spread of pathogens has been hindered by the discovery and widespread use of antimicrobial agents, antimicrobial resistance has increased globally. The emergence of resistant bacteria has accelerated in recent years, mainly as a result of increased selective pressure. However, although antimicrobial resistance and bacterial virulence have developed on different timescales, they share some common characteristics. This review considers how bacterial virulence and fitness are affected by antibiotic resistance and also how the relationship between virulence and resistance is affected by different genetic mechanisms (e.g., coselection and compensatory mutations) and by the most prevalent global responses. The interplay between these factors and the associated biological costs depend on four main factors: the bacterial species involved, virulence and resistance mechanisms, the ecological niche, and the host. The development of new strategies involving new antimicrobials or nonantimicrobial compounds and of novel diagnostic methods that focus on high-risk clones and rapid tests to detect virulence markers may help to resolve the increasing problem of the association between virulence and resistance, which is becoming more beneficial for pathogenic bacteria. PMID:23554414

  4. A Zebrafish Larval Model to Assess Virulence of Porcine Streptococcus suis Strains

    PubMed Central

    Zaccaria, Edoardo; Cao, Rui; Wells, Jerry M.; van Baarlen, Peter

    2016-01-01

    Streptococcus suis is an encapsulated Gram-positive bacterium, and the leading cause of sepsis and meningitis in young pigs resulting in considerable economic losses in the porcine industry. It is also considered an emerging zoonotic agent. In the environment, both avirulent and virulent strains occur in pigs, and virulent strains appear to cause disease in both humans and pigs. There is a need for a convenient, reliable and standardized animal model to assess S. suis virulence. A zebrafish (Danio rerio) larvae infection model has several advantages, including transparency of larvae, low cost, ease of use and exemption from ethical legislation up to 6 days post fertilization, but has not been previously established as a model for S. suis. Microinjection of different porcine strains of S. suis in zebrafish larvae resulted in highly reproducible dose- and strain-dependent larval death, strongly correlating with presence of the S. suis capsule and to the original virulence of the strain in pigs. Additionally we compared the virulence of the two-component system mutant of ciaRH, which is attenuated for virulence in both mice and pigs in vivo. Infection of larvae with the ΔciaRH strain resulted in significantly higher survival rate compared to infection with the S10 wild-type strain. Our data demonstrate that zebrafish larvae are a rapid and reliable model to assess the virulence of clinical porcine S. suis isolates. PMID:26999052

  5. AnnotCompute: annotation-based exploration and meta-analysis of genomics experiments

    PubMed Central

    Zheng, Jie; Stoyanovich, Julia; Manduchi, Elisabetta; Liu, Junmin; Stoeckert, Christian J.

    2011-01-01

    The ever-increasing scale of biological data sets, particularly those arising in the context of high-throughput technologies, requires the development of rich data exploration tools. In this article, we present AnnotCompute, an information discovery platform for repositories of functional genomics experiments such as ArrayExpress. Our system leverages semantic annotations of functional genomics experiments with controlled vocabulary and ontology terms, such as those from the MGED Ontology, to compute conceptual dissimilarities between pairs of experiments. These dissimilarities are then used to support two types of exploratory analysis—clustering and query-by-example. We show that our proposed dissimilarity measures correspond to a user's intuition about conceptual dissimilarity, and can be used to support effective query-by-example. We also evaluate the quality of clustering based on these measures. While AnnotCompute can support a richer data exploration experience, its effectiveness is limited in some cases, due to the quality of available annotations. Nonetheless, tools such as AnnotCompute may provide an incentive for richer annotations of experiments. Code is available for download at http://www.cbil.upenn.edu/downloads/AnnotCompute. Database URL: http://www.cbil.upenn.edu/annotCompute/ PMID:22190598
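
    The abstract above does not give the dissimilarity measures themselves. As a rough illustration of the general idea only, the sketch below uses plain Jaccard dissimilarity over sets of annotation terms and ranks experiments for a query-by-example lookup. The measure, the function names, and the example terms are all assumptions made for illustration and are not taken from AnnotCompute.

```python
def jaccard_dissimilarity(terms_a, terms_b):
    """Return 1 - |A & B| / |A | B| for two collections of annotation terms.

    This is a stand-in measure chosen for illustration; it is not the
    dissimilarity defined by AnnotCompute.
    """
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0  # two unannotated experiments are treated as identical
    return 1.0 - len(a & b) / len(union)


def query_by_example(query_id, experiments):
    """Rank all other experiments by increasing dissimilarity to the query."""
    query_terms = experiments[query_id]
    scored = [
        (jaccard_dissimilarity(query_terms, terms), exp_id)
        for exp_id, terms in experiments.items()
        if exp_id != query_id
    ]
    return [exp_id for _, exp_id in sorted(scored)]


# Hypothetical experiments annotated with made-up controlled-vocabulary terms.
experiments = {
    "exp1": {"Homo sapiens", "liver", "time course"},
    "exp2": {"Homo sapiens", "liver", "dose response"},
    "exp3": {"Mus musculus", "brain", "knockout"},
}
ranking = query_by_example("exp1", experiments)
```

    Here `ranking` lists `exp2` before `exp3`, because `exp2` shares two of four distinct terms with the query while `exp3` shares none.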

  6. The Subsystems Approach to Genome Annotation and its Use in the Project to Annotate 1000 Genomes

    PubMed Central

    Overbeek, Ross; Begley, Tadhg; Butler, Ralph M.; Choudhuri, Jomuna V.; Chuang, Han-Yu; Cohoon, Matthew; de Crécy-Lagard, Valérie; Diaz, Naryttza; Disz, Terry; Edwards, Robert; Fonstein, Michael; Frank, Ed D.; Gerdes, Svetlana; Glass, Elizabeth M.; Goesmann, Alexander; Hanson, Andrew; Iwata-Reuyl, Dirk; Jensen, Roy; Jamshidi, Neema; Krause, Lutz; Kubal, Michael; Larsen, Niels; Linke, Burkhard; McHardy, Alice C.; Meyer, Folker; Neuweger, Heiko; Olsen, Gary; Olson, Robert; Osterman, Andrei; Portnoy, Vasiliy; Pusch, Gordon D.; Rodionov, Dmitry A.; Rückert, Christian; Steiner, Jason; Stevens, Rick; Thiele, Ines; Vassieva, Olga; Ye, Yuzhen; Zagnitko, Olga; Vonstein, Veronika

    2005-01-01

    The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180 177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms. PMID:16214803

  7. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes.

    PubMed

    Overbeek, Ross; Begley, Tadhg; Butler, Ralph M; Choudhuri, Jomuna V; Chuang, Han-Yu; Cohoon, Matthew; de Crécy-Lagard, Valérie; Diaz, Naryttza; Disz, Terry; Edwards, Robert; Fonstein, Michael; Frank, Ed D; Gerdes, Svetlana; Glass, Elizabeth M; Goesmann, Alexander; Hanson, Andrew; Iwata-Reuyl, Dirk; Jensen, Roy; Jamshidi, Neema; Krause, Lutz; Kubal, Michael; Larsen, Niels; Linke, Burkhard; McHardy, Alice C; Meyer, Folker; Neuweger, Heiko; Olsen, Gary; Olson, Robert; Osterman, Andrei; Portnoy, Vasiliy; Pusch, Gordon D; Rodionov, Dmitry A; Rückert, Christian; Steiner, Jason; Stevens, Rick; Thiele, Ines; Vassieva, Olga; Ye, Yuzhen; Zagnitko, Olga; Vonstein, Veronika

    2005-01-01

    The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180 177 distinct proteins with 2133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.

  8. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    PubMed

    Chen, Shu-Chuan; Ogata, Aaron

    2015-01-01

    The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process.

  9. Non-Formal Education and Radio: A Selected, Annotated Bibliography. Annotated Bibliography #14.

    ERIC Educational Resources Information Center

    Vergeldt, Vicki; And Others

    Materials concerning the use of radio and mass communications for non-formal education and development are listed in a selected annotated bibliography, intended for those actively involved in non-formal education and development. Three sections contain annotated entries (which range from 1972-1983), each of which includes source information and…

  10. Listeria Pathogenesis and Molecular Virulence Determinants

    PubMed Central

    Vázquez-Boland, José A.; Kuhn, Michael; Berche, Patrick; Chakraborty, Trinad; Domínguez-Bernal, Gustavo; Goebel, Werner; González-Zorn, Bruno; Wehland, Jürgen; Kreft, Jürgen

    2001-01-01

    , rapid intracytoplasmic multiplication, bacterially induced actin-based motility, and direct spread to neighboring cells, in which they reinitiate the cycle. In this way, listeriae disseminate in host tissues sheltered from the humoral arm of the immune system. Over the last 15 years, a number of virulence factors involved in key steps of this intracellular life cycle have been identified. This review describes in detail the molecular determinants of Listeria virulence and their mechanism of action and summarizes the current knowledge on the pathophysiology of listeriosis and the cell biology and host cell responses to Listeria infection. This article provides an updated perspective of the development of our understanding of Listeria pathogenesis from the first molecular genetic analyses of virulence mechanisms reported in 1985 until the start of the genomic era of Listeria research. PMID:11432815

  11. Genome cartography through domain annotation

    PubMed Central

    Ponting, Chris P; Dickens, Nicholas J

    2001-01-01

    The evolutionary history of eukaryotic proteins involves rapid sequence divergence, addition and deletion of domains, and fusion and fission of genes. Although the protein repertoires of distantly related species differ greatly, their domain repertoires do not. To account for the great diversity of domain contexts and an unexpected paucity of ortholog conservation, we must categorize the coding regions of completely sequenced genomes into domain families, as well as protein families.

  12. The evolution of tuberculosis virulence.

    PubMed

    Basu, Sanjay; Galvani, Alison P

    2009-07-01

    The evolution of Mycobacterium tuberculosis presents several challenges for public health. HIV and resistance to antimycobacterial medications have evolutionary implications for how Mycobacterium tuberculosis will evolve, as these factors influence the host environment and transmission dynamics of tuberculosis strains. We present an evolutionary invasion analysis of tuberculosis that characterizes the direction of tuberculosis evolution in the context of different natural and human-driven selective pressures, including changes in tuberculosis treatment and HIV prevalence. We find that the evolution of tuberculosis virulence can be affected by treatment success rates, the relative transmissibility of emerging strains, the rate of reactivation from latency among hosts, and the life expectancy of hosts. We find that the virulence of tuberculosis strains may also increase as a consequence of rising HIV prevalence, requiring faster case detection strategies in areas where the epidemics of HIV and tuberculosis collide.

  13. Manganese uptake and streptococcal virulence.

    PubMed

    Eijkelkamp, Bart A; McDevitt, Christopher A; Kitten, Todd

    2015-06-01

    Streptococcal solute-binding proteins (SBPs) associated with ATP-binding cassette transporters gained widespread attention first as ostensible adhesins, next as virulence determinants, and finally as metal ion transporters. In this mini-review, we will examine our current understanding of the cellular roles of these proteins, their contribution to metal ion homeostasis, and their crucial involvement in mediating streptococcal virulence. There are now more than 35 studies that have collected structural, biochemical and/or physiological data on the functions of SBPs across a broad range of bacteria. This offers a wealth of data to clarify the formerly puzzling and contentious findings regarding the metal specificity amongst this group of essential bacterial transporters. In particular we will focus on recent findings related to biological roles for manganese in streptococci. These advances will inform efforts aimed at exploiting the importance of manganese and manganese acquisition for the design of new approaches to combat serious streptococcal diseases.

  14. Campylobacter virulence and survival factors.

    PubMed

    Bolton, Declan J

    2015-06-01

    Despite over 30 years of research, campylobacteriosis is the most prevalent foodborne bacterial infection in many countries including in the European Union and the United States of America. However, relatively little is known about the virulence factors in Campylobacter or how an apparently fragile organism can survive in the food chain, often with enhanced pathogenicity. This review collates information on the virulence and survival determinants including motility, chemotaxis, adhesion, invasion, multidrug resistance, bile resistance and stress response factors. It discusses their function in transition through the food processing environment and human infection. In doing so it provides a fundamental understanding of Campylobacter, critical for improved diagnosis, surveillance and control. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. A proposal for the reference-based annotation of de novo transposable element insertions.

    PubMed

    Bergman, Casey M

    2012-01-01

    Understanding the causes and consequences of transposable element (TE) activity in the genomic era requires sophisticated bioinformatics approaches to accurately identify individual insertion sites. Next-generation sequencing technology now makes it possible to rapidly identify new TE insertions using resequencing data, opening up new possibilities to study the nature of TE-induced mutation and the target site preferences of different TE families. While the identification of new TE insertion sites is seemingly a simple task, the mechanisms of transposition present unique challenges for the annotation of de novo transposable element insertions mapped to a reference genome. Here I discuss these challenges and propose a framework for the annotation of de novo TE insertions that accommodates known mechanisms of TE insertion and established coordinate systems for genome annotation.

  16. Virulence factor activity relationships (VFARs): a bioinformatics perspective.

    PubMed

    Waseem, Hassan; Williams, Maggie R; Stedtfeld, Tiffany; Chai, Benli; Stedtfeld, Robert D; Cole, James R; Tiedje, James M; Hashsham, Syed A

    2017-03-06

    Virulence factor activity relationships (VFARs), a concept loosely based on quantitative structure-activity relationships (QSARs) for chemicals, were proposed as a predictive tool for ranking risks due to microorganisms relevant to water safety. A rapid increase in sequencing capabilities and bioinformatics tools has significantly increased the potential for VFAR-based analyses. This review summarizes more than 20 bioinformatics databases and tools, developed over the last decade, along with their virulence and antimicrobial resistance prediction capabilities. With the number of bacterial whole genome sequences exceeding 241 000 and metagenomic analysis projects exceeding 13 000, and the ability to add additional genome sequences for a few hundred dollars, it is evident that further development of VFARs is not limited by the availability of information, at least at the genomic level. However, additional information related to co-occurrence, treatment response, modulation of virulence due to environmental and other factors, and economic impact must be gathered and incorporated in a manner that also addresses the associated uncertainties. Of the bioinformatics tools, a majority are either designed exclusively for virulence/resistance determination or equipped with a dedicated module. The remaining tools have the potential to be employed for evaluating virulence. This review, focusing broadly on omics technologies and tools, supports the notion that these tools are now sufficiently developed to allow the application of VFAR approaches combined with additional engineering and economic analyses to rank and prioritize organisms important to a given niche. Knowledge gaps do exist but can be filled with focused experimental and theoretical analyses that were unimaginable a decade ago. Further developments should consider the integration of the measurement of activity, risk, and uncertainty to improve the current capabilities.

  17. Rag Virulence Among Soybean Aphids (Hemiptera: Aphididae) in Wisconsin.

    PubMed

    Crossley, Michael S; Hogg, David B

    2015-02-01

    Soybean aphid, Aphis glycines Matsumura, a pest of soybean, Glycine max (L.) Merr., and native of Asia, invaded North America sometime before 2000 and rapidly became the most significant insect pest of soybean in the upper Midwest. Plant resistance, a key component of integrated pest management, has received significant attention in the past decade, and several resistance (Rag) genes have been identified. However, the efficacy of Rag (Resistance to Aphis glycines) genes in suppressing aphid abundance has been challenged by the occurrence of soybean aphids capable of overcoming Rag gene-mediated resistance. Although the occurrence of these Rag virulent biotypes poses a serious threat to effective and sustainable management of soybean aphid, little is known about the current abundance of biotypes in North America. The objective of this research was to determine the distribution of Rag virulent soybean aphids in Wisconsin. Soybean aphids were collected from Wisconsin during the summers of 2012 and 2013, and assayed for Rag1, Rag2, and Rag1+2 virulence using no-choice tests in a greenhouse. One clone from Monroe County in 2012 reacted like biotype 4, three clones in different counties in 2013 responded like biotype 2, and eight others expressed varying degrees of Rag virulence. Rag virulence in 2013 was observed in aphids from 33% of the sampled sites and was accounted for by just 4.5% of sampled clones, although this is likely a conservative estimate. No-choice test results are discussed in light of current questions on the biology, ecology, and population genetics of soybean aphid. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved.

  18. GeneTools--application for functional annotation and statistical hypothesis testing.

    PubMed

    Beisvag, Vidar; Jünge, Frode K R; Bergum, Hallgeir; Jølsum, Lars; Lydersen, Stian; Günther, Clara-Cecilie; Ramampiaro, Heri; Langaas, Mette; Sandvik, Arne K; Laegreid, Astrid

    2006-10-24

    " annotation tool, providing users with a rapid extraction of highly relevant gene annotation data for e.g. thousands of genes or clones at once. It allows a user to define and archive new GO annotations and it supports hypothesis testing related to GO category representations. GeneTools is freely available through www.genetools.no

  19. GeneTools – application for functional annotation and statistical hypothesis testing

    PubMed Central

    Beisvag, Vidar; Jünge, Frode KR; Bergum, Hallgeir; Jølsum, Lars; Lydersen, Stian; Günther, Clara-Cecilie; Ramampiaro, Heri; Langaas, Mette; Sandvik, Arne K; Lægreid, Astrid

    2006-01-01

    Tools is the first "all in one" annotation tool, providing users with a rapid extraction of highly relevant gene annotation data for e.g. thousands of genes or clones at once. It allows a user to define and archive new GO annotations and it supports hypothesis testing related to GO category representations. GeneTools is freely available through www.genetools.no PMID:17062145

  20. Uropathogenic Escherichia coli virulence genes: invaluable approaches for designing DNA microarray probes.

    PubMed

    Jahandeh, Nadia; Ranjbar, Reza; Behzadi, Payam; Behzadi, Elham

    2015-01-01

    The pathotypes of uropathogenic Escherichia coli (UPEC) cause different types of urinary tract infections (UTIs). The presence of a wide range of virulence genes in UPEC enables us to design appropriate DNA microarray probes. These probes, which are used in DNA microarray technology, provide us with an accurate and rapid diagnosis and definitive treatment in association with UTIs caused by UPEC pathotypes. The main goal of this article is to introduce the UPEC virulence genes as invaluable approaches for designing DNA microarray probes. Main search engines such as Google Scholar and databases like NCBI were searched to find and study several original pieces of literature, review articles, and DNA gene sequences. In parallel with in silico studies, the experiences of the authors were helpful for selecting appropriate sources and writing this review article. There is a significant variety of virulence genes among UPEC strains. The DNA sequences of virulence genes are fabulous patterns for designing microarray probes. The location of virulence genes and their sequence lengths influence the quality of probes. The use of selected virulence genes for designing microarray probes gives us a wide range of choices from which the best probe candidates can be chosen. DNA microarray technology provides us with an accurate, rapid, cost-effective, sensitive, and specific molecular diagnostic method which is facilitated by designing microarray probes. Via these tools, we are able to have an accurate diagnosis and a definitive treatment regarding UTIs caused by UPEC pathotypes.

  1. Active learning reduces annotation time for clinical concept extraction.

    PubMed

    Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony

    2017-10-01

    To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations. Copyright © 2017 Elsevier B.V. All rights reserved.
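
    The abstract above refers to active learning query strategies without naming them. One common strategy is least-confidence uncertainty sampling, sketched below under the assumption that a model already supplies label probabilities for each unlabeled sequence. The function name, the sequence ids, and the probabilities are invented for illustration, and this is not necessarily one of the strategies evaluated in the study.

```python
def least_confidence_query(predicted_probs, batch_size):
    """Pick the sequences whose most probable label has the lowest probability.

    predicted_probs maps a sequence id to the model's label probabilities
    for that sequence (hypothetical inputs for this sketch).
    """
    ranked = sorted(predicted_probs, key=lambda seq_id: max(predicted_probs[seq_id]))
    return ranked[:batch_size]


# Made-up probabilities for three unlabeled sequences.
pool = {
    "s1": [0.90, 0.05, 0.05],  # model is confident
    "s2": [0.40, 0.35, 0.25],  # model is least confident
    "s3": [0.60, 0.30, 0.10],
}
to_annotate = least_confidence_query(pool, batch_size=2)
```

    With these numbers the annotator would be handed `s2` and then `s3`, the two sequences on which the model is least sure.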

  2. Web and personal image annotation by mining label correlation with relaxed visual graph embedding.

    PubMed

    Yang, Yi; Wu, Fei; Nie, Feiping; Shen, Heng Tao; Zhuang, Yueting; Hauptmann, Alexander G

    2012-03-01

    The number of digital images rapidly increases, and it becomes an important challenge to organize these resources effectively. As a way to facilitate image categorization and retrieval, automatic image annotation has received much research attention. Considering that there are a great number of unlabeled images available, it is beneficial to develop an effective mechanism to leverage unlabeled images for large-scale image annotation. Meanwhile, a single image is usually associated with multiple labels, which are inherently correlated to each other. A straightforward method of image annotation is to decompose the problem into multiple independent single-label problems, but this ignores the underlying correlations among different labels. In this paper, we propose a new inductive algorithm for image annotation by integrating label correlation mining and visual similarity mining into a joint framework. We first construct a graph model according to image visual features. A multilabel classifier is then trained by simultaneously uncovering the shared structure common to different labels and the visual graph embedded label prediction matrix for image annotation. We show that the globally optimal solution of the proposed framework can be obtained by performing generalized eigen-decomposition. We apply the proposed framework to both web image annotation and personal album labeling using the NUS-WIDE, MSRA MM 2.0, and Kodak image data sets, and the AUC evaluation metric. Extensive experiments on large-scale image databases collected from the web and personal album show that the proposed algorithm is capable of utilizing both labeled and unlabeled data for image annotation and outperforms other algorithms.

  3. Evaluation of training with an annotation schema for manual annotation of clinical conditions from emergency department reports.

    PubMed

    Chapman, Wendy W; Dowling, John N; Hripcsak, George

    2008-02-01

    Determine whether agreement among annotators improves after being trained to use an annotation schema that specifies: what types of clinical conditions to annotate, the linguistic form of the annotations, and which modifiers to include. Three physicians and 3 lay people individually annotated all clinical conditions in 23 emergency department reports. For annotations made using a Baseline Schema and annotations made after training on a detailed annotation schema, we compared: (1) variability of annotation length and number and (2) annotator agreement, using the F-measure. Physicians showed higher agreement and lower variability after training on the detailed annotation schema than when applying the Baseline Schema. Lay people agreed with physicians almost as well as other physicians did but showed a slower learning curve. Training annotators on the annotation schema we developed increased agreement among annotators and should be useful in generating reference standard sets for natural language processing studies. The methodology we used to evaluate the schema could be applied to other types of annotation or classification tasks in biomedical informatics.
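
    As a minimal sketch of how pairwise annotator agreement can be scored with the F-measure, the code below treats one annotator as the reference and the other as the response, and counts only exact matches. The exact-match criterion, the `(start, end, text)` tuple format, and the example annotations are assumptions for illustration; the abstract does not state how matches were determined.

```python
def f_measure(reference, response):
    """Balanced F-measure between two annotators' sets of annotations.

    Each annotation is any hashable item; here a (start, end, text) tuple.
    Only exact matches count, which is an assumption of this sketch.
    """
    ref, res = set(reference), set(response)
    if not ref and not res:
        return 1.0  # both annotators marked nothing: perfect agreement
    matches = len(ref & res)
    if matches == 0:
        return 0.0
    precision = matches / len(res)
    recall = matches / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up annotations of one report by two annotators.
physician = {(0, 10, "chest pain"), (25, 30, "fever"), (40, 59, "shortness of breath")}
lay_person = {(0, 10, "chest pain"), (25, 30, "fever"), (70, 75, "cough")}
agreement = f_measure(physician, lay_person)
```

    Two of the three annotations match on each side, so precision and recall are both 2/3 and `agreement` is 2/3. With exact matching the measure is symmetric, so swapping the two annotators gives the same value.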

  4. Gene and alternative splicing annotation with AIR

    PubMed Central

    Florea, Liliana; Di Francesco, Valentina; Miller, Jason; Turner, Russell; Yao, Alison; Harris, Michael; Walenz, Brian; Mobarry, Clark; Merkulov, Gennady V.; Charlab, Rosane; Dew, Ian; Deng, Zuoming; Istrail, Sorin; Li, Peter; Sutton, Granger

    2005-01-01

    Designing effective and accurate tools for identifying the functional and structural elements in a genome remains at the frontier of genome annotation owing to incompleteness and inaccuracy of the data, limitations in the computational models, and shifting paradigms in genomics, such as alternative splicing. We present a methodology for the automated annotation of genes and their alternatively spliced mRNA transcripts based on existing cDNA and protein sequence evidence from the same species or projected from a related species using syntenic mapping information. At the core of the method is the splice graph, a compact representation of a gene, its exons, introns, and alternatively spliced isoforms. The putative transcripts are enumerated from the graph and assigned confidence scores based on the strength of sequence evidence, and a subset of the high-scoring candidates are selected and promoted into the annotation. The method is highly selective, eliminating the unlikely candidates while retaining 98% of the high-quality mRNA evidence in well-formed transcripts, and produces annotation that is measurably more accurate than some evidence-based gene sets. The process is fast, accurate, and fully automated, and combines the traditionally distinct gene annotation and alternative splicing detection processes in a comprehensive and systematic way, thus considerably aiding in the ensuing manual curation efforts. PMID:15632090
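
    The splice-graph idea described above can be sketched as follows: exons are nodes, observed splice junctions are directed edges, and each path from a first exon to a last exon is a candidate transcript. The scoring rule used here (the weakest junction support along the path) and the threshold are stand-ins chosen for illustration, not the confidence scores used by AIR, and the exon names and counts are made up.

```python
def enumerate_transcripts(splice_graph, sources, sinks):
    """List every source-to-sink exon path in an acyclic splice graph."""
    transcripts = []

    def extend(path):
        exon = path[-1]
        if exon in sinks:
            transcripts.append(tuple(path))
        for next_exon in splice_graph.get(exon, ()):
            extend(path + [next_exon])

    for start in sources:
        extend([start])
    return transcripts


def weakest_junction_score(transcript, junction_support):
    """Score a path by its least-supported junction (illustrative rule only)."""
    junctions = zip(transcript, transcript[1:])
    return min((junction_support.get(j, 0) for j in junctions), default=0)


# Hypothetical gene with an optional exon E2 and invented evidence counts.
splice_graph = {"E1": ["E2", "E3"], "E2": ["E3"], "E3": ["E4"], "E4": []}
junction_support = {("E1", "E2"): 5, ("E2", "E3"): 4, ("E1", "E3"): 1, ("E3", "E4"): 6}

candidates = enumerate_transcripts(splice_graph, sources={"E1"}, sinks={"E4"})
promoted = [t for t in candidates if weakest_junction_score(t, junction_support) >= 2]
```

    Two candidates come out of the graph, one including E2 and one skipping it. Only the E2-containing isoform passes the illustrative threshold, because the E1 to E3 junction has a single supporting observation.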

  5. Automated analysis and annotation of basketball video

    NASA Astrophysics Data System (ADS)

    Saur, Drew D.; Tan, Yap-Peng; Kulkarni, Sanjeev R.; Ramadge, Peter J.

    1997-01-01

    Automated analysis and annotation of video sequences are important for digital video libraries, content-based video browsing and data mining projects. A successful video annotation system should provide users with a useful video content summary in a reasonable processing time. Given the wide variety of video genres available today, automatically extracting meaningful video content for annotation remains hard with currently available techniques. However, a wide range of video has inherent structure such that some prior knowledge about the video content can be exploited to improve our understanding of the high-level video semantic content. In this paper, we develop tools and techniques for analyzing structured video by using the low-level information available directly from MPEG compressed video. Being able to work directly in the video compressed domain can greatly reduce the processing time and enhance storage efficiency. As a testbed, we have developed a basketball annotation system which combines the low-level information extracted from the MPEG stream with the prior knowledge of basketball video structure to provide high-level content analysis, annotation and browsing for events such as wide-angle and close-up views, fast breaks, steals, potential shots, number of possessions and possession times. We expect our approach can also be extended to structured video in other domains.

  6. Gene and alternative splicing annotation with AIR.

    PubMed

    Florea, Liliana; Di Francesco, Valentina; Miller, Jason; Turner, Russell; Yao, Alison; Harris, Michael; Walenz, Brian; Mobarry, Clark; Merkulov, Gennady V; Charlab, Rosane; Dew, Ian; Deng, Zuoming; Istrail, Sorin; Li, Peter; Sutton, Granger

    2005-01-01

    Designing effective and accurate tools for identifying the functional and structural elements in a genome remains at the frontier of genome annotation owing to incompleteness and inaccuracy of the data, limitations in the computational models, and shifting paradigms in genomics, such as alternative splicing. We present a methodology for the automated annotation of genes and their alternatively spliced mRNA transcripts based on existing cDNA and protein sequence evidence from the same species or projected from a related species using syntenic mapping information. At the core of the method is the splice graph, a compact representation of a gene, its exons, introns, and alternatively spliced isoforms. The putative transcripts are enumerated from the graph and assigned confidence scores based on the strength of sequence evidence, and a subset of the high-scoring candidates are selected and promoted into the annotation. The method is highly selective, eliminating the unlikely candidates while retaining 98% of the high-quality mRNA evidence in well-formed transcripts, and produces annotation that is measurably more accurate than some evidence-based gene sets. The process is fast, accurate, and fully automated, and combines the traditionally distinct gene annotation and alternative splicing detection processes in a comprehensive and systematic way, thus considerably aiding in the ensuing manual curation efforts.
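
    The splice-graph step described above can be illustrated with a minimal sketch. It assumes an acyclic graph whose nodes are exons and whose edges are introns supported by evidence; the function names, the exon labels, and the "weakest edge" confidence score are hypothetical stand-ins and are not taken from the AIR paper.

```python
from collections import defaultdict


def enumerate_transcripts(edges, sources, sinks):
    """Enumerate every source-to-sink path in an acyclic splice graph.

    edges   -- iterable of (exon_a, exon_b) pairs, one per supported intron
    sources -- exons that may start a transcript
    sinks   -- exons that may end a transcript
    """
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)

    paths = []

    def walk(node, path):
        path = path + [node]
        if node in sinks:
            paths.append(path)
        for nxt in graph[node]:
            walk(nxt, path)

    for s in sources:
        walk(s, [])
    return paths


def score(path, support):
    """Hypothetical confidence: the weakest evidence count along the path."""
    pairs = list(zip(path, path[1:]))
    return min(support.get(p, 0) for p in pairs) if pairs else 0


def select_transcripts(paths, support, threshold):
    """Keep (path, score) pairs whose score reaches the threshold."""
    scored = [(p, score(p, support)) for p in paths]
    return [(p, s) for p, s in scored if s >= threshold]


# Illustrative evidence counts (number of supporting cDNA alignments per intron).
SUPPORT = {("E1", "E2"): 5, ("E2", "E4"): 4, ("E1", "E3"): 1, ("E3", "E4"): 2}

candidates = enumerate_transcripts(SUPPORT.keys(), ["E1"], {"E4"})
promoted = select_transcripts(candidates, SUPPORT, threshold=2)
print(promoted)
```

    With these illustrative counts, two candidate isoforms are enumerated and only the one whose every intron has at least two supporting alignments is promoted.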

  7. Annotating nonspecific SAGE tags with microarray data.

    PubMed

    Ge, Xijin; Jung, Yong-Chul; Wu, Qingfa; Kibbe, Warren A; Wang, San Ming

    2006-01-01

    SAGE (serial analysis of gene expression) detects transcripts by extracting short tags from the transcripts. Because of the limited length, many SAGE tags are shared by transcripts from different genes. Relying on sequence information in the general gene expression database has limited power to solve this problem due to the highly heterogeneous nature of the deposited sequences. Considering that the complexity of gene expression at a single tissue level should be much simpler than that in the general expression database, we reasoned that by restricting gene expression to tissue level, the accuracy of gene annotation for the nonspecific SAGE tags should be significantly improved. To test the idea, we developed a tissue-specific SAGE annotation database based on microarray data. This database contains microarray expression information represented as UniGene clusters for 73 normal human tissues and 18 cancer tissues and cell lines. The nonspecific SAGE tag is first matched to the database by the same tissue type used by both SAGE and microarray analysis; then the multiple UniGene clusters assigned to the nonspecific SAGE tag are searched in the database under the matched tissue type. The UniGene cluster presented solely or at higher expression levels in the database is annotated to represent the specific gene for the nonspecific SAGE tags. The accuracy of gene annotation by this database was largely confirmed by experimental data. Our study shows that microarray data provide a useful source for annotating the nonspecific SAGE tags.
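
    The lookup rule described above (match the tissue, then keep the candidate cluster that is present alone or at the highest level) might be sketched as follows. The data layout, the cluster identifiers, and the function name are hypothetical and only illustrate the rule; they do not reproduce the authors' database.

```python
def annotate_tag(candidate_clusters, tissue, expression):
    """Pick the candidate cluster with the highest expression in one tissue.

    candidate_clusters -- cluster IDs that share the nonspecific SAGE tag
    tissue             -- tissue type of the SAGE library
    expression         -- dict: tissue -> dict: cluster ID -> expression level
    Returns None when no candidate is expressed in that tissue.
    """
    levels = expression.get(tissue, {})
    expressed = {c: levels[c] for c in candidate_clusters if levels.get(c, 0) > 0}
    if not expressed:
        return None
    return max(expressed, key=expressed.get)


# Made-up expression levels for two made-up clusters sharing one tag.
EXPRESSION = {
    "liver": {"Hs.A": 120.0, "Hs.B": 3.5},
    "brain": {"Hs.B": 80.0},
}

print(annotate_tag(["Hs.A", "Hs.B"], "liver", EXPRESSION))
print(annotate_tag(["Hs.A", "Hs.B"], "brain", EXPRESSION))
```

    Returning None for an unmatched tissue keeps an unresolved tag visibly unresolved rather than forcing an assignment.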

  8. MPEG-7 based video annotation and browsing

    NASA Astrophysics Data System (ADS)

    Hoeynck, Michael; Auweiler, Thorsten; Wellhausen, Jens

    2003-11-01

    The huge amount of multimedia data produced worldwide requires annotation in order to enable universal content access and to provide content-based search-and-retrieval functionalities. Since manual video annotation can be time consuming, automatic annotation systems are required. We review recent approaches to content-based indexing and annotation of videos for different kinds of sports and describe our approach to automatic annotation of equestrian sports videos. We especially concentrate on MPEG-7 based feature extraction and content description, where we apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information. Having determined single shot positions as well as the visual highlights, the information is jointly stored with meta-textual information in an MPEG-7 description scheme. Based on this information, we generate content summaries which can be utilized in a user interface to provide content-based access to the video stream, and also for media browsing on a streaming server.

  9. Salmonella-secreted Virulence Factors

    SciTech Connect

    Heffron, Fred; Niemann, George; Yoon, Hyunjin; Kidwai, Afshan S.; Brown, Roslyn N.; McDermott, Jason E.; Smith, Richard D.; Adkins, Joshua N.

    2011-05-01

    In this short review we discuss secreted virulence factors of Salmonella, which directly affect Salmonella interaction with its host. Salmonella secretes protein to subvert host defenses but also, as discussed, to reduce virulence, thereby permitting the bacteria to persist longer and disperse more successfully. The type III secretion system (TTSS) is the best known and best studied of the mechanisms that enable secretion from the bacterial cytoplasm to the host cell cytoplasm. Other secretion systems, including outer membrane vesicles (which are present in all Gram-negative bacteria examined to date), two-partner secretion, and type VI secretion, will also be addressed. Excellent reviews of Salmonella secreted effectors have focused on themes such as actin rearrangements, vesicular trafficking, ubiquitination, and the activities of the virulence factors themselves. This short review is based on S. Typhimurium infection of mice because it is a model of typhoid-like disease in humans. We have organized effectors in terms of events that happen during the infection cycle and how secreted effectors may be involved.

  10. A Pilot Study on Developing a Standardized and Sensitive School Violence Risk Assessment with Manual Annotation.

    PubMed

    Barzman, Drew H; Ni, Yizhao; Griffey, Marcus; Patel, Bianca; Warren, Ashaki; Latessa, Edward; Sorter, Michael

    2017-09-01

    School violence has increased over the past decade and innovative, sensitive, and standardized approaches to assess school violence risk are needed. In our current feasibility study, we initialized a standardized, sensitive, and rapid school violence risk approach with manual annotation. Manual annotation is the process of analyzing a student's transcribed interview to extract relevant information (e.g., key words) to school violence risk levels that are associated with students' behaviors, attitudes, feelings, use of technology (social media and video games), and other activities. In this feasibility study, we first implemented school violence risk assessments to evaluate risk levels by interviewing the student and parent separately at the school or the hospital to complete our novel school safety scales. We completed 25 risk assessments, resulting in 25 transcribed interviews of 12-18 year olds from 15 schools in Ohio and Kentucky. We then analyzed structured professional judgments, language, and patterns associated with school violence risk levels by using manual annotation and statistical methodology. To analyze the student interviews, we initiated the development of an annotation guideline to extract key information that is associated with students' behaviors, attitudes, feelings, use of technology and other activities. Statistical analysis was applied to associate the significant categories with students' risk levels to identify key factors which will help with developing action steps to reduce risk. In a future study, we plan to recruit more subjects in order to fully develop the manual annotation which will result in a more standardized and sensitive approach to school violence assessments.

  11. ConsPred: a rule-based (re-)annotation framework for prokaryotic genomes.

    PubMed

    Weinmaier, Thomas; Platzer, Alexander; Frank, Jeroen; Hellinger, Hans-Jörg; Tischler, Patrick; Rattei, Thomas

    2016-11-01

    The rapidly growing number of available prokaryotic genome sequences requires fully automated and high-quality software solutions for their initial and re-annotation. Here we present ConsPred, a prokaryotic genome annotation framework that performs intrinsic gene predictions, homology searches, predictions of non-coding genes as well as CRISPR repeats and integrates all evidence into a consensus annotation. ConsPred achieves comprehensive, high-quality annotations based on rules and priorities, similar to decision-making in manual curation and avoids conflicting predictions. Parameters controlling the annotation process are configurable by the user. ConsPred has been used in the institutions of the authors for longer than 5 years and can easily be extended and adapted to specific needs. The ConsPred algorithm for producing a consensus from the varying scores of multiple gene prediction programs approaches manual curation in accuracy. Its rule-based approach for choosing final predictions avoids overriding previous manual curations. ConsPred is implemented in Java, Perl and Shell and is freely available under the Creative Commons license as a stand-alone in-house pipeline or as an Amazon Machine Image for cloud computing, see https://sourceforge.net/projects/conspred/. Contact: thomas.rattei@univie.ac.at. Supplementary information: Supplementary data are available at Bioinformatics online.

  12. Automatic Annotation of Spatial Expression Patterns via Sparse Bayesian Factor Models

    PubMed Central

    Pruteanu-Malinici, Iulian; Mace, Daniel L.; Ohler, Uwe

    2011-01-01

    Advances in reporters for gene expression have made it possible to document and quantify expression patterns in 2D–4D. In contrast to microarrays, which provide data for many genes but averaged and/or at low resolution, images reveal the high spatial dynamics of gene expression. Developing computational methods to compare, annotate, and model gene expression based on images is imperative, considering that available data are rapidly increasing. We have developed a sparse Bayesian factor analysis model in which the observed expression diversity among a large set of high-dimensional images is modeled by a small number of hidden common factors. We apply this approach to embryonic expression patterns from a Drosophila RNA in situ image database, and show that the automatically inferred factors provide for a meaningful decomposition and represent common co-regulation or biological functions. The low-dimensional set of factor mixing weights is further used as features by a classifier to annotate expression patterns with functional categories. On human-curated annotations, our sparse approach reaches similar or better classification of expression patterns at different developmental stages, when compared to other automatic image annotation methods using thousands of hard-to-interpret features. Our study therefore outlines a general framework for large microscopy data sets, in which both the generative model itself, as well as its application for analysis tasks such as automated annotation, can provide insight into biological questions. PMID:21814502

  13. AnnoLnc: a web server for systematically annotating novel human lncRNAs.

    PubMed

    Hou, Mei; Tang, Xing; Tian, Feng; Shi, Fangyuan; Liu, Fenglin; Gao, Ge

    2016-11-16

    Long noncoding RNAs (lncRNAs) have been shown to play essential roles in almost every important biological process through multiple mechanisms. Although the repertoire of human lncRNAs has rapidly expanded, their biological function and regulation remain largely elusive, calling for a systematic and integrative annotation tool. Here we present AnnoLnc (http://annolnc.cbi.pku.edu.cn), a one-stop portal for systematically annotating novel human lncRNAs. Based on more than 700 data sources and various tool chains, AnnoLnc enables a systematic annotation covering genomic location, secondary structure, expression patterns, transcriptional regulation, miRNA interaction, protein interaction, genetic association and evolution. An intuitive web interface is available for interactive analysis through both desktops and mobile devices, and programmers can further integrate AnnoLnc into their pipeline through standard JSON-based Web Service APIs. To the best of our knowledge, AnnoLnc is the only web server to provide on-the-fly and systematic annotation for newly identified human lncRNAs. Compared with similar tools, the annotation generated by AnnoLnc covers a much wider spectrum with intuitive visualization. Case studies demonstrate the power of AnnoLnc in not only rediscovering known functions of human lncRNAs but also inspiring novel hypotheses.

  14. The AnnoLite and AnnoLyze programs for comparative annotation of protein structures

    PubMed Central

    Marti-Renom, Marc A; Rossi, Andrea; Al-Shahrour, Fátima; Davis, Fred P; Pieper, Ursula; Dopazo, Joaquín; Sali, Andrej

    2007-01-01

    Background: Advances in structural biology, including structural genomics, have resulted in a rapid increase in the number of experimentally determined protein structures. However, about half of the structures deposited by the structural genomics consortia have little or no information about their biological function. Therefore, there is a need for tools for automatically and comprehensively annotating the function of protein structures. We aim to provide such tools by applying comparative protein structure annotation that relies on detectable relationships between protein structures to transfer functional annotations. Here we introduce two programs, AnnoLite and AnnoLyze, which use the structural alignments deposited in the DBAli database. Description: AnnoLite predicts the SCOP, CATH, EC, InterPro, PfamA, and GO terms with an average sensitivity of ~90% and average precision of ~80%. AnnoLyze predicts ligand binding site and domain interaction patches with an average sensitivity of ~70% and average precision of ~30%, correctly localizing binding sites for small molecules in ~95% of its predictions. Conclusion: The AnnoLite and AnnoLyze programs for comparative annotation of protein structures can reliably and automatically annotate new protein structures. The programs are fully accessible via the Internet as part of the DBAli suite of tools at . PMID:17570147

  15. Quantifying Variability of Manual Annotation in Cryo-Electron Tomograms

    PubMed Central

    Hecksel, Corey W.; Darrow, Michele C.; Dai, Wei; Galaz-Montoya, Jesús G.; Chin, Jessica A.; Mitchell, Patrick G.; Chen, Shurui; Jakana, Jemba; Schmid, Michael F.; Chiu, Wah

    2016-01-01

    Although acknowledged to be variable and subjective, manual annotation of cryo-electron tomography data is commonly used to answer structural questions and to create a “ground truth” for evaluation of automated segmentation algorithms. Validation of such annotation is lacking, but is critical for understanding the reproducibility of manual annotations. Here, we used voxel-based similarity scores for a variety of specimens, ranging in complexity and segmented by several annotators, to quantify the variation among their annotations. In addition, we have identified procedures for merging annotations to reduce variability, thereby increasing the reliability of manual annotation. Based on our analyses, we find that it is necessary to combine multiple manual annotations to increase the confidence level for answering structural questions. We also make recommendations to guide algorithm development for automated annotation of features of interest. PMID:27225525
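
    As a rough illustration of a voxel-based similarity score and of merging several annotations, the sketch below uses the Dice coefficient and a strict majority vote. Both choices are assumptions made for illustration; the abstract does not state which score or merging procedure the authors used, and the function names are hypothetical.

```python
from collections import Counter


def dice(a, b):
    """Dice coefficient between two annotations given as sets of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty annotations agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def majority_merge(annotations):
    """Keep voxels marked by strictly more than half of the annotators."""
    counts = Counter(v for ann in annotations for v in set(ann))
    needed = len(annotations) / 2
    return {v for v, n in counts.items() if n > needed}


# Three toy annotations of the same feature, as (z, y, x) voxel coordinates.
ann1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
ann2 = {(0, 0, 0), (0, 0, 1), (1, 1, 1)}
ann3 = {(0, 0, 0), (0, 1, 0), (0, 0, 1)}

merged = majority_merge([ann1, ann2, ann3])
for name, ann in (("ann1", ann1), ("ann2", ann2), ("ann3", ann3)):
    print(name, round(dice(ann, merged), 3))
```

    Comparing each annotator against the merged result, rather than only pairwise, gives one number per annotator that is easy to track as more annotations are added.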

  16. Comparative transcriptome analysis of salivary glands of two populations of rice brown planthopper, Nilaparvata lugens, that differ in virulence.

    PubMed

    Ji, Rui; Yu, Haixin; Fu, Qiang; Chen, Hongdan; Ye, Wenfeng; Li, Shaohui; Lou, Yonggen

    2013-01-01

    The brown planthopper (BPH), Nilaparvata lugens (Stål), a destructive rice pest in Asia, can quickly overcome rice resistance by evolving new virulent populations. Herbivore saliva plays an important role in plant-herbivore interactions, including in plant defense and herbivore virulence. However, thus far little is known about BPH saliva at the molecular level, especially its role in virulence and BPH-rice interaction. Using cDNA amplification in combination with Illumina short-read sequencing technology, we sequenced the salivary-gland transcriptomes of two BPH populations with different virulence; the populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 37,666 and 38,451 unigenes were generated from the salivary glands of these populations, respectively. When combined, a total of 43,312 unigenes were obtained, about 18 times more than the number of expressed sequence tags previously identified from these glands. Gene ontology annotations and KEGG orthology classifications indicated that genes related to metabolism, binding and transport were significantly active in the salivary glands. A total of 352 genes were predicted to encode secretory proteins, and some might play important roles in BPH feeding and BPH-rice interactions. Comparative analysis of the transcriptomes of the two populations revealed that the genes related to 'metabolism,' 'digestion and absorption,' and 'salivary secretion' might be associated with virulence. Moreover, 67 genes encoding putative secreted proteins were differentially expressed between the two populations, suggesting these genes may contribute to the change in virulence. This study was the first to compare the salivary-gland transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. Our data provide a rich molecular resource for future functional studies on salivary glands and will be useful for elucidating the

  17. Polyphasic characterization and genetic relatedness of low-virulence and virulent Listeria monocytogenes isolates

    PubMed Central

    2012-01-01

    Background: Currently, food regulatory authorities consider all Listeria monocytogenes isolates as equally virulent. However, an increasing number of studies demonstrate extensive variations in virulence and pathogenicity of L. monocytogenes strains. Up to now, there is no comprehensive overview of the population genetic structure of L. monocytogenes taking into account virulence level. We have previously demonstrated that different low-virulence strains exhibit the same mutations in virulence genes, suggesting that they could have common evolutionary pathways. New low-virulence strains were identified and assigned to phenotypic and genotypic Groups using cluster analysis. Pulsed-field gel electrophoresis, virulence gene sequencing and multi-locus sequence typing analyses were performed to study the genetic relatedness and the population structure between the studied low-virulence isolates and virulent strains. Results: These methods showed that low-virulence strains are widely distributed in the two major lineages, but some are also clustered according to their genetic mutations. These analyses showed that low-virulence strains initially grouped according to their lineage, then according to their serotypes, after which they lost their virulence, suggesting a relatively recent emergence. Conclusions: Loss of virulence in lineage II strains was related to point mutation in a few virulence genes (prfA, inlA, inlB, plcA). These strains thus form a tightly clustered, monophyletic group with limited diversity. In contrast, low-virulence strains of lineage I were more dispersed among the virulent strains and the origin of their loss of virulence has not been identified yet, even if some strains exhibited different mutations in prfA or inlA. PMID:23267677

  18. Corpus annotation for mining biomedical events from literature

    PubMed Central

    Kim, Jin-Dong; Ohta, Tomoko; Tsujii, Jun'ichi

    2008-01-01

    Background: Advanced Text Mining (TM) such as semantic enrichment of papers, event or relation extraction, and intelligent Question Answering have increasingly attracted attention in the bio-medical domain. For such attempts to succeed, text annotation from the biological point of view is indispensable. However, due to the complexity of the task, semantic annotation has never been tried on a large scale, apart from relatively simple term annotation. Results: We have completed a new type of semantic annotation, event annotation, which is an addition to the existing annotations in the GENIA corpus. The corpus has already been annotated with POS (Parts of Speech), syntactic trees, terms, etc. The new annotation was made on half of the GENIA corpus, consisting of 1,000 Medline abstracts. It contains 9,372 sentences in which 36,114 events are identified. The major challenges during event annotation were (1) to design a scheme of annotation which meets specific requirements of text annotation, (2) to achieve biology-oriented annotation which reflects biologists' interpretation of text, and (3) to ensure the homogeneity of annotation quality across annotators. To meet these challenges, we introduced new concepts such as Single-facet Annotation and Semantic Typing, which have collectively contributed to successful completion of a large scale annotation. Conclusion: The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain. PMID:18182099

  19. Annotating images by mining image search results.

    PubMed

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged: one is to map the high-dimensional image visual features into hash codes, the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and-conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.
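
    One common way to map high-dimensional features into compact hash codes is to take the sign of a few random projections, so that similar vectors tend to differ in few bits. The sketch below uses that technique as a stand-in; the abstract does not say which hashing scheme the authors used, and all names and parameters here are hypothetical.

```python
import random


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes in a dim-dimensional feature space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(features, projections):
    """Pack one bit per hyperplane: 1 if the feature vector lies on its positive side."""
    code = 0
    for plane in projections:
        dot = sum(f * p for f, p in zip(features, plane))
        code = (code << 1) | (1 if dot > 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


planes = make_projections(dim=4, n_bits=16)
query = [0.9, 0.1, 0.0, 0.3]
near = [0.8, 0.2, 0.0, 0.3]
opposite = [-x for x in query]

print(hamming(hash_code(query, planes), hash_code(near, planes)))
print(hamming(hash_code(query, planes), hash_code(opposite, planes)))
```

    Comparing integer codes by Hamming distance replaces floating-point distance computations with a single XOR and a bit count, which is what makes this kind of lookup fast.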

  20. Helicobacter pylori virulence and cancer pathogenesis.

    PubMed

    Yamaoka, Yoshio; Graham, David Y

    2014-06-01

    Helicobacter pylori is a human gastric pathogen that causes chronic and progressive gastric mucosal inflammation and is responsible for the gastric inflammation-associated diseases, gastric cancer and peptic ulcer disease. Specific outcomes reflect the interplay between host-, environmental- and bacterial-specific factors. Progress in understanding putative virulence factors in disease pathogenesis has been limited and many false leads have consumed scarce resources. Few in vitro-in vivo correlations or translational applications have proved clinically relevant. Reported virulence factor-related outcomes reflect differences in relative risk of disease rather than specificity for any specific outcome. Studies of individual virulence factor associations have provided conflicting results. Since virulence factors are linked, studies of groups of putative virulence factors are needed to provide clinically useful information. Here, the authors discuss the progress made in understanding the role of H. pylori virulence factors CagA, vacuolating cytotoxin, OipA and DupA in disease pathogenesis and provide suggestions for future studies.

  1. AnnotCompute: annotation-based exploration and meta-analysis of genomics experiments.

    PubMed

    Zheng, Jie; Stoyanovich, Julia; Manduchi, Elisabetta; Liu, Junmin; Stoeckert, Christian J

    2011-01-01

    The ever-increasing scale of biological data sets, particularly those arising in the context of high-throughput technologies, requires the development of rich data exploration tools. In this article, we present AnnotCompute, an information discovery platform for repositories of functional genomics experiments such as ArrayExpress. Our system leverages semantic annotations of functional genomics experiments with controlled vocabulary and ontology terms, such as those from the MGED Ontology, to compute conceptual dissimilarities between pairs of experiments. These dissimilarities are then used to support two types of exploratory analysis: clustering and query-by-example. We show that our proposed dissimilarity measures correspond to a user's intuition about conceptual dissimilarity, and can be used to support effective query-by-example. We also evaluate the quality of clustering based on these measures. While AnnotCompute can support a richer data exploration experience, its effectiveness is limited in some cases, due to the quality of available annotations. Nonetheless, tools such as AnnotCompute may provide an incentive for richer annotations of experiments. Database URL: http://www.cbil.upenn.edu/annotCompute/
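
    A simple way to picture annotation-based dissimilarity and query-by-example is the sketch below, which treats each experiment as a set of annotation terms and uses Jaccard distance. This measure is an assumption chosen for brevity and is not claimed to be the one AnnotCompute implements; the experiment IDs, terms, and function names are made up.

```python
def jaccard_dissimilarity(a, b):
    """1 minus the fraction of annotation terms that two experiments share."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two unannotated experiments are treated as identical
    return 1 - len(a & b) / len(union)


def query_by_example(query_id, experiments, k=2):
    """Return the k experiment IDs least dissimilar to the query experiment."""
    query = experiments[query_id]
    ranked = sorted(
        (jaccard_dissimilarity(query, terms), eid)
        for eid, terms in experiments.items()
        if eid != query_id
    )
    return [eid for _, eid in ranked[:k]]


# Made-up experiments annotated with illustrative terms.
EXPERIMENTS = {
    "exp1": {"Homo sapiens", "liver", "time course"},
    "exp2": {"Homo sapiens", "liver", "dose response"},
    "exp3": {"Mus musculus", "brain", "time course"},
    "exp4": {"Homo sapiens", "liver", "time course", "RNA"},
}

print(query_by_example("exp1", EXPERIMENTS))
```

    The same pairwise dissimilarities could be passed to any standard clustering routine, which is why sparse or inconsistent annotations degrade both uses at once.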

  2. GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations

    PubMed Central

    Paila, Umadevi; Chapman, Brad A.; Kirchner, Rory; Quinlan, Aaron R.

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics. PMID:23874191

  3. GEMINI: integrative exploration of genetic variation and genome annotations.

    PubMed

    Paila, Umadevi; Chapman, Brad A; Kirchner, Rory; Quinlan, Aaron R

    2013-01-01

    Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics.

  4. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

    PubMed Central

    Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.

    2014-01-01

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599

  5. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

    PubMed

    Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

    2014-07-01

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.

  6. Solar Tutorial and Annotation Resource (STAR)

    NASA Astrophysics Data System (ADS)

    Showalter, C.; Rex, R.; Hurlburt, N. E.; Zita, E. J.

    2009-12-01

    We have written a software suite designed to facilitate solar data analysis by scientists, students, and the public, anticipating enormous datasets from future instruments. Our “STAR” suite includes an interactive learning section explaining 15 classes of solar events. Users learn software tools that exploit humans’ superior ability (over computers) to identify many events. Annotation tools include time slice generation to quantify loop oscillations, the interpolation of event shapes using natural cubic splines (for loops, sigmoids, and filaments) and closed cubic splines (for coronal holes). Learning these tools in an environment where examples are provided prepares new users to comfortably utilize annotation software with new data. Upon completion of our tutorial, users are presented with media of various solar events and asked to identify and annotate the images, to test their mastery of the system. Goals of the project include public input into the data analysis of very large datasets from future solar satellites, and increased public interest and knowledge about the Sun. In 2010, the Solar Dynamics Observatory (SDO) will be launched into orbit. SDO’s advancements in solar telescope technology will generate a terabyte per day of high-quality data, requiring innovation in data management. While major projects develop automated feature recognition software, so that computers can complete much of the initial event tagging and analysis, still, that software cannot annotate features such as sigmoids, coronal magnetic loops, coronal dimming, etc., due to large amounts of data concentrated in relatively small areas. Previously, solar physicists manually annotated these features, but with the imminent influx of data it is unrealistic to expect specialized researchers to examine every image that computers cannot fully process. A new approach is needed to efficiently process these data. Providing analysis tools and data access to students and the public have proven

  7. RNA-eXpress annotates novel transcript features in RNA-seq data.

    PubMed

    Forster, Samuel C; Finkel, Alexander M; Gould, Jodee A; Hertzog, Paul J

    2013-03-15

    Next-generation sequencing is rapidly becoming the approach of choice for transcriptional analysis experiments. Substantial advances have been achieved in computational approaches to support these technologies. These approaches typically rely on existing transcript annotations, introducing a bias towards known genes, require specific experimental design and computational resources, or focus only on identification of splice variants (ignoring other biologically relevant transcribed features contained within the data that may be important for downstream analysis). Biologically relevant transcribed features also include large and small non-coding RNA, new transcription start sites, alternative promoters, RNA editing and processing of coding transcripts. Also, many existing solutions lack accessible interfaces required for wide scale adoption. We present a user-friendly, rapid and computation-efficient feature annotation framework (RNA-eXpress) that enables identification of transcripts and other genomic and transcriptional features independently of current annotations. RNA-eXpress accepts mapped reads in the standard binary alignment (BAM) format and produces a study-specific feature annotation in GTF format, comparison statistics, sequence extraction and feature counts. The framework is designed to be easily accessible while allowing advanced users to integrate new feature-identification algorithms through simple class extension, thus facilitating expansion to novel feature types or identification of study-specific feature types.

  8. Corrected Genome Annotations Reveal Gene Loss and Antibiotic Resistance as Drivers in the Fitness Evolution of Salmonella enterica Serovar Typhimurium.

    PubMed

    Paul, Sandip; Sokurenko, Evgeni V; Chattopadhyay, Sujay

    2016-12-01

    Horizontal acquisition of novel chromosomal genes is considered to be a key process in the evolution of bacterial pathogens. However, the identification of gene presence or absence could be hindered by the inconsistencies in bacterial genome annotations. Here, we performed a cross-annotation of omnipresent core and mosaic accessory genes in the chromosome of Salmonella enterica serovar Typhimurium across a total of 20 fully assembled genomes deposited into GenBank. Cross-annotation resulted in a 32% increase in the number of core genes and a 3-fold decrease in the number of genes identified as mosaic genes (i.e., genes present in some strains only) by the original annotation. Of the remaining noncore genes, the vast majority were prophage genes, and 255 of the nonphage genes were actually of core origin but lost in some strains upon the emergence of the S. Typhimurium serovar, suggesting that the chromosomal portion of the S. Typhimurium genome acquired a very limited number of novel genes other than prophages. Only horizontally acquired nonphage genes related to bacterial fitness or virulence were found in four recently sequenced isolates, all located on three different genomic islands that harbor multidrug resistance determinants. Thus, the extensive use of antimicrobials could be the main selection force behind the new fitness gene acquisition and the emergence of novel Salmonella pathotypes.

  9. ppGpp Conjures Bacterial Virulence

    PubMed Central

    Dalebroux, Zachary D.; Svensson, Sarah L.; Gaynor, Erin C.; Swanson, Michele S.

    2010-01-01

    Summary: Like for all microbes, the goal of every pathogen is to survive and replicate. However, to overcome the formidable defenses of their hosts, pathogens are also endowed with traits commonly associated with virulence, such as surface attachment, cell or tissue invasion, and transmission. Numerous pathogens couple their specific virulence pathways with more general adaptations, like stress resistance, by integrating dedicated regulators with global signaling networks. In particular, many of nature's most dreaded bacteria rely on nucleotide alarmones to cue metabolic disturbances and coordinate survival and virulence programs. Here we discuss how components of the stringent response contribute to the virulence of a wide variety of pathogenic bacteria. PMID:20508246

  10. Automating Ontological Annotation with WordNet

    SciTech Connect

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob L.; Hohimer, Ryan E.; White, Amanda M.

    2006-01-22

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  11. Ontological Annotation with WordNet

    SciTech Connect

    Sanfilippo, Antonio P.; Tratz, Stephen C.; Gregory, Michelle L.; Chappell, Alan R.; Whitney, Paul D.; Posse, Christian; Paulson, Patrick R.; Baddeley, Bob; Hohimer, Ryan E.; White, Amanda M.

    2006-06-06

    Semantic Web applications require robust and accurate annotation tools that are capable of automating the assignment of ontological classes to words in naturally occurring text (ontological annotation). Most current ontologies do not include rich lexical databases and are therefore not easily integrated with word sense disambiguation algorithms that are needed to automate ontological annotation. WordNet provides a potentially ideal solution to this problem as it offers a highly structured lexical conceptual representation that has been extensively used to develop word sense disambiguation algorithms. However, WordNet has not been designed as an ontology, and while it can be easily turned into one, the result of doing this would present users with serious practical limitations due to the great number of concepts (synonym sets) it contains. Moreover, mapping WordNet to an existing ontology may be difficult and requires substantial labor. We propose to overcome these limitations by developing an analytical platform that (1) provides a WordNet-based ontology offering a manageable and yet comprehensive set of concept classes, (2) leverages the lexical richness of WordNet to give an extensive characterization of concept class in terms of lexical instances, and (3) integrates a class recognition algorithm that automates the assignment of concept classes to words in naturally occurring text. The ensuing framework makes available an ontological annotation platform that can be effectively integrated with intelligence analysis systems to facilitate evidence marshaling and sustain the creation and validation of inference models.

  12. Death Education. An Annotated Bibliography for Teachers.

    ERIC Educational Resources Information Center

    Lockard, Bonnie Elam, Comp.

    This annotated bibliography contains resources for teachers to use in the preparation of curricula for death education for children in grades K-12. Section one contains printed resources for teachers. Many of these resources offer more comprehensive guides for death education and materials. Section two describes 70 books for students in grade K-12…

  13. Food for Thought: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Bennett, Susan G., Ed.

    Most of the 24 books reviewed in this annotated bibliography concern writing and are recent publications (1980-1985). Titles and authors are as follows: "Teacher" (Sylvia Ashton-Warner); "What Did I Write? Beginning Writing Behavior" (Marie M. Clay); "Composing: Writing as a Self-Creating Process" (William E. Coles);…

  14. Revenue Producing Athletes: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Ervin, Leroy; And Others

    An annotated bibliography on revenue producing sports is presented, with attention to: Proposition 48, exploitation of athletes, legal proceedings, research related to athletes and academic performance, psychological characteristics of athletes, and counseling programs for athletes. Introductions to each of the six topics are included. The section…

  15. Statistical mechanics of ontology based annotations

    NASA Astrophysics Data System (ADS)

    Hoyle, David C.; Brass, Andrew

    2016-01-01

    We present a statistical mechanical theory of the process of annotating an object with terms selected from an ontology. The term selection process is formulated as an ideal lattice gas model, but in a highly structured inhomogeneous field. The model enables us to explain patterns recently observed in real-world annotation data sets, in terms of the underlying graph structure of the ontology. By relating the external field strengths to the information content of each node in the ontology graph, the statistical mechanical model also allows us to propose a number of practical metrics for assessing the quality of both the ontology, and the annotations that arise from its use. Using the statistical mechanical formalism we also study an ensemble of ontologies of differing size and complexity; an analysis not readily performed using real data alone. Focusing on regular tree ontology graphs we uncover a rich set of scaling laws describing the growth in the optimal ontology size as the number of objects being annotated increases. In doing so we provide a further possible measure for assessment of ontologies.

  16. International, Intercultural Communication: Selected Annotated Bibliography.

    ERIC Educational Resources Information Center

    Casmir, Fred L.

    Designed to assist the student, scholar or practitioner interested in the role of culture in communications and human organization, this annotated bibliography cites sources since 1972 on intercultural and international communication. The 78 references are organized as follows: (1) books (including general handbooks for training sojourners or…

  17. Communication and Culture: Five Annotated Bibliographies.

    ERIC Educational Resources Information Center

    Gebhard, Jerry G.; Graber, Elizabeth; Grote, Ellen; Miller, Patricia; Thongrin, Saneh; Rodriguez, Xinia

    The five annotated bibliographies, developed by students as a requirement for a graduate-level course in cross-cultural communication, include: "Teaching Cross-Cultural Capability in the ESL/EFL Classroom" (Ellen Grote); "To Have and To Hold: Marriage Connections Across Cultures" (Elizabeth Graber); "Rhetoric and Ideology in Compositions: L1 and…

  18. La Mujer Chicana: An Annotated Bibliography, 1976.

    ERIC Educational Resources Information Center

    Chapa, Evey, Ed.; And Others

    Intended to provide interested persons, researchers, and educators with information about "la mujer Chicana", this annotated bibliography cites 320 materials published between 1916 and 1975, with the majority being between 1960 and 1975. The 12 sections cover the following subject areas: Chicana publications; Chicana feminism and…

  19. Annotated Bibliography of Literature on Narcotic Addiction.

    ERIC Educational Resources Information Center

    Bowden, R. Renee

    Nearly 150 abstracts have been included in this annotated bibliography; its purpose has been to scan the voluminous number of documents on the problem of drug addiction in order to summarize the present state of knowledge on narcotic addiction and on methods for its treatment and control. The literature reviewed has been divided into the following…

  20. Project for Global Education: Annotated Bibliography.

    ERIC Educational Resources Information Center

    Institute for World Order, New York, NY.

    Over 260 books, textbooks, articles, pamphlets, periodicals, films, and multi-media packages appropriate for the analysis of global issues at the college level are briefly annotated. Entries include classic books and articles as well as a number of recent (1976-1981) publications. The purpose is to assist students and educators in developing a…

  1. Teleconferencing, an annotated bibliography, volume 3

    NASA Technical Reports Server (NTRS)

    Shervis, K.

    1971-01-01

    In this annotated and indexed listing of works on teleconferencing, emphasis has been placed upon teleconferencing as real-time, two way audio communication with or without visual aids. However, works on the use of television in two-way or multiway nets, data transmission, regional communications networks and on telecommunications in general are also included.

  2. Health Economics Research: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Dillard, Carole D.; And Others

    This annotated bibliography lists books and journal articles published since 1976 which deal with health economics and which are based on health services research supported by the National Center for Health Services Research (NCHSR). Articles prepared by NCHSR staff are listed as intramural. All other articles cite the NCHSR grant or contract…

  3. Children and Poetry: A Selective, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Haviland, Virginia; Smith, William Jay

    This annotated bibliography of over 120 books was compiled to call attention to poetry for children that is both pleasing and rewarding. Omitted are traditional materials such as Mother Goose rhymes, textbooks, and collections designed especially for the classroom. Sample illustrations from the books noted and lines from poems are reproduced…

  4. READABILITY AND READING--AN ANNOTATED BIBLIOGRAPHY.

    ERIC Educational Resources Information Center

    DALE, EDGAR; SEELS, BARBARA

    THIS ANNOTATED BIBLIOGRAPHY COVERS THE FIELD OF READABILITY AND READING. THE SELECTED WORKS ARE ORGANIZED INTO NINE SECTIONS--(1) GENERAL REFERENCES ON READABILITY, (2) MEASURING OF READABILITY, (3) READABILITY AND SENTENCE STRUCTURE, (4) READABILITY AND VOCABULARY, (5) READABILITY AND LITERARY STYLE, (6) READABILITY IN SUBJECT AREA MATERIALS, (7)…

  5. An Annotated Journalism Bibliography; 1958-1968.

    ERIC Educational Resources Information Center

    Price, Warren C.; Pickett, Calder M.

    Annotated entries of 2172 books in journalism which have appeared between 1958 and 1968 comprise this volume. Materials are listed alphabetically, by author, and an index of names and subject headings is provided. General categories of entries are biographies, narratives of journalists at work, anthologies of journalistic writing, ethical and…

  6. A Semi-Annotated Bibliography: The Wabanakis.

    ERIC Educational Resources Information Center

    Braber, Lee; Dean, Jacquelyn M.

    A companion to the booklet, "A Teacher Manual on Native Americans: The Wabanakis," the semi-annotated bibliography consisting of 235 citations may be used by people who wish to have access to information and research (1890-1982) done about the tribes on the New England and Maritime shores, including the Wabanaki Confederacy composed of…

  7. Labor and Migration; An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Brooks, Thomas R.

    This annotated bibliography is intended to contribute toward an understanding of labor and migration, both of which have helped to shape our nation. A total of 131 works, including a few periodicals and newspapers, focus on immigration and internal migration as it affects organized and unorganized labor. (BH)

  8. Annotated bibliography of psychomotor testing. Technical report

    SciTech Connect

    Ervin, C.

    1987-03-01

    An annotated bibliography of 67 publications in the field of psychomotor testing has been prepared. The collection includes technical reports, journal articles, papers presented at scientific meetings, books and conference proceedings. The publications were assembled as preliminary work in the development of a dexterity test battery designed to measure the effects of chemical-defense-treatment drugs.

  9. Annotated Bibliography of EDGE2D Use

    SciTech Connect

    J.D. Strachan and G. Corrigan

    2005-06-24

    This annotated bibliography is intended to help EDGE2D users, and particularly new users, find existing published literature that has used EDGE2D. Our idea is that a person can find existing studies which may relate to his intended use, as well as gain ideas about other possible applications by scanning the attached tables.

  10. Revenue Producing Athletes: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Ervin, Leroy; And Others

    An annotated bibliography on revenue producing sports is presented, with attention to: Proposition 48, exploitation of athletes, legal proceedings, research related to athletes and academic performance, psychological characteristics of athletes, and counseling programs for athletes. Introductions to each of the six topics are included. The section…

  11. Higher Education Literature: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    White, Jane N., Ed.; Burnett, Collins W., Ed.

    An annotated bibliography on higher education is presented that is limited to programs and phenomena in two- and four-year accredited degree-granting colleges and universities. The following sections and topics are covered: (1) Historical Background and Nature and Scope of American Higher Education (ancient, medieval, and U.S. education,…

  12. Evaluating Image Browsers Using Structured Annotation.

    ERIC Educational Resources Information Center

    Muller, Wolfgang; Marchand-Maillet, Stephane; Muller, Henning; Squire, David McG.; Pun, Thierry

    2001-01-01

    Addresses the problem of benchmarking image browsers. Existence of different search paradigms for image browsers makes it difficult to compare them. Currently, the only admissible evaluation method involves conducting large-scale user studies. An automatic image browser benchmark is proposed that uses structured text annotation of the image…

  13. Communication and Politics: A Selected, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Kaid, Lynda Lee; And Others

    Noting that the study of communication in political settings is an increasingly popular and important area of teaching and research in many disciplines, this 51-item annotated bibliography reflects the interdisciplinary nature of the field and is designed to incorporate varying approaches to the subject matter. With few exceptions, the books and…

  14. Values and Minorities: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Vasquez, James A.; And Others

    This annotated bibliography on values of ethnic minorities in the United States contains one hundred entries from various sources, mostly research and educational journals. It is intended to assist researchers, teachers, school administrators, and students to understand how some American minorities function within their own cultures and societies.…

  15. Suggested Books for Children: An Annotated Bibliography

    ERIC Educational Resources Information Center

    NHSA Dialog, 2008

    2008-01-01

    This article provides an annotated bibliography of various children's books. It includes listings of books that illustrate the dynamic relationships within the natural environment, economic context, racial and cultural identities, cross-group similarities and differences, gender, different abilities and stories of injustice and resistance.

  16. Auditory Handicaps and Reading: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Geoffrion, Leo D., Comp.; Schuster, Karen E., Comp.

    This annotated bibliography on the reading achievement of the deaf is designed to aid those who wish to learn more about how children with severe auditory handicaps read. The various sections focus on the severity of the reading deficit of deaf students, the findings of basic research on how they read, and some of the instructional approaches…

  17. Case Studies in Reading: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Trela, Thaddeus M., Comp.; Becker, George J., Comp.

    Descriptions of individual diagnosis and remediation of reading problems experienced by students at all levels are included in this annotated bibliography. Included are books, texts having case study sections, and journal reports which together comprise useful sources of case studies of reading disabilities. An opening section lists nine "first…

  18. Early Childhood Education: A Selected, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Yan, Rose, Comp.

    This bibliography on early childhood (toddler to about age six) and early childhood education is divided into three main sections: annotations of monographs and selected papers, notes on journal articles, and abstracts of research reports. These are followed by a brief section on nonconventional (usually mixed media) materials on early childhood…

  19. Parenting: An Annotated Bibliography, 1965-1987.

    ERIC Educational Resources Information Center

    Feinberg, Sandra; And Others

    This annotated bibliography on parenting resources is designed to assist parents and those who work with them to locate books on the many and complex topics that affect family life. The materials included encompass the various stages of parenting, from pregnancy and childbirth through the parenting of adult children. The many topics covered…

  20. Research: Annotated Bibliography of New Canadian Studies.

    ERIC Educational Resources Information Center

    Toronto Board of Education (Ontario). Research Dept.

    This annotated bibliography of twenty-one research reports that provide knowledge about various cultures and educational experiences of the major ethnic groups in the Toronto schools is designed to present information for not only special English teachers, but other school personnel as well. The bibliography consists of reports that aim to: 1)…

  1. Skin Cancer Education Materials: Selected Annotations.

    ERIC Educational Resources Information Center

    National Cancer Inst. (NIH), Bethesda, MD.

    This annotated bibliography presents 85 entries on a variety of approaches to cancer education. The entries are grouped under three broad headings, two of which contain smaller sub-divisions. The first heading, Public Education, contains prevention and general information, and non-print materials. The second heading, Professional Education,…

  2. Work Adjustment Competencies: Annotated Resources for Training.

    ERIC Educational Resources Information Center

    Menz, Frederick E., Ed.; And Others

    This resource manual is intended to be used by instructors and trainers in both preservice and short-term training as a tool to assist in designing new offerings and redesigning old offerings for work adjustment trainees. Annotations are provided of resources in 19 work adjustment competency areas, including the following: specific marketable…

  3. An Annotated Bibliography for Instructional Systems Development

    DTIC Science & Technology

    1979-08-01

    Technical Report 426 AN ANNOTATED...that task analysis is an artistic, creative, synergistic, multi-purpose, problem solving, global, interpersonal, political, and cognitive task. They

  4. Human object annotation for surveillance video forensics

    NASA Astrophysics Data System (ADS)

    Fraz, Muhammad; Zafar, Iffat; Tzanidou, Giounona; Edirisinghe, Eran A.; Sarfraz, Muhammad Saquib

    2013-10-01

    A system that can automatically annotate surveillance video in a manner useful for locating a person with a given description of clothing is presented. Each human is annotated based on two appearance features: primary colors of clothes and the presence of text/logos on clothes. The annotation occurs after a robust foreground extraction stage employing a modified Gaussian mixture model-based approach. The proposed pipeline consists of a preprocessing stage where color appearance of an image is improved using a color constancy algorithm. In order to annotate color information for human clothes, we use the color histogram feature in HSV space and find local maxima to extract dominant colors for different parts of a segmented human object. To detect text/logos on clothes, we begin with the extraction of connected components of enhanced horizontal, vertical, and diagonal edges in the frames. These candidate regions are classified as text or nontext on the basis of their local energy-based shape histogram features. Further, to detect humans, a novel technique has been proposed that uses contourlet transform-based local binary pattern (CLBP) features. In the proposed method, we extract the uniform direction invariant LBP feature descriptor for contourlet transformed high-pass subimages from vertical and diagonal directional bands. In the final stage, extracted CLBP descriptors are classified by a trained support vector machine. Experimental results illustrate the superiority of our method on large-scale surveillance video data.

  5. Mulligan Concept manual therapy: standardizing annotation.

    PubMed

    McDowell, Jillian Marie; Johnson, Gillian Margaret; Hetherington, Barbara Helen

    2014-10-01

    Quality technique documentation is integral to the practice of manual therapy, ensuring uniform application and reproducibility of treatment. Manual therapy techniques are described by annotations utilizing a range of acronyms, abbreviations and universal terminology based on biomechanical and anatomical concepts. The various combinations of therapist and patient generated forces utilized in a variety of weight-bearing positions, which are synonymous with the Mulligan Concept, challenge practitioners' existing annotational skills. An annotation framework with recording rules adapted to the Mulligan Concept is proposed in which the abbreviations incorporate established manual therapy tenets and are detailed in the following sequence: starting position, side, joint/s, method of application, glide/s, Mulligan technique, movement (or function), whether an assistant is used, overpressure (and by whom) and numbers of repetitions or time and sets. Therapist or patient application of overpressure and utilization of treatment belts or manual techniques must be recorded to capture the complete description. The adoption of the Mulligan Concept annotation framework in this way for documentation purposes will provide uniformity and clarity of information transfer for the future purposes of teaching, clinical practice and audit for its practitioners.

  6. An Annotated Bibliography of Nonsexist Resources.

    ERIC Educational Resources Information Center

    Miles Coll., Eutaw, AL. West Alabama Curriculum and Materials Resource Center.

    The result of a thorough search, review, and compilation of resources on women's equity, the annotated bibliography represents a sample of print materials, games and kits, photos and posters, and audiovisual aids now available on sexism that should prove useful to counselors, instructors, school administrators, parents, and elementary and…

  7. Postsecondary Peer Cooperative Learning Programs: Annotated Bibliography

    ERIC Educational Resources Information Center

    Arendale, David R., Comp.

    2005-01-01

    Purpose: This annotated bibliography is focused intentionally on postsecondary peer cooperative learning programs that increase student achievement. Peer learning has been popular in education for decades. As both a pedagogy and learning strategy, it has been frequently adapted for a wide range of academic content areas at the elementary,…

  8. Children's Theatre: A Selected and Annotated Bibliography.

    ERIC Educational Resources Information Center

    Van Tassel, Wesley

    One hundred studies of children's theatre are annotated in this bibliography. The entries are listed alphabetically within three areas: "History of Children's Theatre and Specific Children's Theatres" (35 entries); "Theory, Criticism, Directories, and Bibliographies" (42 entries); and "Studies of Individual Plays, Play…

  9. Organizational and Intercultural Communication: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Constantinides, Helen; St. Amant, Kirk; Kampf, Connie

    2001-01-01

    Presents a 27-item annotated bibliography that overviews theories of organization from the viewpoint of culture, using five themes of organizational research as a framework. Notes that each section introduces specific theories of international, intercultural, or organizational communication, building upon them through a series of related articles,…

  10. SNAD: Sequence Name Annotation-based Designer.

    PubMed

    Sidorov, Igor A; Reshetov, Denis A; Gorbalenya, Alexander E

    2009-08-14

    A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually, which may be a tedious exercise prone to mistakes and omissions. Here we introduce SNAD (Sequence Name Annotation-based Designer), which mediates automatic conversion of sequence UIDs (associated with a multiple alignment or phylogenetic tree, or supplied as a plain text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between quality of sequence annotation, and efficiency of communication and knowledge dissemination among researchers.
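
    The idea of template-directed conversion can be sketched in a few lines. This is an illustration only and does not reproduce SNAD's template syntax or its database access: the UIDs, the annotation fields, and the `rename` helper below are all hypothetical, and a small in-memory dictionary stands in for entries fetched from an external database.

```python
from string import Template

# Hypothetical annotation records keyed by sequence UID; in practice these
# fields would come from the cognate entries of an external database.
ANNOTATION = {
    "UID0001": {"organism": "Example virus A", "protein": "polymerase", "strain": "X1"},
    "UID0002": {"organism": "Example virus B", "protein": "capsid", "strain": "Y2"},
}


def rename(uids, template, annotation=ANNOTATION):
    """Map each UID to a readable name built from a user-defined template."""
    tmpl = Template(template)
    names = {}
    for uid in uids:
        record = annotation.get(uid)
        # Fall back to the bare UID when no annotation is available.
        names[uid] = tmpl.safe_substitute(record, uid=uid) if record else uid
    return names


print(rename(["UID0001", "UID9999"], "${organism}_${strain}|${uid}"))
```

    Keeping the template separate from the records means the same annotation can be rendered in different naming styles, for example for tree labels and for alignment headers.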

  11. Elementary Language Arts: Authorized Resources Annotated List.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton. Curriculum Standards Branch.

    This comprehensive, annotated resource list is designed to assist educators in selecting language arts resources for the elementary classroom. The authorized resources are listed under two main headings: series and individual resources. The series are listed alphabetically under each grade level. The individual resources are often authorized…

  12. Elementary Science: Authorized Resources Annotated List.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton. Curriculum Standards Branch.

    This comprehensive annotated resource list is designed to assist in the selection of science resources for elementary classrooms. This guide is organized by grade level, and within that heading by series and topic. Following the lists for individual grades is a list of elementary science resources, which are not topic specific, authorized for…

  13. People: Annotated Multiethnic Bibliography K-12.

    ERIC Educational Resources Information Center

    Gilmore, Dolores D., Comp.; Petrie, Kenneth, Comp.

    This annotated bibliography has been compiled to assist personnel in the selection of multiethnic media for schools. The bibliography includes sections entitled "Asian Americans," "Jewish Americans," "Mexican Americans," "Native Americans," "Puerto Rican Americans," "Other Hyphenated Americans," and "All Americans (Multiethnic)." The entries for the…

  14. Skin Cancer Education Materials: Selected Annotations.

    ERIC Educational Resources Information Center

    National Cancer Inst. (NIH), Bethesda, MD.

    This annotated bibliography presents 85 entries on a variety of approaches to cancer education. The entries are grouped under three broad headings, two of which contain smaller sub-divisions. The first heading, Public Education, contains prevention and general information, and non-print materials. The second heading, Professional Education,…

  15. Text-mining assisted regulatory annotation

    PubMed Central

    Aerts, Stein; Haeussler, Maximilian; van Vooren, Steven; Griffith, Obi L; Hulpiau, Paco; Jones, Steven JM; Montgomery, Stephen B; Bergman, Casey M

    2008-01-01

    Background: Decoding transcriptional regulatory networks and the genomic cis-regulatory logic implemented in their control nodes is a fundamental challenge in genome biology. High-throughput computational and experimental analyses of regulatory networks and sequences rely heavily on positive control data from prior small-scale experiments, but the vast majority of previously discovered regulatory data remains locked in the biomedical literature. Results: We develop text-mining strategies to identify relevant publications and extract sequence information to assist the regulatory annotation process. Using a vector space model to identify Medline abstracts from papers likely to have high cis-regulatory content, we demonstrate that document relevance ranking can assist the curation of transcriptional regulatory networks and estimate that, minimally, 30,000 papers harbor unannotated cis-regulatory data. In addition, we show that DNA sequences can be extracted from primary text with high cis-regulatory content and mapped to genome sequences as a means of identifying the location, organism and target gene information that is critical to the cis-regulatory annotation process. Conclusion: Our results demonstrate that text-mining technologies can be successfully integrated with genome annotation systems, thereby increasing the availability of annotated cis-regulatory data needed to catalyze advances in the field of gene regulation. PMID:18271954
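
    Relevance ranking with a vector space model can be sketched as follows. This is a generic TF-IDF and cosine-similarity example, assumed here for illustration; the abstract does not state the exact weighting the authors used, and the function names and sample texts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for whitespace-tokenised documents."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an illustrative choice).
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return document indices ordered from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query.lower().split()).items()}
    scores = [cosine(q, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


abstracts = [
    "enhancer binding site upstream of promoter",
    "clinical trial of a new drug",
    "transcription factor binding site in enhancer",
]
print(rank("enhancer binding site", abstracts))
```

    In a curation setting the query vector would more plausibly be built from abstracts already known to contain cis-regulatory data rather than from a short phrase.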

  16. Male-Female Sexuality: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Wilson, Janice

    This annotated bibliography contains over 500 sources on the historical and contemporary development and expression of male and female sexuality. There are 68 topic headings which provide easy access for subject areas. A major portion of the bibliography is devoted to contemporary male-female sexuality. These materials consist of research findings…

  17. Project for Global Education: Annotated Bibliography.

    ERIC Educational Resources Information Center

    Institute for World Order, New York, NY.

    Over 260 books, textbooks, articles, pamphlets, periodicals, films, and multi-media packages appropriate for the analysis of global issues at the college level are briefly annotated. Entries include classic books and articles as well as a number of recent (1976-1981) publications. The purpose is to assist students and educators in developing a…

  18. The Community; A Classified, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Payne, Raymond, Comp.; Bailey, Wilfrid C., Comp.

    This is a classified retrospective bibliography of 839 items on the community (about 140 are annotated) from rural sociology and agricultural economics departments and sections, agricultural experiment stations, extension services, and related agencies. Items are categorized as follows: bibliography and reference lists; location and delineation of…

  19. Staff Differentiation; An Annotated Bibliography Addendum.

    ERIC Educational Resources Information Center

    Marin County Public Schools, Corte Madera, CA.

    Differentiated staffing has emphasized development of teacher leadership roles, the importance of shared decision making in schools, and the constructive ways in which paid instructional aides and volunteer aides can support the professional teaching staff. Eighteen annotated bibliographic citations concerning the various aspects of differentiated…

  20. Children and Poetry: A Selective, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Haviland, Virginia; Smith, William Jay

    This annotated bibliography of over 120 books was compiled to call attention to poetry for children that is both pleasing and rewarding. Omitted are traditional materials such as Mother Goose rhymes, textbooks, and collections designed especially for the classroom. Sample illustrations from the books noted and lines from poems are reproduced…

  1. Health Communication and Literacy: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Beveridge, Jennifer

    This annotated bibliography lists publications and World Wide Web sites dealing with health communication and literacy. The 51 publications, which were all published between 1982 and 1998, contain information about and/or for use in the following areas: assessment, assessment tools, elderly adults, empowerment, maternal and child health, patient…

  2. An Annotated Bibliography of Latino Educational Research

    ERIC Educational Resources Information Center

    Baumann, Paul; Cabrera, Alberto; Swail, Watson Scott

    2007-01-01

    This bibliography lists and provides annotations for 59 recent research studies on a variety of Latino educational issues. Descriptions of the focus of each item, as well as implications for policy and practice, are provided. Items range in publication date from 1993 to 2007. [This document was compiled by the Educational Policy Institute in…

  3. Adolescent Reproductive Behaviour: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    United Nations, New York, NY. Population Div.

    A general overview of the literature on adolescent fertility and closely related issues is provided in this annotated bibliography. Material on the following topics is included: (1) programs related to adolescent pregnancy, contraception, abortion, and births; (2) studies relating socioeconomic characteristics of pregnant adolescents to their…

  5. An Annotated Guide to Contemporary China.

    ERIC Educational Resources Information Center

    National Committee on United States-China Relations, New York, NY.

    Three years after the publication of the first "Annotated Guide to Modern China" this second expanded bibliography of books and periodicals has been published. The intended readership is the non-specialist who desires an introduction to modern China. One section gives other reference works for more extensive study. Several others are…

  6. Sexually Transmitted Diseases: A Selective, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Planned Parenthood Federation of America, Inc., New York, NY. Education Dept.

    This document contains a reference sheet and an annotated bibliography concerned with sexually transmitted diseases (STD). The reference sheet provides a brief, accurate overview of STDs which includes both statistical and background information. The bibliography contains 83 entries, listed alphabetically, that deal with STDs. Books and articles…

  7. The Basic Course: A Selected, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Demo, Penny

    Defining basic speech communication courses as those public speaking, interpersonal, or communication courses that treat fundamental communication concepts, this annotated bibliography reflects the current thought of speech educators on the basic course. The bibliography consists of 27 citations, all of which are drawn from the ERIC database. (SKC)

  8. Document Delivery: An Annotated Selective Bibliography.

    ERIC Educational Resources Information Center

    Khalil, Mounir A.; Katz, Suzanne R.

    1992-01-01

    Presents a selective annotated bibliography of 61 items that deal with topics related to document delivery, including networks; hypertext; interlibrary loan; computer security; electronic publishing; copyright; online catalogs; resource sharing; electronic mail; electronic libraries; optical character recognition; microcomputers; liability issues;…

  9. La Mujer Chicana: An Annotated Bibliography, 1976.

    ERIC Educational Resources Information Center

    Chapa, Evey, Ed.; And Others

    Intended to provide interested persons, researchers, and educators with information about "la mujer Chicana", this annotated bibliography cites 320 materials published between 1916 and 1975, with the majority being between 1960 and 1975. The 12 sections cover the following subject areas: Chicana publications; Chicana feminism and…

  10. Annotated Bibliography of Autism 1943-1983.

    ERIC Educational Resources Information Center

    Tari, Andor J.; And Others

    The annotated bibliography of over 1,200 citations published between 1943 and 1983 is intended as a comprehensive reference guide to the scientific study of infantile autism. After a search of the literature was conducted, the information was organized by format and subject, first for journal articles (19 topics are concerned with general…

  11. Book Reviews, Annotation, and Web Technology.

    ERIC Educational Resources Information Center

    Schulze, Patricia

    From reading texts to annotating web pages, grade 6-8 students rely on group cooperation and individual reading and writing skills in this research project that spans six 50-minute lessons. Student objectives for this project are that they will: read, discuss, and keep a journal on a book in literature circles; understand the elements of and…

  12. Chemical Principles Revisited: Annotating Reaction Equations.

    ERIC Educational Resources Information Center

    Tykodi, R. J.

    1987-01-01

    Urges chemistry teachers to have students annotate the chemical reactions in aqueous solutions that they see in their textbooks and witness in the laboratory. Suggests this will help students recognize the reaction type more readily. Examples are given for gas formation, precipitate formation, redox interaction, acid-base interaction, and…

  13. Annotated Bibliography of English for Special Purposes.

    ERIC Educational Resources Information Center

    Allix, Beverley, Comp.

    This annotated bibliography covers the following types of materials of use to teachers of English for Special Purposes: (1) books, monographs, reports, and conference papers; (2) periodical articles and essays in collections; (3) theses and dissertations; (4) bibliographies; (5) dictionaries; and (6) textbooks in series by publisher. Section (1)…

  14. The Career Education Resource Center Annotated Catalog.

    ERIC Educational Resources Information Center

    Lawhead, Jeanie; And Others

    This catalog provides an annotated list of the career education materials which may be borrowed for previewing from the Career Education Resource Center in Colorado. Covering materials of interest to educators in kindergarten through postsecondary programs, the catalog includes items produced by classroom teachers, commercial publishers, business…

  15. Vision/Visual Perception: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Weintraub, Sam, Comp.; Cowan, Robert J., Comp.

    An update and modification of "Vision-Visual Discrimination" published in 1973, this annotated bibliography contains entries from the annual summaries of research in reading published by the International Reading Association (IRA) since then. The first large section, "Vision," is divided into two subgroups: (1) "Visually…

  16. Middle Level Education: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Totten, Samuel; And Others

    Developed as a reference tool for teachers, administrators, researchers, parents, and others interested in middle level education, this annotated bibliography of 1,757 entries focuses on practical aspects of middle level education and on research related to adolescence and middle level practices. Following an introduction and discussion of…

  17. Science Fiction Criticism: An Annotated Checklist.

    ERIC Educational Resources Information Center

    Clareson, Thomas

    An expansion of the list published in "Extrapolation" between May 1970 and May 1971, this book contains approximately 800 entries of science fiction criticism. Divided into special categories, all items are annotated and explicitly discuss science fiction. The nine categories of science fiction criticism are Literary Studies; Book Reviews; the…

  18. Environment and the Community: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Department of Housing and Urban Development, Washington, DC.

    Three hundred and nine citations of books, reports, and articles dating from 1964 to 1971 are included in this annotated bibliography, intended as a selection tool for concerned citizens, architects, builders, and city planners emphasizing the environment of American cities and communities. It is topically arranged into sixteen broad sections with…

  19. Teaching Creative Writing: A Selective, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Bishop, Wendy; And Others

    Focusing on pedagogical issues in creative writing, this annotated bibliography reviews 149 books, articles, and dissertations in the fields of creative writing and composition, and, selectively, feminist and literary theory. Anthologies of original writing and reference books are not included. (MM)

  20. Bibliografia de Aztlan: An Annotated Chicano Bibliography.

    ERIC Educational Resources Information Center

    Barrios, Ernie, Ed.

    More than 300 books and articles published from 1920 to 1971 are reviewed in this annotated bibliography of literature on the Chicano. The citations and reviews are categorized by subject area and deal with contemporary Chicano history, education, health, history of Mexico, literature, native Americans, philosophy, political science, pre-Columbian…

  1. Annotated Bibliography of Special Education Instructional Materials.

    ERIC Educational Resources Information Center

    Cook, Iva Dean, Comp.

    The annotated bibliography lists approximately 900 commercially prepared materials available for statewide distribution from the West Virginia College of Graduate Studies Special Education Instructional Materials Center (WEIMC) for use in teaching educable (EMR) and trainable mentally retarded (TMR) students. Materials are grouped under subject…

  2. College Students in Transition: An Annotated Bibliography

    ERIC Educational Resources Information Center

    Foote, Stephanie M., Ed.; Hinkle, Sara M., Ed.; Kranzow, Jeannine, Ed.; Pistilli, Matthew D., Ed.; Miles, LaTonya Rease, Ed.; Simmons, Jannell G., Ed.

    2013-01-01

    The transition from high school to college is an important milestone, but it is only one of many steps in the journey through higher education. This volume is an annotated bibliography of the emerging literature examining the many other transitions students make beyond the first year, including the sophomore year, the transfer experience, and the…

  3. Educational Quality Indicators: Annotated Bibliography. Second Edition.

    ERIC Educational Resources Information Center

    Alberta Dept. of Education, Edmonton.

    This annotated bibliography of journal articles and documents on educational quality indicators contains approximately 230 entries arranged by the following topics: (1) indicator systems, including international, local/provincial/state, models, and national/federal systems; (2) interpretive framework (context, inputs, processes), including…

  4. Communication and Sexuality: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Buley, Jerry, Comp.; And Others

    The entries in this annotated bibliography represent books, educational journals, dissertations, popular magazines, and research studies that deal with the topic of communication and sexuality. Arranged alphabetically by author and also indexed according to subject matter, the titles span a variety of topics, including the following: sex and…

  7. Women and World Development: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Buvinic, Mayra; And Others

    This annotated bibliography focuses on the effects of socioeconomic development and cultural change on women and on women's reactions to these changes. It is an expanded version of one which was prepared for the American Association of Science Seminar on Women in Development held in Mexico City in June 1975. The objectives were to disseminate this…

  8. Suggested Books for Children: An Annotated Bibliography

    ERIC Educational Resources Information Center

    NHSA Dialog, 2008

    2008-01-01

    This article provides an annotated bibliography of various children's books. It includes listings of books that illustrate the dynamic relationships within the natural environment, economic context, racial and cultural identities, cross-group similarities and differences, gender, different abilities and stories of injustice and resistance.

  9. Annotated Psychodynamic Bibliography for Residents in Psychiatry

    PubMed Central

    CALIGOR, EVE

    1996-01-01

    The author provides an annotated bibliography to introduce psychodynamic psychotherapy and psychoanalysis to residents in psychiatry. The emphasis of the selection is on relevance to practice. The entries are grouped by topic, levels of difficulty are noted, and readings are identified as being of either current or historic relevance. PMID:22700303

  10. Great Basin Experimental Range: Annotated bibliography

    Treesearch

    E. Durant McArthur; Bryce A. Richardson; Stanley G. Kitchen

    2013-01-01

    This annotated bibliography documents the research that has been conducted on the Great Basin Experimental Range (GBER, also known as the Utah Experiment Station, Great Basin Station, the Great Basin Branch Experiment Station, Great Basin Experimental Center, and other similar name variants) over the 102 years of its existence. Entries were drawn from the original...

  11. Studies of Scientific Disciplines. An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Weisz, Diane; Kruytbosch, Carlos

    Provided in this bibliography are annotated lists of social studies of science literature, arranged alphabetically by author in 13 disciplinary areas. These areas include astronomy; general biology; biochemistry and molecular biology; biomedicine; chemistry; earth and space sciences; economics; engineering; mathematics; physics; political science;…

  12. Nutrition & Adolescent Pregnancy: A Selected Annotated Bibliography.

    ERIC Educational Resources Information Center

    National Agricultural Library (USDA), Washington, DC.

    This annotated bibliography on nutrition and adolescent pregnancy is intended to be a source of technical assistance for nurses, nutritionists, physicians, educators, social workers, and other personnel concerned with improving the health of teenage mothers and their babies. It is divided into two major sections. The first section lists selected…

  13. READABILITY AND READING--AN ANNOTATED BIBLIOGRAPHY.

    ERIC Educational Resources Information Center

    DALE, EDGAR; SEELS, BARBARA

    THIS ANNOTATED BIBLIOGRAPHY COVERS THE FIELD OF READABILITY AND READING. THE SELECTED WORKS ARE ORGANIZED INTO NINE SECTIONS--(1) GENERAL REFERENCES ON READABILITY, (2) MEASURING OF READABILITY, (3) READABILITY AND SENTENCE STRUCTURE, (4) READABILITY AND VOCABULARY, (5) READABILITY AND LITERARY STYLE, (6) READABILITY IN SUBJECT AREA MATERIALS, (7)…

  14. Paraprofessionals and Teacher Aides: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Grambs, Jean D.; And Others

    The 167 citations included in this annotated bibliography on the training of paraprofessionals and teacher aides are presented under the following headings: (1) general training (71 entries); (2) training aides for specialized roles--preschool and elementary programs, home visits; aides for disadvantaged, adult education, special curriculum and…

  15. Annotated Bibliography of the Graduate Record Examinations.

    ERIC Educational Resources Information Center

    Fortna, Richard O.

    The Graduate Record Examinations (GRE) bibliography provides an exhaustive list of references to studies adding to the understanding of the development, nature, and use of the test, and is divided into two sections: (1) the first section lists 125 annotated citations that contain research studies on the GRE; (2) the second section lists reviews…

  16. Greeks in Canada (an Annotated Bibliography).

    ERIC Educational Resources Information Center

    Bombas, Leonidas C.

    This bibliography on Greeks in Canada includes annotated references to both published and (mostly) unpublished works. Among the 70 entries (arranged in alphabetical order by author) are articles, reports, papers, and theses that deal either exclusively with or include a separate section on Greeks in the various Canadian provinces. (GC)

  17. Approaching the Holocaust: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Naftali, Haiya

    1990-01-01

    Presents an annotated bibliography of books on the Holocaust suitable for young adults. Describes Holocaust institutions and their resources. Provides an address for teachers interested in contacting a Holocaust institute in their area. Discusses the proliferation of teaching resources on the Second World War and their value in the classroom. (RW)

  18. Reflective Annotations: On Becoming a Scholar

    ERIC Educational Resources Information Center

    Alexander, Mark; Taylor, Caroline; Greenberger, Scott; Watts, Margie; Balch, Riann

    2012-01-01

    This article presents the authors' reflective annotations on becoming a scholar. This paper begins with a discussion on socialization for teaching, followed by a discussion on socialization for service and sense of belonging. Then, it describes how the doctoral process evolves. Finally, it talks about adult learners who pursue doctoral education.

  19. Online Annotation--Research and Practices

    ERIC Educational Resources Information Center

    Glover, Ian; Xu, Zhijie; Hardaker, Glenn

    2007-01-01

    Annotation can be a valuable exercise when trying to understand new information. The technique can be used to create a "condensed" version of the original information for later review and to add additional information into the existing document. The growth in web-based learning materials and information sources has created a requirement for systems…

  20. Organizational Communication: A Selected, Annotated Bibliography.

    ERIC Educational Resources Information Center

    Putnam, Linda; Frye, Mary

    Directing the reader to books and articles making significant contributions to theory and research in the field of organizational communication, this annotated bibliography contains 43 entries, including seminal works, exemplars, and state-of-the-art pieces primarily by authors within the field of communication. The entries are grouped into 9…

  2. The Mentally Retarded Offender: Annotated Bibliography.

    ERIC Educational Resources Information Center

    Schilit, Jeffrey; And Others

    An annotated bibliography of approximately 150 books and articles on the mentally retarded offender as well as 30 nonannotated entries are provided. Topics covered include such areas as characteristics of mentally retarded delinquents, rehabilitation of the retarded offender, community services for retarded persons, rights of the mentally…

  4. Comprehensive Annotation of Mature Peptides and Genotypes for Zika Virus.

    PubMed

    Sun, Guangyu; Larsen, Christopher N; Baumgarth, Nicole; Klem, Edward B; Scheuermann, Richard H

    2017-01-01

    The rapid spread of Zika virus (ZIKV) has caused much concern in the global health community, due in part to a link to fetal microcephaly and other neurological illnesses. While an increasing amount of ZIKV genomic sequence data is being generated, an understanding of the virus molecular biology is still greatly lacking. A significant step towards establishing ZIKV proteomics would be the compilation of all proteins produced by the virus, and the resultant virus genotypes. Here we report for the first time such data, using new computational methods for the annotation of mature peptide proteins, genotypes, and recombination events for all ZIKV genomes. The data is made publicly available through the Virus Pathogen Resource at www.viprbrc.org.

  5. Comprehensive Annotation of Mature Peptides and Genotypes for Zika Virus

    PubMed Central

    Sun, Guangyu; Baumgarth, Nicole; Klem, Edward B.; Scheuermann, Richard H.

    2017-01-01

    The rapid spread of Zika virus (ZIKV) has caused much concern in the global health community, due in part to a link to fetal microcephaly and other neurological illnesses. While an increasing amount of ZIKV genomic sequence data is being generated, an understanding of the virus molecular biology is still greatly lacking. A significant step towards establishing ZIKV proteomics would be the compilation of all proteins produced by the virus, and the resultant virus genotypes. Here we report for the first time such data, using new computational methods for the annotation of mature peptide proteins, genotypes, and recombination events for all ZIKV genomes. The data is made publicly available through the Virus Pathogen Resource at www.viprbrc.org. PMID:28125631

  6. MEETING: Chlamydomonas Annotation Jamboree - October 2003

    SciTech Connect

    Grossman, Arthur R

    2007-04-13

    Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at approximately 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al., 2007; Merchant et al., 2007); other sequences have also been generated by the Kazusa sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA. Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual

  7. Computer systems for annotation of single molecule fragments

    DOEpatents

    Schwartz, David Charles; Severin, Jessica

    2016-07-19

    There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites, thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such a diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.

  8. VariantAnnotation: a Bioconductor package for exploration and annotation of genetic variants

    PubMed Central

    Obenchain, Valerie; Lawrence, Michael; Carey, Vincent; Gogarten, Stephanie; Shannon, Paul; Morgan, Martin

    2014-01-01

    Summary: VariantAnnotation is an R / Bioconductor package for the exploration and annotation of genetic variants. Capabilities exist for reading, writing and filtering variant call format (VCF) files. VariantAnnotation allows ready access to additional R / Bioconductor facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources. Availability and implementation: This package is implemented in R and available for download at the Bioconductor Web site (http://bioconductor.org/packages/2.13/bioc/html/VariantAnnotation.html). The package contains extensive help pages for individual functions and a ‘vignette’ outlining typical work flows; it is made available under the open source ‘Artistic-2.0’ license. Version 1.9.38 was used in this article. Contact: vobencha@fhcrc.org PMID:24681907

  9. VIGOR, an annotation program for small viral genomes

    PubMed Central

    2010-01-01

    Background: Decreasing sequencing costs and improved technologies have made the re-sequencing of large genomes, as well as the parallel sequencing of small genomes, easier and more common. It is possible to completely sequence a small genome within days and this increases the number of publicly available genomes. Among the genomes being rapidly sequenced are those of microbes and viruses responsible for infectious diseases. However, accurate gene prediction is a challenge that persists for decoding a newly sequenced genome. Therefore, accurate and efficient gene prediction programs are highly desired for rapid and cost effective surveillance of RNA viruses through full genome sequencing. Results: We have developed VIGOR (Viral Genome ORF Reader), a web application tool for gene prediction in influenza virus, rotavirus, rhinovirus and coronavirus subtypes. VIGOR detects protein coding regions based on sequence similarity searches and can accurately detect genome specific features such as frame shifts, overlapping genes, embedded genes, and can predict mature peptides within the context of a single polypeptide open reading frame. Genotyping capability for influenza and rotavirus is built into the program. We compared VIGOR to previously described gene prediction programs, ZCURVE_V, GeneMarkS and FLAN. The specificity and sensitivity of VIGOR are greater than 99% for the RNA viral genomes tested. Conclusions: VIGOR is a user friendly web-based genome annotation program for five different viral agents: influenza, rotavirus, rhinovirus, coronavirus and SARS coronavirus. This is the first gene prediction program for rotavirus and rhinovirus for public access. VIGOR is able to accurately predict protein coding genes for the above five viral types and has the capability to assign function to the predicted open reading frames and genotype influenza virus. The prediction software was designed for performing high throughput annotation and closure

  10. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    PubMed

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: (a) compare different annotations for a single genome, and (b) generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/.

  11. Correlation of Succinate Metabolism and Virulence in Salmonella typhimurium

    PubMed Central

    Herzberg, Mendel; Jawad, Mudhaffer J.; Pratt, Darrell

    1965-01-01

    Herzberg, Mendel (University of Florida, Gainesville), and Mudhaffer J. Jawad, and Darrell Pratt. Succinate metabolism and virulence in Salmonella typhimurium. J. Bacteriol. 89:185–192. 1965.—A virulent, smooth strain of Salmonella typhimurium (Wild-7) grew slowly with succinate as sole carbon source (Suc-L). Old stock cultures yielded a smooth variant which grew rapidly (Suc-E). Visible colonies of Suc-E appeared in 24 hr, whereas Suc-L required 48 hr. Differences other than the response to succinate were not demonstrable between the two strains; LD50 values of both strains were similar, but equivalent numbers of Suc-E required longer periods of time to kill mice. Recovery of bacteria from liver and spleen homogenates revealed that Suc-L remains as such in vivo, but Suc-E populations change to Suc-L. By the eighth day of infection, the organisms were 93 to 100% Suc-L; thus, mortality was due to the Suc-L population developed in vivo and not the Suc-E of the original inoculum. Animal passage of a number of stock cultures of S. typhimurium of diverse origin, all Suc-E type, invariably yielded Suc-L. Slow utilization of succinate appears to be correlated with virulence. PMID:14255661

  12. Genetic determinants of virulence - Candida parapsilosis.

    PubMed

    Singaravelu, Kumara; Gácser, Attila; Nosanchuk, Joshua D

    2014-01-01

    The global epidemiology of fungal infections is changing. While Candida albicans remains the most common pathogen overall, several institutions in Europe, Asia and South America have reported the rapid emergence to predominance of Candida parapsilosis. This mini-review examines the impact of gene deletions achieved in C. parapsilosis that have been published to date. The molecular approaches to gene disruption in C. parapsilosis and the molecularly characterized genes to date are reviewed. Similar to C. albicans, factors influencing virulence in C. parapsilosis include adherence, biofilm formation, lipid metabolism, and secretion of hydrolytic enzymes such as lipases, phospholipases and secreted aspartyl proteinases. Development of a targeted gene deletion method has enabled the identification of several unique aspects of C. parapsilosis genes that play a role in host-pathogen interactions: CpLIP1, CpLIP2, SAPP1a, SAPP1b, BCR1, RBT1, CpFAS2, OLE1, FIT-2. This manuscript is part of the series of works presented at the "V International Workshop: Molecular genetic approaches to the study of human pathogenic fungi" (Oaxaca, Mexico, 2012).

  13. 'Small worlds' and the evolution of virulence: infection occurs locally and at a distance.

    PubMed Central

    Boots, M; Sasaki, A

    1999-01-01

    Why are some diseases more virulent than others? Vector-borne diseases such as malaria and water-borne diseases such as cholera are generally more virulent than diseases spread by direct contagion. One factor that characterizes both vector- and water-borne diseases is their ability to spread over long distances, thus causing infection of susceptible individuals distant from the infected individual. Here we show that this ability of the pathogen to infect distant individuals in a spatially structured host population leads to the evolution of a more virulent pathogen. We use a lattice model in which reproduction is local but infection can vary from completely local to completely global. With completely global infection the evolutionarily stable strategy (ESS) is the same as in mean-field models while a lower virulence is predicted as infection becomes more local. There is characteristically a period of relatively moderate increase in virulence followed by a more rapid rise with increasing proportions of global infection as we move beyond a 'critical connectivity'. In the light of recent work emphasizing the existence of 'small world' networks in human populations, our results suggest that if the world is getting 'smaller'--as populations become more connected--diseases may evolve higher virulence. PMID:10584335

  14. Long-Distance Delivery of Bacterial Virulence Factors by Pseudomonas aeruginosa Outer Membrane Vesicles

    PubMed Central

    Bomberger, Jennifer M.; MacEachran, Daniel P.; Coutermarsh, Bonita A.; Ye, Siying; O'Toole, George A.; Stanton, Bruce A.

    2009-01-01

    Bacteria use a variety of secreted virulence factors to manipulate host cells, thereby causing significant morbidity and mortality. We report a mechanism for the long-distance delivery of multiple bacterial virulence factors, simultaneously and directly into the host cell cytoplasm, thus obviating the need for direct interaction of the pathogen with the host cell to cause cytotoxicity. We show that outer membrane–derived vesicles (OMV) secreted by the opportunistic human pathogen Pseudomonas aeruginosa deliver multiple virulence factors, including β-lactamase, alkaline phosphatase, hemolytic phospholipase C, and Cif, directly into the host cytoplasm via fusion of OMV with lipid rafts in the host plasma membrane. These virulence factors enter the cytoplasm of the host cell via N-WASP–mediated actin trafficking, where they rapidly distribute to specific subcellular locations to affect host cell biology. We propose that secreted virulence factors are not released individually as naked proteins into the surrounding milieu where they may randomly contact the surface of the host cell, but instead bacterial derived OMV deliver multiple virulence factors simultaneously and directly into the host cell cytoplasm in a coordinated manner. PMID:19360133

  15. Multi-Atlas Segmentation using Partially Annotated Data: Methods and Annotation Strategies.

    PubMed

    Koch, Lisa M; Rajchl, Martin; Bai, Wenjia; Baumgartner, Christian F; Tong, Tong; Passerat-Palmbach, Jonathan; Aljabar, Paul; Rueckert, Daniel

    2017-08-22

    Multi-atlas segmentation is a widely used tool in medical image analysis, providing robust and accurate results by learning from annotated atlas datasets. However, the availability of fully annotated atlas images for training is limited due to the time required for the labelling task. Segmentation methods requiring only a proportion of each atlas image to be labelled could therefore reduce the workload on expert raters tasked with annotating atlas images. To address this issue, we first re-examine the labelling problem common in many existing approaches and formulate its solution in terms of a Markov Random Field energy minimisation problem on a graph connecting atlases and the target image. This provides a unifying framework for multi-atlas segmentation. We then show how modifications in the graph configuration of the proposed framework enable the use of partially annotated atlas images and investigate different partial annotation strategies. The proposed method was evaluated on two Magnetic Resonance Imaging (MRI) datasets for hippocampal and cardiac segmentation. Experiments were performed aimed at (1) recreating existing segmentation techniques with the proposed framework and (2) demonstrating the potential of employing sparsely annotated atlas data for multi-atlas segmentation.

  16. The Reading/Writing Connection Updated: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Ahern, Jennifer; Bishop, Wendy; Briggs, Terri L.; Chapman, Joe; Davis, Kevin; Fay, Jennifer A.; Gillen, N. Kent; Harrill, Rob; Haswell, Richard H.; Loomis, Ormond; Melzer, Daniel; Methvin, Holly; Shupala, Andrew M.; Trevino, Sylvia

    This 1997 annotated bibliography of 244 items updates an earlier 87-item annotated bibliography. The current annotated bibliography focuses on the relationship between reading and writing as it bears upon the teaching of composition. Items looking at writing as a way of teaching reading, and items focused exclusively upon writer-based concerns…

  17. Annotation of Fusarium graminearum (PH-1) Version 5.0

    PubMed Central

    Hammond-Kosack, Kim E.

    2017-01-01

    Fusarium graminearum floral infections are a major risk to the global supply of safe cereal grains. We report updates to the PH-1 reference genome and significant improvements to the annotation. Changes include introduction of legacy annotation identifiers, new gene models, secretome and effectorP predictions, and inclusion of extensive untranslated region (UTR) annotations. PMID:28082505

  18. VideoANT: Extending Online Video Annotation beyond Content Delivery

    ERIC Educational Resources Information Center

    Hosack, Bradford

    2010-01-01

    This paper expands the boundaries of video annotation in education by outlining the need for extended interaction in online video use, identifying the challenges faced by existing video annotation tools, and introducing Video-ANT, a tool designed to create text-based annotations integrated within the time line of a video hosted online. Several…

  20. Large-scale annotation of small-molecule libraries using public databases.

    PubMed

    Zhou, Yingyao; Zhou, Bin; Chen, Kaisheng; Yan, S Frank; King, Frederick J; Jiang, Shumei; Winzeler, Elizabeth A

    2007-01-01

    While many large publicly accessible databases provide excellent annotation for biological macromolecules, the same is not true for small chemical compounds. Commercial data sources also fail to encompass an annotation interface for large numbers of compounds and tend to be too costly to be widely available to biomedical researchers. Therefore, using annotation information for the selection of lead compounds from a modern-day high-throughput screening (HTS) campaign presently occurs only on a very limited scale. The recent rapid expansion of the NIH PubChem database provides an opportunity to link existing biological databases with compound catalogs and provides relevant information that potentially could improve the information garnered from large-scale screening efforts. Using the 2.5 million compound collection at the Genomics Institute of the Novartis Research Foundation (GNF) as a model, we determined that approximately 4% of the library contained compounds with potential annotation in such databases as PubChem and the World Drug Index (WDI) as well as related databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and ChemIDplus. Furthermore, the exact structure match analysis showed 32% of GNF compounds can be linked to third party databases via PubChem. We also showed annotations such as MeSH (medical subject headings) terms can be applied to in-house HTS databases in identifying signature biological inhibition profiles of interest as well as expediting the assay validation process. The automated annotation of thousands of screening hits in batch is becoming feasible and has the potential to play an essential role in the hit-to-lead decision making process.