candida genome database: Topics by Science.gov

Sample records for candida genome database

Comparative molecular dynamics studies of heterozygous open reading frames of DNA polymerase eta (η) in pathogenic yeast Candida albicans

NASA Astrophysics Data System (ADS)

Satpati, Suresh; Manohar, Kodavati; Acharya, Narottam; Dixit, Anshuman

2017-01-01

Genomic instability in Candida albicans is believed to play a crucial role in fungal pathogenesis. DNA polymerases contribute significantly to the stability of any genome. Although the Candida Genome Database predicts the presence of S. cerevisiae DNA polymerase orthologs, the functional and structural characterization of Candida DNA polymerases remains unexplored. DNA polymerase eta (Polη) is unique in that it promotes efficient bypass of cyclobutane pyrimidine dimers. Interestingly, C. albicans is heterozygous, carrying two Polη genes, and the nucleotide substitutions were found only in the ORFs. As allelic differences often result in functional differences of the encoded proteins, comparative analyses of structural models and molecular dynamics simulations were performed to characterize these orthologs of DNA Polη. The overall structures of both ORFs remain conserved except for subtle differences in the palm and PAD domains. Complementation analysis showed that both ORFs equally suppressed the UV sensitivity of a yeast rad30 deletion strain. Our study has predicted two novel molecular interactions: a highly conserved molecular tetrad of salt bridges and a series of π-π interactions spanning from the thumb to the PAD. This study suggests that these ORFs are homologues of yeast Polη and, owing to their heterogeneity in C. albicans, may play a significant role in pathogenicity.
Genome-wide analysis of the Zn(II)2Cys6 zinc cluster-encoding gene family in Aspergillus flavus

USDA-ARS's Scientific Manuscript database

Proteins with a Zn(II)2Cys6 domain, Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys (hereafter, referred to as the C6 domain), form a subclass of zinc finger proteins found exclusively in fungi and yeast. Genome sequence databases of Saccharomyces cerevisiae and Candida albicans have provided an overvie...
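The spacing rule quoted above (Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys) maps naturally onto a regular expression. The following is a minimal sketch, assuming one-letter amino acid codes and that X may be any residue; the function name and the toy sequence are illustrative only and are not taken from the record.

```python
import re

# Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys, with X = any residue.
C6_PATTERN = re.compile(r"C.{2}C.{6}C.{5,12}C.{2}C.{6,9}C")


def find_c6_domains(protein_seq):
    """Return (start, matched_text) for each non-overlapping C6-like match."""
    return [(m.start(), m.group()) for m in C6_PATTERN.finditer(protein_seq.upper())]


# Synthetic sequence built to satisfy the spacing rule (not a real protein).
toy = "MA" + "C" + "AA" + "C" + "AAAAAA" + "C" + "AAAAA" + "C" + "AA" + "C" + "AAAAAA" + "C" + "GG"
hits = find_c6_domains(toy)
print(hits)
```

A pattern match of this kind only flags candidate sequences; it says nothing about whether the protein actually binds zinc or DNA.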
Prioritizing and modelling of putative drug target proteins of Candida albicans by systems biology approach.

PubMed

Ismail, Tariq; Fatima, Nighat; Muhammad, Syed Aun; Zaidi, Syed Saoud; Rehman, Nisar; Hussain, Izhar; Tariq, Najam Us Sahr; Amirzada, Imran; Mannan, Abdul

2018-01-01

Candida albicans is one of the major sources of nosocomial infections in humans, which may prove fatal in 30% of cases. Hospital-acquired infection is very difficult to treat effectively owing to the presence of drug-resistant pathogenic strains; therefore, there is a need to find alternative drug targets to cure this infection. An in silico computational framework was used to prioritize and establish antifungal drug targets of Candida albicans. The identification of putative drug targets was based on acquiring 5090 completely annotated genes of Candida albicans from available databases, which were categorized into essential and non-essential genes. The results indicated that 9% of proteins were essential and could become potential candidates for intervention, which might result in pathogen eradication. We studied clusters of orthologs, and a subtractive genomic analysis of these essential proteins was performed against the human genome as a reference to minimize side effects. It was seen that 14% of Candida albicans proteins were evolutionarily related to human proteins, while 86% were non-human homologs. In the next step of compatible drug target selection, the non-human homologs were sequentially compared to human microbiome data to minimize potential effects against gut flora, which accumulated to 38% of the essential genome. The sub-cellular localization of these candidate proteins in fungal cellular systems indicated that 80% of them are cytoplasmic, 10% are mitochondrial and the remaining 10% are associated with the cell wall. The role of these non-human and non-gut flora putative target proteins in Candida albicans biological pathways was studied. Owing to their integrated and critical role in the Candida albicans replication cycle, four proteins were selected for molecular modeling. For drug design and development, four high-quality and reliable protein models with more than 70% sequence identity were constructed. These proteins are used for docking studies of known and new ligands (unpublished data). Our study will be an effective framework for drug target identification in pathogenic microbial strains and for the development of new therapies against the infections they cause.
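The subtractive step described in this abstract can be pictured as successive set differences. This is a minimal sketch under stated assumptions: the protein identifiers are invented placeholders, and homology is assumed to have been decided beforehand by some external sequence search; it is not the authors' pipeline.

```python
def subtractive_filter(essential, human_homologs, gut_flora_homologs):
    """Keep essential proteins with no human homolog and no gut-flora homolog."""
    non_human = essential - human_homologs   # drop targets likely to cause host side effects
    return non_human - gut_flora_homologs    # drop targets shared with the gut microbiome


# Hypothetical identifiers, for illustration only.
essential = {"P1", "P2", "P3", "P4", "P5"}
human_homologs = {"P2"}
gut_flora_homologs = {"P4", "P9"}

candidates = subtractive_filter(essential, human_homologs, gut_flora_homologs)
print(sorted(candidates))
```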
The PathoYeastract database: an information system for the analysis of gene and genomic transcription regulation in pathogenic yeasts.

PubMed

Monteiro, Pedro Tiago; Pais, Pedro; Costa, Catarina; Manna, Sauvagya; Sá-Correia, Isabel; Teixeira, Miguel Cacho

2017-01-04

We present the PATHOgenic YEAst Search for Transcriptional Regulators And Consensus Tracking (PathoYeastract - http://pathoyeastract.org) database, a tool for the analysis and prediction of transcription regulatory associations at the gene and genomic levels in the pathogenic yeasts Candida albicans and C. glabrata. Upon data retrieval from hundreds of publications, followed by curation, the database currently includes 28 000 unique documented regulatory associations between transcription factors (TF) and target genes and 107 DNA binding sites, considering 134 TFs in both species. Following the structure used for the YEASTRACT database, PathoYeastract makes available bioinformatics tools that enable the user to exploit the existing information to predict the TFs involved in the regulation of a gene or genome-wide transcriptional response, while ranking those TFs in order of their relative importance. Each search can be filtered based on the selection of specific environmental conditions, experimental evidence or positive/negative regulatory effect. Promoter analysis tools and interactive visualization tools for the representation of TF regulatory networks are also provided. The PathoYeastract database further provides simple tools for the prediction of gene and genomic regulation based on orthologous regulatory associations described for other yeast species, a comparative genomics setup for the study of cross-species evolution of regulatory networks. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Budding off: bringing functional genomics to Candida albicans

PubMed Central

Anderson, Matthew Z.

2016-01-01

Candida species are the most prevalent human fungal pathogens, with Candida albicans being the most clinically relevant species. Candida albicans resides as a commensal of the human gastrointestinal tract but is a frequent cause of opportunistic mucosal and systemic infections. Investigation of C. albicans virulence has traditionally relied on candidate gene approaches, but recent advances in functional genomics have now facilitated global, unbiased studies of gene function. Such studies include comparative genomics (both between and within Candida species), analysis of total RNA expression, and regulation and delineation of protein–DNA interactions. Additionally, large collections of mutant strains have begun to aid systematic screening of clinically relevant phenotypes. Here, we will highlight the development of functional genomics in C. albicans and discuss the use of these approaches to addressing both commensalism and pathogenesis in this species. PMID:26424829
Budding off: bringing functional genomics to Candida albicans.

PubMed

Anderson, Matthew Z; Bennett, Richard J

2016-03-01

Candida species are the most prevalent human fungal pathogens, with Candida albicans being the most clinically relevant species. Candida albicans resides as a commensal of the human gastrointestinal tract but is a frequent cause of opportunistic mucosal and systemic infections. Investigation of C. albicans virulence has traditionally relied on candidate gene approaches, but recent advances in functional genomics have now facilitated global, unbiased studies of gene function. Such studies include comparative genomics (both between and within Candida species), analysis of total RNA expression, and regulation and delineation of protein-DNA interactions. Additionally, large collections of mutant strains have begun to aid systematic screening of clinically relevant phenotypes. Here, we will highlight the development of functional genomics in C. albicans and discuss the use of these approaches to addressing both commensalism and pathogenesis in this species. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
De Novo Assembly of Candida sojae and Candida boidinii Genomes, Unexplored Xylose-Consuming Yeasts with Potential for Renewable Biochemical Production

PubMed Central

Borelli, Guilherme; José, Juliana; Teixeira, Paulo José Pereira Lima; dos Santos, Leandro Vieira

2016-01-01

Candida boidinii and Candida sojae yeasts were isolated from energy cane bagasse and plague-insects. Both have a fast xylose uptake rate and produce large amounts of xylitol, which are interesting features for the food and 2G ethanol industries. Because they lack published genomes, we have sequenced and assembled them, offering new possibilities for gene prospecting. PMID:26769937
De Novo Assembly of Candida sojae and Candida boidinii Genomes, Unexplored Xylose-Consuming Yeasts with Potential for Renewable Biochemical Production.

PubMed

Borelli, Guilherme; José, Juliana; Teixeira, Paulo José Pereira Lima; Dos Santos, Leandro Vieira; Pereira, Gonçalo Amarante Guimarães

2016-01-14

Candida boidinii and Candida sojae yeasts were isolated from energy cane bagasse and plague-insects. Both have a fast xylose uptake rate and produce large amounts of xylitol, which are interesting features for the food and 2G ethanol industries. Because they lack published genomes, we have sequenced and assembled them, offering new possibilities for gene prospecting. Copyright © 2016 Borelli et al.
High-Quality Draft Genome Sequence of Candida apicola NRRL Y-50540

PubMed Central

Vega-Alvarado, Leticia; Gómez-Angulo, Jorge; Escalante-García, Zazil; Grande, Ricardo; Gschaedler-Mathis, Anne; Amaya-Delgado, Lorena

2015-01-01

Candida apicola, a highly osmotolerant ascomycetous yeast, produces sophorolipids (biosurfactants), membrane fatty acids, and enzymes of biotechnological interest. The genome obtained is a high-quality draft for this species and can be used as a reference for further analyses, such as differential gene expression in yeasts of the genus Candida. PMID:26067948
A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis

PubMed Central

Fitzpatrick, David A; Logue, Mary E; Stajich, Jason E; Butler, Geraldine

2006-01-01

Background: To date, most fungal phylogenies have been derived from single gene comparisons, or from concatenated alignments of a small number of genes. The increase in fungal genome sequencing presents an opportunity to reconstruct evolutionary events using entire genomes. As a tool for future comparative, phylogenomic and phylogenetic studies, we used both supertrees and concatenated alignments to infer relationships between 42 species of fungi for which complete genome sequences are available. Results: A dataset of 345,829 genes was extracted from 42 publicly available fungal genomes. Supertree methods were employed to derive phylogenies from 4,805 single gene families. We found that the average consensus supertree method may suffer from long-branch attraction artifacts, while matrix representation with parsimony (MRP) appears to be immune from these. A genome phylogeny was also reconstructed from a concatenated alignment of 153 universally distributed orthologs. Our MRP supertree and concatenated phylogeny are highly congruent. Within the Ascomycota, the sub-phyla Pezizomycotina and Saccharomycotina were resolved. Both phylogenies infer that the Leotiomycetes are the closest sister group to the Sordariomycetes. There is some ambiguity regarding the placement of Stagonospora nodorum, the sole member of the class Dothideomycetes present in the dataset. Within the Saccharomycotina, a monophyletic clade containing organisms that translate CTG as serine instead of leucine is evident. There is also strong support for two groups within the CTG clade, one containing the fully sexual species Candida lusitaniae, Candida guilliermondii and Debaryomyces hansenii, and the second group containing Candida albicans, Candida dubliniensis, Candida tropicalis, Candida parapsilosis and Lodderomyces elongisporus. The second major clade within the Saccharomycotina contains species whose genomes have undergone a whole genome duplication (WGD), and their close relatives. We could not confidently resolve whether Candida glabrata or Saccharomyces castellii lies at the base of the WGD clade. Conclusion: We have constructed robust phylogenies for fungi based on whole genome analysis. Overall, our phylogenies provide strong support for the classification of phyla, sub-phyla, classes and orders. We have resolved the relationship of the classes Leotiomycetes and Sordariomycetes, and have identified two classes within the CTG clade of the Saccharomycotina that may correlate with sexual status. PMID:17121679
Species and condition specific adaptation of the transcriptional landscapes in Candida albicans and Candida dubliniensis

PubMed Central

2013-01-01

Background: Although Candida albicans and Candida dubliniensis are most closely related, the two species behave significantly differently with respect to morphogenesis and virulence. In order to gain further insight into the divergent routes for morphogenetic adaptation in both species, we investigated qualitative along with quantitative differences in the transcriptomes of both organisms by cDNA deep sequencing. Results: Following genome-associated assembly of sequence reads we were able to generate experimentally verified databases containing 6016 and 5972 genes for C. albicans and C. dubliniensis, respectively. About 95% of the transcriptionally active regions (TARs) contain open reading frames while the remaining TARs most likely represent non-coding RNAs. Comparison of our annotations with publicly available gene models for C. albicans and C. dubliniensis confirmed approximately 95% of already predicted genes, but also revealed previously unknown TARs in both species. Qualitative cross-species analysis of these databases revealed, in addition to 5802 orthologs, 399 and 49 species-specific protein-coding genes for C. albicans and C. dubliniensis, respectively. Furthermore, quantitative transcriptional profiling using RNA-Seq revealed significant differences in the expression of orthologs across both species. We defined a core subset of 84 hyphal-specific genes required for both species, as well as a set of 42 genes that seem to be specifically induced during hyphal morphogenesis in C. albicans. Conclusions: Species-specific adaptation in C. albicans and C. dubliniensis is governed by individual genetic repertoires but also by altered regulation of conserved orthologs on the transcriptional level. PMID:23547856
Vaginal Candida spp. genomes from women with vulvovaginal candidiasis.

PubMed

Bradford, L Latéy; Chibucos, Marcus C; Ma, Bing; Bruno, Vincent; Ravel, Jacques

2017-08-31

Candida albicans is the predominant cause of vulvovaginal candidiasis (VVC). Little is known regarding the genetic diversity of Candida spp. in the vagina or the microvariations in strains over time that may contribute to the development of VVC. This study reports the draft genome sequences of four C. albicans and one C. glabrata strains isolated from women with VVC. An SNP-based whole-genome phylogeny indicates that these isolates are closely related; however, phylogenetic distances between them suggest that there may be genetic adaptations driven by unique host environments. These sequences will facilitate further comparative analyses and ultimately improve our understanding of genetic variation in isolates of Candida spp. that are associated with VVC. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Sequence and Analysis of the Genome of the Pathogenic Yeast Candida orthopsilosis

PubMed Central

Riccombeni, Alessandro; Vidanes, Genevieve; Proux-Wéra, Estelle; Wolfe, Kenneth H.; Butler, Geraldine

2012-01-01

Candida orthopsilosis is closely related to the fungal pathogen Candida parapsilosis. However, whereas C. parapsilosis is a major cause of disease in immunosuppressed individuals and in premature neonates, C. orthopsilosis is more rarely associated with infection. We sequenced the C. orthopsilosis genome to facilitate the identification of genes associated with virulence. Here, we report the de novo assembly and annotation of the genome of a Type 2 isolate of C. orthopsilosis. The sequence was obtained by combining data from next generation sequencing (454 Life Sciences and Illumina) with paired-end Sanger reads from a fosmid library. The final assembly contains 12.6 Mb on 8 chromosomes. The genome was annotated using an automated pipeline based on comparative analysis of genomes of Candida species, together with manual identification of introns. We identified 5700 protein-coding genes in C. orthopsilosis, of which 5570 have an ortholog in C. parapsilosis. The time of divergence between C. orthopsilosis and C. parapsilosis is estimated to be twice as great as that between Candida albicans and Candida dubliniensis. There has been an expansion of the Hyr/Iff family of cell wall genes and the JEN family of monocarboxylic transporters in C. parapsilosis relative to C. orthopsilosis. We identified one gene from a Maltose/Galactoside O-acetyltransferase family that originated by horizontal gene transfer from a bacterium to the common ancestor of C. orthopsilosis and C. parapsilosis. We report that TFB3, a component of the general transcription factor TFIIH, undergoes alternative splicing by intron retention in multiple Candida species. We also show that an intein in the vacuolar ATPase gene VMA1 is present in C. orthopsilosis but not C. parapsilosis, and has a patchy distribution in Candida species. Our results suggest that the difference in virulence between C. parapsilosis and C. orthopsilosis may be associated with expansion of gene families. PMID:22563396
Draft Genome Sequence of Candida pseudohaemulonii Isolated from the Blood of a Neutropenic Patient.

PubMed

Mohd Tap, Ratna; Kamarudin, Nur Amalina; Ginsapu, Stephanie Jane; Ahmed Bakri, Ahmed Rafezzan; Ahmad, Norazah; Amran, Fairuz; Sipiczki, Matthias

2018-04-05

Candida pseudohaemulonii is phylogenetically close to the C. haemulonii complex and exhibits resistance to amphotericin B and azole agents. We report here the draft genome sequence of C. pseudohaemulonii UZ153_17 isolated from the blood culture of a neutropenic patient. The draft genome is 3,532,003,666 bp in length, with 579,838 reads, 130 contigs, and a G+C content of 47.15%. Copyright © 2018 Mohd Tap et al.
Evolution of pathogenicity and sexual reproduction in eight Candida genomes

PubMed Central

Butler, Geraldine; Rasmussen, Matthew D.; Lin, Michael F.; Santos, Manuel A.S.; Sakthikumar, Sharadha; Munro, Carol A.; Rheinbay, Esther; Grabherr, Manfred; Forche, Anja; Reedy, Jennifer L.; Agrafioti, Ino; Arnaud, Martha B.; Bates, Steven; Brown, Alistair J.P.; Brunke, Sascha; Costanzo, Maria C.; Fitzpatrick, David A.; de Groot, Piet W. J.; Harris, David; Hoyer, Lois L.; Hube, Bernhard; Klis, Frans M.; Kodira, Chinnappa; Lennard, Nicola; Logue, Mary E.; Martin, Ronny; Neiman, Aaron M.; Nikolaou, Elissavet; Quail, Michael A.; Quinn, Janet; Santos, Maria C.; Schmitzberger, Florian F.; Sherlock, Gavin; Shah, Prachi; Silverstein, Kevin; Skrzypek, Marek S.; Soll, David; Staggs, Rodney; Stansfield, Ian; Stumpf, Michael P H; Sudbery, Peter E.; Thyagarajan, Srikantha; Zeng, Qiandong; Berman, Judith; Berriman, Matthew; Heitman, Joseph; Gow, Neil A. R.; Lorenz, Michael C.; Birren, Bruce W.; Kellis, Manolis; Cuomo, Christina A.

2009-01-01

Candida species are the most common cause of opportunistic fungal infection worldwide. We report the genome sequences of six Candida species and compare these and related pathogens and nonpathogens. There are significant expansions of cell wall, secreted, and transporter gene families in pathogenic species, suggesting adaptations associated with virulence. Large genomic tracts are homozygous in three diploid species, possibly resulting from recent recombination events. Surprisingly, key components of the mating and meiosis pathways are missing from several species. These include major differences at the Mating-type loci (MTL); Lodderomyces elongisporus lacks MTL, and components of the a1/alpha2 cell identity determinant were lost in other species, raising questions about how mating and cell types are controlled. Analysis of the CUG leucine to serine genetic code change reveals that 99% of ancestral CUG codons were erased and new ones arose elsewhere. Lastly, we revise the C. albicans gene catalog, identifying many new genes. PMID:19465905
Variability in the clinical distributions of Candida species and the emergence of azole-resistant non-Candida albicans species in public hospitals in the Midwest region of Brazil.

PubMed

Mattos, Karine; Rodrigues, Luana Carbonera; Oliveira, Kelly Mari Pires de; Diniz, Pedro Fernando; Marques, Luiza Inahê; Araujo, Adriana Almeida; Chang, Marilene Rodrigues

2017-01-01

The incidence and antifungal susceptibility of Candida spp. from two public teaching hospitals are described. The minimum inhibitory concentrations of fluconazole, voriconazole, itraconazole, and amphotericin B were determined using Clinical and Laboratory Standards Institute broth microdilution, and genomic differentiation was performed using PCR. Of 221 Candida isolates, 50.2% were obtained from intensive care unit patients; 71.5% were recovered from urine and 9.1% from bloodstream samples. Candida parapsilosis sensu stricto was the most common candidemia agent. We observed variations in Candida species distribution in hospitals in the same geographic region and documented the emergence of non-C. albicans species resistant to azoles.
The mitochondrial genome of the pathogenic yeast Candida subhashii: GC-rich linear DNA with a protein covalently attached to the 5′ termini

PubMed Central

Fricova, Dominika; Valach, Matus; Farkas, Zoltan; Pfeiffer, Ilona; Kucsera, Judit; Tomaska, Lubomir; Nosek, Jozef

2010-01-01

As a part of our initiative aimed at a large-scale comparative analysis of fungal mitochondrial genomes, we determined the complete DNA sequence of the mitochondrial genome of the yeast Candida subhashii and found that it exhibits a number of peculiar features. First, the mitochondrial genome is represented by linear dsDNA molecules of uniform length (29 795 bp), with an unusually high content of guanine and cytosine residues (52.7 %). Second, the coding sequences lack introns; thus, the genome has a relatively compact organization. Third, the termini of the linear molecules consist of long inverted repeats and seem to contain a protein covalently bound to terminal nucleotides at the 5′ ends. This architecture resembles the telomeres in a number of linear viral and plasmid DNA genomes classified as invertrons, in which the terminal proteins serve as specific primers for the initiation of DNA synthesis. Finally, although the mitochondrial genome of C. subhashii contains essentially the same set of genes as other closely related pathogenic Candida species, we identified additional ORFs encoding two homologues of the family B protein-priming DNA polymerases and an unknown protein. The terminal structures and the genes for DNA polymerases are reminiscent of linear mitochondrial plasmids, indicating that this genome architecture might have emerged from fortuitous recombination between an ancestral, presumably circular, mitochondrial genome and an invertron-like element. PMID:20395267
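The guanine and cytosine content quoted above is the fraction of G and C residues among all called bases. A minimal sketch follows, assuming a plain nucleotide string as input and assuming that ambiguous symbols such as N should be ignored; the function name and test string are illustrative and unrelated to the C. subhashii sequence.

```python
def gc_content(seq):
    """Fraction of G/C among unambiguous bases (A, C, G, T) in a DNA string."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in called) / len(called)


# Toy input only: 4 of 6 bases are G or C.
print(f"{gc_content('GGCCAT'):.1%}")
```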
[Study on the relationship between vaginal and intestinal candida in patients with vulvovaginal candidiasis].

PubMed

Lin, Xiao-li; Li, Zhen; Zuo, Xu-lei

2011-07-01

To investigate the relationship between vaginal and intestinal candida in patients with vulvovaginal candidiasis using microbiological and molecular methods. Vaginal discharge samples and anal swabs were collected from 148 patients with vulvovaginal candidiasis, followed by fungal culture, identification, purification and genomic DNA extraction. The genomic sequences from the respective sites were aligned and typed according to their homology, as analyzed by internal transcribed spacer (ITS) PCR and random amplified polymorphic DNA (RAPD) PCR. Patients with vulvovaginal infection alone and those with both intestinal and vulvovaginal infection were grouped separately, and recurrence after local antifungal treatment was analyzed. Candida albicans was the dominant pathogen among the 148 cases of vulvovaginal candidiasis (91.9%, 136/148); 33.1% (49/148) of patients with vulvovaginal candidiasis were infected in both the intestine and the vulvovagina. In 92% (22/24) of patients with both intestinal and vaginal candida infection, the isolates showed high homology. After vaginal treatment, the recurrence rate in patients with vulvovaginal candidiasis complicated by concurrent intestinal candida infection (7/14) was significantly higher than that in patients with vaginal infection alone [21% (6/29)] (P<0.05). Vulvovaginal candidiasis is highly associated with concurrent intestinal candida infection. The recurrence rate after vaginal treatment is high in patients with vulvovaginal candidiasis and concurrent intestinal candida infection. Comprehensive management of patients infected at both sites is necessary to reduce recurrence of the disease.
Curation accuracy of model organism databases

PubMed Central

Keseler, Ingrid M.; Skrzypek, Marek; Weerasinghe, Deepika; Chen, Albert Y.; Fulcher, Carol; Li, Gene-Wei; Lemmer, Kimberly C.; Mladinich, Katherine M.; Chow, Edmond D.; Sherlock, Gavin; Karp, Peter D.

2014-01-01

Manual extraction of information from the biomedical literature—or biocuration—is the central methodology used to construct many biological databases. For example, the UniProt protein database, the EcoCyc Escherichia coli database and the Candida Genome Database (CGD) are all based on biocuration. Biological databases are used extensively by life science researchers, as online encyclopedias, as aids in the interpretation of new experimental data and as golden standards for the development of new bioinformatics algorithms. Although manual curation has been assumed to be highly accurate, we are aware of only one previous study of biocuration accuracy. We assessed the accuracy of EcoCyc and CGD by manually selecting curated assertions within randomly chosen EcoCyc and CGD gene pages and by then validating that the data found in the referenced publications supported those assertions. A database assertion is considered to be in error if that assertion could not be found in the publication cited for that assertion. We identified 10 errors in the 633 facts that we validated across the two databases, for an overall error rate of 1.58%, and individual error rates of 1.82% for CGD and 1.40% for EcoCyc. These data suggest that manual curation of the experimental literature by Ph.D-level scientists is highly accurate. Database URL: http://ecocyc.org/, http://www.candidagenome.org// PMID:24923819
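The overall error rate reported above follows directly from the two counts given in the abstract. A small worked check, assuming only those figures (10 errors in 633 validated facts); the per-database counts are not stated in this record, so the individual rates are not recomputed here.

```python
def error_rate(errors, validated):
    """Fraction of validated assertions that were found to be in error."""
    if validated <= 0:
        raise ValueError("validated must be positive")
    return errors / validated


overall = error_rate(10, 633)  # counts taken from the abstract
print(f"{overall:.2%}")        # prints 1.58%
```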
The Candida Pathogenic Species Complex

PubMed Central

Turner, Siobhán A.; Butler, Geraldine

2014-01-01

Candida species are the most common causes of fungal infection. Approximately 90% of infections are caused by five species: Candida albicans, Candida glabrata, Candida tropicalis, Candida parapsilosis, and Candida krusei. Three (C. albicans, C. tropicalis, and C. parapsilosis) belong to the CTG clade, in which the CTG codon is translated as serine and not leucine. C. albicans remains the most commonly isolated but is decreasing relative to the other species. The increasing incidence of C. glabrata is related to its reduced susceptibility to azole drugs. Genome analysis suggests that virulence in the CTG clade is associated with expansion of gene families, particularly of cell wall genes. Similar independent processes took place in the C. glabrata species group. Gene loss and expansion in an ancestor of C. glabrata may have resulted in preadaptations that enabled pathogenicity. PMID:25183855
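The CTG reassignment mentioned above can be illustrated with a tiny translator that uses the standard codon table and overrides a single codon. This is a sketch for illustration only: the function and variable names are invented here, and the input is a made-up three-codon sequence rather than a real gene.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    b1 + b2 + b3: STANDARD_AA[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# CTG-clade code: identical to the standard code except CTG -> serine.
CTG_CLADE_CODE = dict(STANDARD_CODE, CTG="S")


def translate(dna, code):
    """Translate a DNA string codon by codon; trailing partial codons are ignored."""
    dna = dna.upper()
    return "".join(code[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))


toy = "ATGCTGTAA"  # Met, CTG, stop (synthetic example)
print(translate(toy, STANDARD_CODE))   # ML*
print(translate(toy, CTG_CLADE_CODE))  # MS*
```

The practical consequence is that CTG-clade genes translated with the standard table will contain leucine wherever the organism actually inserts serine.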

Coping with living in the soil: the genome of the parthenogenetic springtail Folsomia candida.

PubMed

Faddeeva-Vakhrusheva, Anna; Kraaijeveld, Ken; Derks, Martijn F L; Anvar, Seyed Yahya; Agamennone, Valeria; Suring, Wouter; Kampfraath, Andries A; Ellers, Jacintha; Le Ngoc, Giang; van Gestel, Cornelis A M; Mariën, Janine; Smit, Sandra; van Straalen, Nico M; Roelofs, Dick

2017-06-28

Folsomia candida is a model organism in soil biology, belonging to the family Isotomidae, subclass Collembola. It reproduces parthenogenetically in the presence of Wolbachia and exhibits remarkable physiological adaptations to stress. To better understand these features and adaptations to life in the soil, we studied its genome in the context of its parthenogenetic lifestyle. We applied Pacific Biosciences sequencing and assembly to generate a reference genome for F. candida of 221.7 Mbp, comprising only 162 scaffolds. The complete genome of its endosymbiont Wolbachia was also assembled and turned out to be the largest strain identified so far. Substantial gene family expansions and lineage-specific gene clusters were linked to stress response. A large number of genes (809) were acquired by horizontal gene transfer. A substantial fraction of these genes are involved in lignocellulose degradation. Also, the presence of genes involved in antibiotic biosynthesis was confirmed. Intra-genomic rearrangements of collinear gene clusters were observed, of which 11 were organized as palindromes. The Hox gene cluster of F. candida showed major rearrangements compared to the arthropod consensus cluster, resulting in a disorganized cluster. The expansion of stress response gene families suggests that stress defense was important to facilitate colonization of soils. The large number of HGT genes related to lignocellulose degradation could be beneficial to unlock carbohydrate sources in soil, especially those contained in decaying plant and fungal organic matter. Intra- as well as inter-scaffold duplications of gene clusters may be a consequence of its parthenogenetic lifestyle. This high-quality genome will be instrumental for evolutionary biologists investigating deep phylogenetic lineages among arthropods and will provide the basis for a more mechanistic understanding in soil ecology and ecotoxicology.
Evolutionary genomics of yeast pathogens in the Saccharomycotina

PubMed Central

Naranjo-Ortíz, Miguel A.; Marcet-Houben, Marina

2016-01-01

Saccharomycotina comprises a diverse group of yeasts that includes numerous species of industrial or clinical relevance. Opportunistic pathogens within this clade are often assigned to the genus Candida but belong to phylogenetically distant lineages that also comprise non-pathogenic species. This indicates that the ability to infect humans has evolved independently several times among Saccharomycotina. Although the mechanisms of infection of the main groups of Candida pathogens are starting to be unveiled, we still lack sufficient understanding of the evolutionary paths that led to a virulent phenotype in each of the pathogenic lineages. Deciphering what genomic changes underlie the evolutionary emergence of a virulence trait will not only aid the discovery of novel virulence mechanisms but will also provide valuable information to understand how new pathogens emerge, and what clades may pose a future danger. Here we review recent comparative genomics efforts that have revealed possible evolutionary paths to pathogenesis in different lineages, focusing on the three main agents of candidiasis worldwide: Candida albicans, C. parapsilosis and C. glabrata. We will discuss what genomic traits may facilitate the emergence of virulence, and focus on two different genome evolution mechanisms that are able to generate drastic phenotypic changes and have been associated with the emergence of virulence: gene family expansion and interspecies hybridization. PMID:27493146
Intra-Genomic Internal Transcribed Spacer Region Sequence Heterogeneity and Molecular Diagnosis in Clinical Microbiology.

PubMed

Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Cheng, Jingwei; Xu, Yingchun; Lau, Susanna K P; Woo, Patrick C Y

2015-10-22

Internal transcribed spacer region (ITS) sequencing is the most extensively used technology for accurate molecular identification of fungal pathogens in clinical microbiology laboratories. Intra-genomic ITS sequence heterogeneity, which makes fungal identification based on direct sequencing of PCR products difficult, has rarely been reported in pathogenic fungi. During the process of performing ITS sequencing on 71 yeast strains isolated from various clinical specimens, direct sequencing of the PCR products showed ambiguous sequences in six of them. After cloning the PCR products into plasmids for sequencing, interpretable sequencing electropherograms could be obtained. For each of the six isolates, 10-49 clones were selected for sequencing and two to seven intra-genomic ITS copies were detected. The identities of these six isolates were confirmed to be Candida glabrata (n=2), Pichia (Candida) norvegensis (n=2), Candida tropicalis (n=1) and Saccharomyces cerevisiae (n=1). Multiple sequence alignment revealed that one to four intra-genomic ITS polymorphic sites were present in the six isolates, and all these polymorphic sites were located in the ITS1 and/or ITS2 regions. We report and describe the first evidence of intra-genomic ITS sequence heterogeneity in four different pathogenic yeasts, which occurred exclusively in the ITS1 and ITS2 spacer regions for the six isolates in this study.
Intra-Genomic Internal Transcribed Spacer Region Sequence Heterogeneity and Molecular Diagnosis in Clinical Microbiology

PubMed Central

Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Cheng, Jingwei; Xu, Yingchun; Lau, Susanna K. P.; Woo, Patrick C. Y.

2015-01-01

Internal transcribed spacer region (ITS) sequencing is the most extensively used technology for accurate molecular identification of fungal pathogens in clinical microbiology laboratories. Intra-genomic ITS sequence heterogeneity, which makes fungal identification based on direct sequencing of PCR products difficult, has rarely been reported in pathogenic fungi. During the process of performing ITS sequencing on 71 yeast strains isolated from various clinical specimens, direct sequencing of the PCR products showed ambiguous sequences in six of them. After cloning the PCR products into plasmids for sequencing, interpretable sequencing electropherograms could be obtained. For each of the six isolates, 10–49 clones were selected for sequencing and two to seven intra-genomic ITS copies were detected. The identities of these six isolates were confirmed to be Candida glabrata (n = 2), Pichia (Candida) norvegensis (n = 2), Candida tropicalis (n = 1) and Saccharomyces cerevisiae (n = 1). Multiple sequence alignment revealed that one to four intra-genomic ITS polymorphic sites were present in the six isolates, and all these polymorphic sites were located in the ITS1 and/or ITS2 regions. We report and describe the first evidence of intra-genomic ITS sequence heterogeneity in four different pathogenic yeasts, which occurred exclusively in the ITS1 and ITS2 spacer regions for the six isolates in this study. PMID:26506340
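The step of locating polymorphic sites across cloned ITS copies can be illustrated with a short Python sketch. It assumes the clone sequences have already been aligned to equal length by an external tool (the abstract does not name one); the function name and the toy sequences are invented for illustration and are not data from the study.

```python
def polymorphic_sites(aligned_clones):
    """Return 0-based alignment columns where the clones disagree.

    `aligned_clones` is a list of equal-length strings taken from a
    multiple sequence alignment ('-' marks a gap).
    """
    if not aligned_clones:
        return []
    length = len(aligned_clones[0])
    if any(len(seq) != length for seq in aligned_clones):
        raise ValueError("sequences must be aligned to the same length")
    # A column is polymorphic when more than one distinct symbol occurs in it.
    return [
        col for col in range(length)
        if len({seq[col] for seq in aligned_clones}) > 1
    ]


# Toy aligned clone reads, invented for illustration only.
clones = ["ACGTTGCA", "ACGTTGCA", "ACATTGCA", "ACGTTGTA"]
sites = polymorphic_sites(clones)
print(sites)
```

Mapping each reported column back to ITS1, 5.8S or ITS2 would additionally require region boundaries, which this sketch does not attempt.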
Candida auris

MedlinePlus

... auris infection spread globally? CDC conducted whole genome sequencing of C. auris specimens from countries in the ... Asia, southern Africa, and South America. Whole genome sequencing produces detailed DNA fingerprints of organisms. CDC found ...
Genomic identification of potential targets unique to Candida albicans for the discovery of antifungal agents.

PubMed

Tripathi, Himanshu; Luqman, Suaib; Meena, Abha; Khan, Feroz

2014-01-01

Despite modern antifungal therapy, mortality rates of invasive infection with the human fungal pathogen Candida albicans are up to 40%. Studies suggest that drug resistance in the three most common species of human fungal pathogens, viz. C. albicans, Aspergillus fumigatus (causing mortality rates up to 90%) and Cryptococcus neoformans (causing mortality rates up to 70%), is due to mutations in the target enzymes or high expression of drug transporter genes. Drug resistance in human fungal pathogens has led to an imperative need for the identification of new targets unique to fungal pathogens. In the present study, we have used a comparative genomics approach to find potential target proteins unique to C. albicans, an opportunistic fungus responsible for severe infection in immune-compromised humans. Interestingly, many target proteins of existing antifungal agents showed orthologs in human cells. To identify unique proteins, we compared the proteome of C. albicans [SC5314], i.e., 14,633 total proteins retrieved from the RefSeq database of NCBI, USA, with the proteomes of human and the non-pathogenic yeast Saccharomyces cerevisiae. Results showed that 4,568 proteins were unique to C. albicans compared with human; when these were then compared with the S. cerevisiae proteome, 2,161 proteins remained unique, and after removing repeats a total of 1,618 unique proteins (42 functionally known, 1,566 hypothetical and 10 unknown) were selected as potential antifungal drug targets unique to C. albicans.
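The filtering cascade in the abstract above amounts to two successive set subtractions. A minimal Python sketch follows, assuming the ortholog calls have already been made by a prior similarity search (the abstract names neither the tool nor the cutoffs). The function name and the protein identifiers are hypothetical, not real RefSeq accessions.

```python
def unique_to_pathogen(pathogen, has_human_ortholog, has_yeast_ortholog):
    """Return pathogen proteins with no ortholog in human or S. cerevisiae.

    All three arguments are collections of protein identifiers; the two
    ortholog sets are assumed to come from a prior similarity search.
    """
    not_in_human = set(pathogen) - set(has_human_ortholog)  # first filter
    return not_in_human - set(has_yeast_ortholog)           # second filter


# Toy identifiers, purely illustrative.
proteome = {"P1", "P2", "P3", "P4", "P5"}
human_hits = {"P1", "P2"}
yeast_hits = {"P2", "P3"}
candidates = unique_to_pathogen(proteome, human_hits, yeast_hits)
print(sorted(candidates))

# Consistency check on the category counts reported in the abstract.
assert 42 + 1566 + 10 == 1618
```

The final de-duplication step ("removing repeats") is not modelled here because the abstract does not say how repeats were defined.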
An improved host-vector system for Candida maltosa using a gene isolated from its genome that complements the his5 mutation of Saccharomyces cerevisiae.

PubMed

Hikiji, T; Ohkuma, M; Takagi, M; Yano, K

1989-10-01

The host-vector system of an n-alkane-assimilating-yeast, Candida maltosa, which we previously constructed using an autonomously replicating sequence (ARS) region isolated from the genome of this yeast, utilizes C. maltosa J288 (leu2-) as a host. As this host had a serious growth defect on n-alkane, we developed an improved host-vector system using C. maltosa CH1 (his-) as host. The vectors were constructed with the Candida ARS region and a DNA fragment isolated from the genome of C. maltosa. Since this DNA fragment could complement histidine auxotrophy of both C. maltosa CH1 and S. cerevisiae (his5-), we termed the gene contained in this DNA fragment C-HIS5. The vectors were characterized in terms of transformation frequency and stability, and the nucleotide sequence of C-HIS5 was determined. The deduced amino acid sequence (389 residues) shared 51% homology with that of HIS5 of S. cerevisiae (384 residues; Nishiwaki et al. 1987).
ISOLATION OF THE CANDIDA TROPICALIS GENE FOR P450 LANOSTEROL DEMETHYLASE AND ITS EXPRESSION IN SACCHAROMYCES CEREVISIAE

EPA Science Inventory

We have isolated the gene for cytochrome P450 lanosterol 14-demethylase (14DM) from the yeast Candida tropicalis. This was accomplished by screening genomic libraries of strain ATCC750 in E. coli using a DNA fragment containing the yeast Saccharomyces cerevisiae 14DM gene. Identi...
An Efficient, Rapid, and Recyclable System for CRISPR-Mediated Genome Editing in Candida albicans.

PubMed

Nguyen, Namkha; Quail, Morgan M F; Hernday, Aaron D

2017-01-01

Candida albicans is the most common fungal pathogen of humans. Historically, molecular genetic analysis of this important pathogen has been hampered by the lack of stable plasmids or meiotic cell division, limited selectable markers, and inefficient methods for generating gene knockouts. The recent development of clustered regularly interspaced short palindromic repeat(s) (CRISPR)-based tools for use with C. albicans has opened the door to more efficient genome editing; however, previously reported systems have specific limitations. We report the development of an optimized CRISPR-based genome editing system for use with C. albicans. Our system is highly efficient, does not require molecular cloning, does not leave permanent markers in the genome, and supports rapid, precise genome editing in C. albicans. We also demonstrate the utility of our system for generating two independent homozygous gene knockouts in a single transformation and present a method for generating homozygous wild-type gene addbacks at the native locus. Furthermore, each step of our protocol is compatible with high-throughput strain engineering approaches, thus opening the door to the generation of a complete C. albicans gene knockout library. IMPORTANCE Candida albicans is the major fungal pathogen of humans and is the subject of intense biomedical and discovery research. Until recently, the pace of research in this field has been hampered by the lack of efficient methods for genome editing. We report the development of a highly efficient and flexible genome editing system for use with C. albicans. This system improves upon previously published C. albicans CRISPR systems and enables rapid, precise genome editing without the use of permanent markers. This new tool kit promises to expedite the pace of research on this important fungal pathogen.
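As background to CRISPR-based editing in general, candidate target sites are usually found by scanning for a protospacer next to a PAM motif. The Python sketch below assumes the widely used Streptococcus pyogenes Cas9 convention (a 20-nt protospacer followed by an NGG PAM) and scans one strand only. The abstract does not state which Cas9 or design rules the authors used, so this is an illustration and not their protocol; the function name and the toy sequence are hypothetical.

```python
import re


def find_guides(seq, guide_len=20):
    """List (start, protospacer, pam) for every NGG PAM on the given strand.

    Assumes an SpCas9-style NGG PAM immediately 3' of the protospacer;
    the reverse strand is not scanned in this sketch.
    """
    seq = seq.upper()
    guides = []
    # A lookahead is used so that overlapping PAM matches are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - guide_len
        if start >= 0:  # skip PAMs too close to the 5' end
            guides.append((start, seq[start:pam_start], match.group(1)))
    return guides


# Toy sequence, invented for illustration only.
target = "ATGCATGCATGCATGCATGCAGGTT"
guides = find_guides(target)
for start, spacer, pam in guides:
    print(start, spacer, pam)
```

A practical design tool would also scan the reverse complement and screen candidates for off-target matches across both alleles of the diploid genome.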
In Vitro Analysis of Finasteride Activity against Candida albicans Urinary Biofilm Formation and Filamentation

PubMed Central

Chavez-Dozal, Alba A.; Lown, Livia; Jahng, Maximillian; Walraven, Carla J.

2014-01-01

Candida albicans is the 3rd most common cause of catheter-associated urinary tract infections, with a strong propensity to form drug-resistant catheter-related biofilms. Due to the limited efficacy of available antifungals against biofilms, drug repurposing has been investigated in order to identify novel agents with activities against fungal biofilms. Finasteride is a 5-α-reductase inhibitor commonly used for the treatment of benign prostatic hyperplasia, with activity against human type II and III isoenzymes. We analyzed the Candida Genome Database and identified a C. albicans homolog of type III 5-α-reductase, Dfg10p, which shares 27% sequence identity and 41% similarity to the human type III 5-α-reductase. Thus, we investigated finasteride for activity against C. albicans urinary biofilms, alone and in combination with amphotericin B or fluconazole. Finasteride alone was highly effective in the prevention of C. albicans biofilm formation at doses of ≥16 mg/liter and the treatment of preformed biofilms at doses of ≥128 mg/liter. In biofilm checkerboard analyses, finasteride exhibited synergistic activity in the prevention of biofilm formation in a combination of 4 mg/liter finasteride with 2 mg/liter fluconazole. Finasteride inhibited filamentation, thus suggesting a potential mechanism of action. These results indicate that finasteride alone is highly active in the prevention of C. albicans urinary biofilms in vitro and has synergistic activity in combination with fluconazole. Further investigation of the clinical utility of finasteride in the prevention of urinary candidiasis is warranted. PMID:25049253
Chemical signaling and insect attraction is a conserved trait in yeasts.

PubMed

Becher, Paul G; Hagman, Arne; Verschut, Vasiliki; Chakraborty, Amrita; Rozpędowska, Elżbieta; Lebreton, Sébastien; Bengtsson, Marie; Flick, Gerhard; Witzgall, Peter; Piškur, Jure

2018-03-01

Yeast volatiles attract insects, which apparently is of mutual benefit for both yeasts and insects. However, it is unknown whether biosynthesis of metabolites that attract insects is a basic and general trait, or if it is specific for yeasts that live in close association with insects. Our goal was to study chemical insect attractants produced by yeasts that span more than 250 million years of evolutionary history and vastly differ in their metabolism and lifestyle. We bioassayed attraction of the vinegar fly Drosophila melanogaster to odors of phylogenetically and ecologically distinct yeasts grown under controlled conditions. Baker's yeast Saccharomyces cerevisiae, the insect-associated species Candida californica, Pichia kluyveri and Metschnikowia andauensis, wine yeast Dekkera bruxellensis, milk yeast Kluyveromyces lactis, the vertebrate pathogens Candida albicans and Candida glabrata, and oleophilic Yarrowia lipolytica were screened for fly attraction in a wind tunnel. Yeast headspace was chemically analyzed, and co-occurrence of insect attractants in yeasts and flowering plants was investigated through a database search. In yeasts with known genomes, we investigated the occurrence of genes involved in the synthesis of key aroma compounds. Flies were attracted to all nine yeasts studied. The behavioral response to baker's yeast was independent of its growth stage. In addition to Drosophila, we tested the basal hexapod Folsomia candida (Collembola) in a Y-tube assay to the most ancient yeast, Y. lipolytica, which proved that early yeast signals also function on clades older than neopteran insects. Behavioral and chemical data and a search for selected genes of volatile metabolites underline that biosynthesis of chemical signals is found throughout the yeast clade and has been conserved during the evolution of yeast lifestyles. Literature and database reviews corroborate that yeast signals mediate mutualistic interactions between insects and yeasts. Moreover, volatiles emitted by yeasts are commonly found also in flowers and attract many insect species. The collective evidence suggests that the release of volatile signals by yeasts is a widespread and phylogenetically ancient trait, and that insect-yeast communication evolved prior to the emergence of flowering plants. Co-occurrence of the same attractant signals in yeast and flowers suggests that yeast-insect communication may have contributed to the evolution of insect-mediated pollination in flowers.
Development of a CRISPR-Cas9 System for Efficient Genome Editing of Candida lusitaniae.

PubMed

Norton, Emily L; Sherwood, Racquel K; Bennett, Richard J

2017-01-01

Candida lusitaniae is a member of the Candida clade that includes a diverse group of fungal species relevant to both human health and biotechnology. This species exhibits a full sexual cycle to undergo interconversion between haploid and diploid forms. C. lusitaniae is also an emerging opportunistic pathogen that can cause serious bloodstream infections in the clinic and yet has often proven to be refractory to facile genetic manipulations. In this work, we develop a clustered regularly interspaced short palindromic repeat (CRISPR) and CRISPR-associated gene 9 (Cas9) system to enable genome editing of C. lusitaniae. We demonstrate that expression of CRISPR-Cas9 components under species-specific promoters is necessary for efficient gene targeting and can be successfully applied to multiple genes in both haploid and diploid isolates. Gene deletion efficiencies with CRISPR-Cas9 were further enhanced in C. lusitaniae strains lacking the established nonhomologous end joining (NHEJ) factors Ku70 and DNA ligase 4. These results indicate that NHEJ plays an important role in directing the repair of DNA double-strand breaks (DSBs) in C. lusitaniae and that removal of this pathway increases integration of gene deletion templates by homologous recombination. The described approaches significantly enhance the ability to perform genetic studies in, and promote understanding of, this emerging human pathogen and model sexual species. IMPORTANCE The ability to perform efficient genome editing is a key development for detailed mechanistic studies of a species. Candida lusitaniae is an important member of the Candida clade and is relevant both as an emerging human pathogen and as a model for understanding mechanisms of sexual reproduction. We highlight the development of a CRISPR-Cas9 system for efficient genome manipulation in C. lusitaniae and demonstrate the importance of species-specific promoters for expression of CRISPR components. We also demonstrate that the NHEJ pathway contributes to non-template-mediated repair of DNA DSBs and that removal of this pathway enhances efficiencies of gene targeting by CRISPR-Cas9. These results therefore establish important genetic tools for further exploration of C. lusitaniae biology.
A surprisingly large RNase P RNA in Candida glabrata

PubMed Central

Kachouri, Rym; Stribinskis, Vilius; Zhu, Yanglong; Ramos, Kenneth S.; Westhof, Eric; Li, Yong

2005-01-01

We have found an extremely large ribonuclease P (RNase P) RNA (RPR1) in the human pathogen Candida glabrata and verified that this molecule is expressed and present in the active enzyme complex of this hemiascomycete yeast. A structural alignment of the C. glabrata sequence with 36 other hemiascomycete RNase P RNAs (abbreviated as P RNAs) allows us to characterize the types of insertions. In addition, 15 P RNA sequences were newly characterized by searching in the recently sequenced genomes Candida albicans, C. glabrata, Debaryomyces hansenii, Eremothecium gossypii, Kluyveromyces lactis, Kluyveromyces waltii, Naumovia castellii, Saccharomyces kudriavzevii, Saccharomyces mikatae, and Yarrowia lipolytica; and by PCR amplification for other Candida species (Candida guilliermondii, Candida krusei, Candida parapsilosis, Candida stellatoidea, and Candida tropicalis). The phylogenetic comparative analysis identifies a hemiascomycete secondary structure consensus that presents a conserved core in all species with variable insertions or deletions. The most significant variability is found in C. glabrata P RNA in which three insertions exceeding in total 700 nt are present in the Specificity domain. This P RNA is more than twice the length of any other homologous P RNAs known in the three domains of life and is eight times the size of the smallest. RNase P RNA, therefore, represents one of the most diversified noncoding RNAs in terms of size variation and structural diversity. PMID:15987816
Urinary tract infections and Candida albicans.

PubMed

Behzadi, Payam; Behzadi, Elham; Ranjbar, Reza

2015-01-01

Urinary tract candidiasis is known as the most frequent nosocomial fungal infection worldwide. Candida albicans is the most common cause of nosocomial fungal urinary tract infections; however, a rapid change in the distribution of Candida species is under way. Simultaneously, the increase in urinary tract candidiasis has led to the appearance of antifungal-resistant Candida species. In this review, we take an in-depth look at Candida albicans uropathogenesis and the distribution of the three most frequent Candida species contributing to urinary tract candidiasis in different countries around the world. For this review, Google Scholar, a scholarly search engine (http://scholar.google.com/), and the PubMed database (http://www.ncbi.nlm.nih.gov/pubmed/) were used. The most recently published original articles and literature reviews relating to the three leading Candida species causing urinary tract infections in different countries and to the pathogenicity of Candida albicans were selected and studied. Although some studies show rapid changes in the uropathogenesis of Candida species causing urinary tract infections in some countries, Candida albicans is still the most important cause of candidal urinary tract infections. Despite the ranking of Candida albicans as the dominant species for urinary tract candidiasis, specific changes have occurred in some countries. It is therefore important to continue surveillance of Candida species causing urinary tract infections in order to prevent, control and treat urinary tract candidiasis in the future.
Candida nivariensis as a New Emergent Agent of Vulvovaginal Candidiasis: Description of Cases and Review of Published Studies.

PubMed

Aznar-Marin, Pilar; Galan-Sanchez, Fátima; Marin-Casanova, Pilar; García-Martos, Pedro; Rodríguez-Iglesias, Manuel

2016-06-01

Candida nivariensis is a newly emergent agent of human infections in the vaginal tract and other sites, but its phenotypic characteristics are very similar to those of Candida glabrata, so it can be misidentified and underdiagnosed. We describe four cases of vulvovaginitis identified by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry and confirmed by PCR amplification and sequencing of the entire ITS genomic region (ITS1, ITS2 and 5.8S rRNA). We reinforce the need for new diagnostic tools for the correct identification of yeast infections.
Genome sequence and physiological analysis of Yamadazyma laniorum f.a. sp. nov. and a reevaluation of the apocryphal xylose fermentation of its sister species, Candida tenuis

USDA-ARS's Scientific Manuscript database

Xylose fermentation is a rare trait that is immensely important to the cellulosic biofuel industry, and Candida tenuis is one of the few yeasts that has been reported with this trait. Here we report the isolation of two strains representing a candidate sister species to C. tenuis. Integrated analysi...
A stable hybrid containing haploid genomes of two obligate diploid Candida species.

PubMed

Chakraborty, Uttara; Mohamed, Aiyaz; Kakade, Pallavi; Mugasimangalam, Raja C; Sadhale, Parag P; Sanyal, Kaustuv

2013-08-01

Candida albicans and Candida dubliniensis are diploid, predominantly asexual human-pathogenic yeasts. In this study, we constructed tetraploid (4n) strains of C. albicans of the same or different lineages by spheroplast fusion. Induction of chromosome loss in the tetraploid C. albicans generated diploid or near-diploid progeny strains but did not produce any haploid progeny. We also constructed stable heterotetraploid somatic hybrid strains (2n + 2n) of C. albicans and C. dubliniensis by spheroplast fusion. Heterodiploid (n + n) progeny hybrids were obtained after inducing chromosome loss in a stable heterotetraploid hybrid. To identify a subset of hybrid heterodiploid progeny strains carrying at least one copy of all chromosomes of both species, unique centromere sequences of various chromosomes of each species were used as markers in PCR analysis. The reduction of chromosome content was confirmed by a comparative genome hybridization (CGH) assay. The hybrid strains were found to be stably propagated. Chromatin immunoprecipitation (ChIP) assays with antibodies against centromere-specific histones (C. albicans Cse4/C. dubliniensis Cse4) revealed that the centromere identity of chromosomes of each species is maintained in the hybrid genomes of the heterotetraploid and heterodiploid strains. Thus, our results suggest that the diploid genome content is not obligatory for the survival of either C. albicans or C. dubliniensis. In keeping with the recent discovery of the existence of haploid C. albicans strains, the heterodiploid strains of our study can be excellent tools for further species-specific genome elimination, yielding true haploid progeny of C. albicans or C. dubliniensis in future.
Candida Pneumonia in Intensive Care Unit?

PubMed Central

Schnabel, Ronny M.; Linssen, Catharina F.; Guion, Nele; van Mook, Walther N.; Bergmans, Dennis C.

2014-01-01

It has been questioned if Candida pneumonia exists as a clinical entity. Only histopathology can establish the definite diagnosis. Less invasive diagnostic strategies lack specificity and have been insufficiently validated. Scarcity of this pathomechanism and nonspecific clinical presentation make validation and the development of a clinical algorithm difficult. In the present study, we analyze whether Candida pneumonia exists in our critical care population. We used a bronchoalveolar lavage (BAL) specimen database that we have built in a structural diagnostic approach to ventilator-associated pneumonia for more than a decade consisting of 832 samples. Microbiological data were linked to clinical information and available autopsy data. We searched for critically ill patients with respiratory failure with no other microbiological or clinical explanation than exclusive presence of Candida species in BAL fluid. Five cases could be identified with Candida as the likely cause of pneumonia. PMID:25734099
Next-generation sequencing offers new insights into the resistance of Candida spp. to echinocandins and azoles.

PubMed

Garnaud, Cécile; Botterel, Françoise; Sertour, Natacha; Bougnoux, Marie-Elisabeth; Dannaoui, Eric; Larrat, Sylvie; Hennequin, Christophe; Guinea, Jesus; Cornet, Muriel; Maubon, Danièle

2015-09-01

MDR Candida strains are emerging. Next-generation sequencing (NGS), which enables extensive and deep genome analysis, was used to investigate echinocandin and azole resistance in clinical Candida isolates. Six genes commonly involved in antifungal resistance (ERG11, ERG3, TAC1, CgPDR1, FKS1 and FKS2) were analysed using NGS in 40 Candida isolates (18 Candida albicans, 15 Candida glabrata and 7 Candida parapsilosis). The strategy was validated using strains with known sequences. Then, 8 clinical strains displaying antifungal resistance and 23 sequential isolates collected from 10 patients receiving antifungal therapy were analysed. A total of 391 SNPs were detected, among which 6 coding SNPs were reported for the first time. Novel genetic alterations were detected in both azole and echinocandin resistance genes. A C. glabrata strain, which was resistant to echinocandins but highly susceptible to azoles, harboured an FKS2 S663P mutation plus a novel presumed loss-of-function CgPDR1 mutation. This isolate was from a patient with deep-seated and urinary candidiasis. Another C. glabrata isolate, with an MDR phenotype, carried a new FKS2 S663A mutation and a new putative gain-of-function CgPDR1 mutation (T370I); this isolate showed mutated (80%) and WT (20%) populations and was collected after 75 days of exposure to caspofungin from a patient who underwent complicated abdominal surgery. This study shows that NGS can be used for extensive assessment of genetic mutations involved in antifungal resistance. This type of wide genome approach will become very valuable for detecting mechanisms of resistance in clinical strains subjected to multidrug pressure.
Characterization of a new clinical yeast species, Candida tunisiensis sp. nov., isolated from a strain collection from Tunisian hospitals.

PubMed

Eddouzi, Jamel; Hofstetter, Valérie; Groenewald, Marizeth; Manai, Mohamed; Sanglard, Dominique

2013-01-01

From a collection of yeast isolates isolated from patients in Tunisian hospitals between September 2006 and July 2010, the yeast strain JEY63 (CBS 12513), isolated from a 50-year-old male that suffered from oral thrush, could not be identified to the species level using conventional methods used in clinical laboratories. These methods include matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS), germ tube formation, and the use of CHROMagar Candida and metabolic galleries. Sequence analysis of the nuclear rRNA (18S rRNA, 5.8S rRNA, and 26S rRNA) and internal transcribed spacer regions (ITS1 and ITS2) indicated that the ribosomal DNA sequences of this species were not yet reported. Multiple gene phylogenic analyses suggested that this isolate clustered at the base of the Dipodascaceae (Saccharomycetales, Saccharomycetes, and Ascomycota). JEY63 was named Candida tunisiensis sp. nov. according to several phenotypic criteria and its geographical origin. C. tunisiensis was able to grow at 42°C and does not form chlamydospores and hyphae but could grow as yeast and pseudohyphal forms. C. tunisiensis exhibited most probably a haploid genome with an estimated size of 10 Mb on at least three chromosomes. Using European Committee for Antimicrobial Susceptibility Testing (EUCAST) and Clinical and Laboratory Standards Institute (CLSI) Candida albicans susceptibility breakpoints as a reference, C. tunisiensis was resistant to fluconazole (MIC = 8 μg/ml), voriconazole (MIC = 0.5 μg/ml), itraconazole (MIC = 16 μg/ml), and amphotericin B (MIC = 4 μg/ml) but still susceptible to posaconazole (MIC = 0.008 μg/ml) and caspofungin (MIC = 0.5 μg/ml). In conclusion, MALDI-TOF MS permitted the early selection of an unusual isolate, which was still unreported in molecular databases but could not be unambiguously classified based on phylogenetic approaches.

Characterization of a New Clinical Yeast Species, Candida tunisiensis sp. nov., Isolated from a Strain Collection from Tunisian Hospitals

PubMed Central

Eddouzi, Jamel; Hofstetter, Valérie; Groenewald, Marizeth; Manai, Mohamed

2013-01-01

From a collection of yeast isolates isolated from patients in Tunisian hospitals between September 2006 and July 2010, the yeast strain JEY63 (CBS 12513), isolated from a 50-year-old male that suffered from oral thrush, could not be identified to the species level using conventional methods used in clinical laboratories. These methods include matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS), germ tube formation, and the use of CHROMagar Candida and metabolic galleries. Sequence analysis of the nuclear rRNA (18S rRNA, 5.8S rRNA, and 26S rRNA) and internal transcribed spacer regions (ITS1 and ITS2) indicated that the ribosomal DNA sequences of this species were not yet reported. Multiple gene phylogenic analyses suggested that this isolate clustered at the base of the Dipodascaceae (Saccharomycetales, Saccharomycetes, and Ascomycota). JEY63 was named Candida tunisiensis sp. nov. according to several phenotypic criteria and its geographical origin. C. tunisiensis was able to grow at 42°C and does not form chlamydospores and hyphae but could grow as yeast and pseudohyphal forms. C. tunisiensis exhibited most probably a haploid genome with an estimated size of 10 Mb on at least three chromosomes. Using European Committee for Antimicrobial Susceptibility Testing (EUCAST) and Clinical and Laboratory Standards Institute (CLSI) Candida albicans susceptibility breakpoints as a reference, C. tunisiensis was resistant to fluconazole (MIC = 8 μg/ml), voriconazole (MIC = 0.5 μg/ml), itraconazole (MIC = 16 μg/ml), and amphotericin B (MIC = 4 μg/ml) but still susceptible to posaconazole (MIC = 0.008 μg/ml) and caspofungin (MIC = 0.5 μg/ml). In conclusion, MALDI-TOF MS permitted the early selection of an unusual isolate, which was still unreported in molecular databases but could not be unambiguously classified based on phylogenetic approaches. PMID:23077122
Rapid Hypothesis Testing with Candida albicans through Gene Disruption with Short Homology Regions

PubMed Central

Wilson, R. Bryce; Davis, Dana; Mitchell, Aaron P.

1999-01-01

Disruption of newly identified genes in the pathogen Candida albicans is a vital step in determination of gene function. Several gene disruption methods described previously employ long regions of homology flanking a selectable marker. Here, we describe disruption of C. albicans genes with PCR products that have 50 to 60 bp of homology to a genomic sequence on each end of a selectable marker. We used the method to disrupt two known genes, ARG5 and ADE2, and two sequences newly identified through the Candida genome project, HRM101 and ENX3. HRM101 and ENX3 are homologous to genes in the conserved RIM101 (previously called RIM1) and PacC pathways of Saccharomyces cerevisiae and Aspergillus nidulans. We show that three independent hrm101/hrm101 mutants and two independent enx3/enx3 mutants are defective in filamentation on Spider medium. These observations argue that HRM101 and ENX3 sequences are indeed portions of genes and that the respective gene products have related functions. PMID:10074081
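The short-homology approach described above relies on long PCR primers whose 5' ends match the genome on either side of the target and whose 3' ends anneal to the selectable-marker template. A minimal Python sketch of assembling such primers follows. The function names, coordinates, marker-annealing segments and toy genome are all hypothetical and do not come from the paper; the toy call uses a 5-nt arm only to keep the example readable, whereas the abstract reports 50 to 60 bp.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def disruption_primers(genome, del_start, del_end, marker_fwd, marker_rev, arm=50):
    """Build long primers for a short-homology disruption cassette.

    genome: top-strand sequence around the target
    del_start, del_end: 0-based, end-exclusive bounds of the region to replace
    marker_fwd, marker_rev: 3' segments that anneal to the marker template
        (marker_rev is already written 5'->3' on the bottom strand)
    arm: homology length in nucleotides
    """
    if del_start - arm < 0 or del_end + arm > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    upstream = genome[del_start - arm:del_start].upper()
    downstream = genome[del_end:del_end + arm]
    forward = upstream + marker_fwd
    reverse = reverse_complement(downstream) + marker_rev
    return forward, reverse


# Toy inputs, invented for illustration only.
toy_genome = "TTTTTACGTA" + "GGGGGGGG" + "CATGCAAAAA"
fwd, rev = disruption_primers(toy_genome, 10, 18, "CCCGGG", "TTTAAA", arm=5)
print(fwd)
print(rev)
```

With the default `arm=50`, the same call produces primers in the lower part of the homology range the abstract reports, provided enough flanking sequence is supplied.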
Invasive candidiasis and oral manifestations in premature newborns.

PubMed

Tinoco-Araujo, José Endrigo; Araújo, Diana Ferreira Gadelha; Barbosa, Patrícia Gomes; Santos, Paulo Sérgio da Silva; Medeiros, Ana Myriam Costa de

2013-01-01

The objective was to investigate the prevalence of invasive candidiasis in a Neonatal Intensive Care Unit and to evaluate oral diseases and Candida spp. colonization in low birth weight preterm newborns. A descriptive epidemiological study was performed in two stages. First, prevalence of candidiasis was analyzed in a database of 295 preterm patients admitted to hospital for over 10 days with birth weight less than 2,000 g. In the second stage, oral changes and Candida spp. colonization were assessed in 65 patients weighing less than 2,000 g, up to 4 weeks old, hospitalized for over 10 days and presenting oral abnormalities compatible with fungal lesions. Swab samples were collected from the mouth to identify fungi. Prevalence of candidiasis was 5.4% in the database analyzed. It correlated with prolonged hospital stay (p<0.001), on average 31 days, with an 85% risk of developing infection in the first 25 days. It also correlated with low birth weight (p<0.001), with a mean of 1,140 g. The most frequent alterations were soft, detachable white plaques on the oral mucosa and tongue. Intense oral colonization by Candida spp. was observed (80%). The frequency of invasive candidiasis was low and correlated with low birth weight and prolonged hospital stay. The most common oral changes were white plaques compatible with pseudomembranous candidiasis, and colonization by Candida spp. was above average.
Molecular epidemiology, phylogeny and evolution of Candida albicans.

PubMed

McManus, Brenda A; Coleman, David C

2014-01-01

A small number of Candida species form part of the normal microbial flora of mucosal surfaces in humans and may give rise to opportunistic infections when host defences are impaired. Candida albicans is by far the most prevalent commensal and pathogenic Candida species. Several different molecular typing approaches including multilocus sequence typing, multilocus microsatellite typing and DNA fingerprinting using C. albicans-specific repetitive sequence-containing DNA probes have yielded a wealth of information regarding the epidemiology and population structure of this species. Such studies revealed that the C. albicans population structure consists of multiple major and minor clades, some of which exhibit geographical or phenotypic enrichment and that C. albicans reproduction is predominantly clonal. Despite this, losses of heterozygosity by recombination, the existence of a parasexual cycle, toleration of a wide range of aneuploidies and the recent description of viable haploid strains have all demonstrated the extensive plasticity of the C. albicans genome. Recombination and gross chromosomal rearrangements are more common under stressful environmental conditions, and have played a significant role in the evolution of this opportunistic pathogen. Surprisingly, Candida dubliniensis, the closest relative of C. albicans exhibits more karyotype variability than C. albicans, but is significantly less adaptable to unfavourable environments. This disparity most likely reflects the evolutionary processes that occurred during or soon after the divergence of both species from their common ancestor. Whilst C. dubliniensis underwent significant gene loss and pseudogenisation, C. albicans expanded gene families considered to be important in virulence. 
It is likely that technological developments in whole genome sequencing and data analysis in coming years will facilitate its routine use for population structure, epidemiological investigations, and phylogenetic analyses of Candida species. These are likely to reveal more minor C. albicans clades and to enhance our understanding of the population biology of this versatile organism. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936.

PubMed

Durrens, Pascal; Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J; Noël, Thierry

2017-08-03

Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. Copyright © 2017 Durrens et al.
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936

PubMed Central

Durrens, Pascal; Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J.; Noël, Thierry

2017-01-01

ABSTRACT Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. PMID:28774979
Isolation of Candida Species from Gastroesophageal Lesions among Pediatrics in Isfahan, Iran: Identification and Antifungal Susceptibility Testing of Clinical Isolates by E-test

PubMed Central

Salehi, Fatemeh; Esmaeili, Mehran; Mohammadi, Rasoul

2017-01-01

Background: Candida species can become opportunistic pathogens causing local or systemic invasive infections. Gastroesophageal candidiasis may depend on Candida colonization and local damage of the mucosal barrier. Risk factors are gastric acid suppression, diabetes mellitus, chronic debilitating states such as carcinomas, and the use of systemic antibiotics and corticosteroids. The aim of this study is the collection and molecular identification of Candida species from gastroesophageal lesions among pediatric patients in Isfahan, and determination of minimum inhibitory concentration (MIC) ranges for clinical isolates. Materials and Methods: A total of 200 patients who underwent endoscopy (130 specimens from gastritis and 70 samples from esophagitis) were included in this study between April 2015 and November 2015. All specimens were subcultured on Sabouraud dextrose agar, and genomic DNA of all strains was extracted using the boiling method. Polymerase chain reaction and DNA sequencing of the ITS1-5.8S rDNA-ITS2 region were used for the identification of all Candida strains. MIC ranges were determined for itraconazole (ITC), amphotericin B (AmB), and fluconazole (FLU) by E-test. Results: Twenty of 200 suspected patients (10%) were positive by direct microscopy and culture. Candida albicans was the most common species (60%) followed by Candida glabrata (30%), Candida parapsilosis (5%), and Candida kefyr (5%). MIC ranges were determined for FLU (0.125–8 μg/mL), ITC (0.008–0.75 μg/mL), and AmB (0.008–0.75 μg/mL), respectively. Conclusion: Every colonization by Candida species should be considered a potential factor in mucocutaneous candidiasis and should be treated with antifungal drugs. PMID:28904931
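The percentages in the Results of the abstract above can be turned back into isolate counts with simple arithmetic. The sketch below assumes the species percentages refer to the 20 culture-positive patients; the helper name and dictionary keys are illustrative only.

```python
def counts_from_percentages(total, percentages):
    """Convert reported percentages of `total` into whole counts.

    Raises ValueError if a percentage does not correspond to a whole number,
    which would suggest rounding in the source or a different denominator.
    """
    counts = {}
    for name, pct in percentages.items():
        exact = total * pct / 100
        nearest = round(exact)
        if abs(exact - nearest) > 1e-9:
            raise ValueError(f"{pct}% of {total} is not a whole count")
        counts[name] = nearest
    return counts


# 10% of the 200 patients were positive by microscopy and culture.
positives = counts_from_percentages(200, {"positive": 10})["positive"]

# Species shares among the positive patients (assumed denominator: 20).
species_counts = counts_from_percentages(
    positives,
    {"C. albicans": 60, "C. glabrata": 30, "C. parapsilosis": 5, "C. kefyr": 5},
)
```

Under that assumption the shares correspond to 12, 6, 1, and 1 isolates, which sum to the 20 positive patients.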
Genome sequence and physiological analysis of Yamadazyma laniorum f.a. sp. nov. and a reevaluation of the apocryphal xylose fermentation of its sister species, Candida tenuis.

PubMed

Haase, Max A B; Kominek, Jacek; Langdon, Quinn K; Kurtzman, Cletus P; Hittinger, Chris Todd

2017-05-01

Xylose fermentation is a rare trait that is immensely important to the cellulosic biofuel industry, and Candida tenuis is one of the few yeasts that has been reported with this trait. Here we report the isolation of two strains representing a candidate sister species to C. tenuis. Integrated analysis of genome sequence and physiology suggested the genetic basis of a number of traits, including variation between the novel species and C. tenuis in lactose metabolism due to the loss of genes encoding lactose permease and β-galactosidase in the former. Surprisingly, physiological characterization revealed that neither the type strain of C. tenuis nor this novel species fermented xylose in traditional assays. We reexamined three xylose-fermenting strains previously identified as C. tenuis and found that these strains belong to the genus Scheffersomyces and are not C. tenuis. We propose Yamadazyma laniorum f.a. sp. nov. to accommodate our new strains and designate its type strain as yHMH7 (=CBS 14780 = NRRL Y-63967T). Furthermore, we propose the transfer of Candida tenuis to the genus Yamadazyma as Yamadazyma tenuis comb. nov. This approach provides a roadmap for how integrated genome sequence and physiological analysis can yield insight into the mechanisms that generate yeast biodiversity. Published by Oxford University Press on behalf of FEMS 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Molecular identification of Candida species isolated from gastro-oesophageal candidiasis in Tehran, Iran

PubMed Central

Mohammadi, Rasoul; Abdi, Saeed

2015-01-01

Aim: The aim of this investigation is the identification of Candida strains isolated from patients with gastro-oesophageal candidiasis in Tehran, Iran. Background: Gastro-oesophageal candidiasis is a rare infection and appears mainly in debilitated or immunocompromised patients. Colonization by Candida spp. may occur in this region and the organism can remain for several months or years in the absence of inflammation. The main infection symptom is the presence of white plaques on the gastro-oesophageal surface. C. albicans remains the most prevalent Candida spp. identified in gastrointestinal candidiasis. Given the differences in susceptibility to antifungal drugs among Candida spp., identification of isolates to the species level is important for quick and appropriate therapy. Patients and methods: A total of 398 patients who underwent gastrointestinal endoscopy from February 2012 to October 2014 were included in the present study. Histological sections from all endoscopic gastric and oesophageal biopsies were prepared, stained with Periodic acid–Schiff (PAS), and examined for the presence of fungal elements. Part of the biopsy sample was sub-cultured on Sabouraud glucose agar. The genomic DNA of each strain was extracted using FTA® Elute MicroCards. Molecular identification of Candida isolates was performed by PCR-RFLP technique with the restriction enzyme HpaII. Results: Twenty-one out of 398 cases (5.2%) were found to have gastro-oesophageal candidiasis. Candida albicans was the main strain isolated from clinical samples (90.5%), followed by C. glabrata (4.7%), and C. parapsilosis (4.7%). Conclusion: Due to the varying antifungal susceptibility of Candida spp., careful species designation for clinical isolates of Candida is recommended, using a rapid and meticulous method such as PCR-RFLP. PMID:26468349
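The PCR-RFLP step named in the abstract above relies on species-specific fragment patterns after HpaII digestion. HpaII recognizes CCGG and cuts between the two cytosines, so the expected fragment sizes of an amplicon can be predicted in silico. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, the amplicon is treated as a linear top-strand sequence, and the example sequence is a toy string rather than a real Candida amplicon.

```python
import re

HPAII_SITE = "CCGG"  # HpaII recognition sequence
CUT_OFFSET = 1       # the enzyme cuts after the first C (C^CGG)


def hpaii_fragments(amplicon):
    """Return fragment lengths (in order, 5'->3') after complete HpaII digestion."""
    seq = amplicon.upper()
    # CCGG cannot overlap itself, so a plain non-overlapping search finds every site.
    cuts = [m.start() + CUT_OFFSET for m in re.finditer(HPAII_SITE, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy amplicon with two sites (not a real sequence).
toy = "ACCGGTTCCGGA"
fragment_sizes = hpaii_fragments(toy)
```

Comparing the predicted sizes for candidate species against the bands seen on a gel is the idea behind the identification; an amplicon without a CCGG site stays as a single fragment.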
Invasive candidiasis and oral manifestations in premature newborns

PubMed Central

Tinoco-Araujo, José Endrigo; Araújo, Diana Ferreira Gadelha; Barbosa, Patrícia Gomes; Santos, Paulo Sérgio da Silva; de Medeiros, Ana Myriam Costa

2013-01-01

ABSTRACT Objective: To investigate prevalence of invasive candidiasis in a Neonatal Intensive Care Unit and to evaluate oral diseases and Candida spp. colonization in low birth weight preterm newborns. Methods: A descriptive epidemiological study performed in two stages. First, prevalence of candidiasis was analyzed in a database of 295 preterm patients admitted to hospital for over 10 days with birth weight less than 2,000 g. In the second stage, oral changes and Candida spp. colonization were assessed in 65 patients weighing less than 2,000 g, up to 4 weeks old, hospitalized for over 10 days and presenting oral abnormalities compatible with fungal lesions. Swab samples were collected from the mouth to identify fungi. Results: Prevalence of candidiasis was 5.4% in the database analyzed. It correlated with prolonged hospital length of stay (p<0.001), on average 31 days, with an 85% risk of developing infection in the first 25 days. It correlated with low birth weight (p<0.001), with a mean of 1,140 g. The most frequent alterations were white, soft, detachable plaques in the oral mucosa and tongue. Intense oral colonization by Candida spp. was observed (80%). Conclusions: The frequency of invasive candidiasis was low and correlated with low birth weight and prolonged hospital stay. The most common oral changes were white plaques compatible with pseudomembranous candidiasis, and colonization by Candida spp. was above average. PMID:23579747
A multiplex nested PCR for the detection and identification of Candida species in blood samples of critically ill paediatric patients

PubMed Central

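Taira, Cleison Ledesma; Okay, Thelma Suely; Delgado, Artur Figueiredo; Ceccon, Maria Esther Jurfest Rivero; de Almeida, Margarete Teresa Gottardo; Del Negro, Gilda Maria Barbaro

2014-01-01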

Background: Nosocomial candidaemia is associated with high mortality rates in critically ill paediatric patients; thus, the early detection and identification of the infectious agent is crucial for successful medical intervention. The PCR-based techniques have significantly increased the detection of Candida species in bloodstream infections. In this study, a multiplex nested PCR approach was developed for candidaemia detection in neonatal and paediatric intensive care patients. Methods: DNA samples from the blood of 54 neonates and children hospitalised in intensive care units with suspected candidaemia were evaluated by multiplex nested PCR with specific primers designed to identify seven Candida species, and the results were compared with those obtained from blood cultures. Results: The multiplex nested PCR had a detection limit of four Candida genomes/mL of blood for all Candida species. Blood cultures were positive in 14.8% of patients, whereas the multiplex nested PCR was positive in 24.0% of patients, including all culture-positive patients. The results obtained with the molecular technique were available within 24 hours, and the assay was able to identify Candida species with 100% concordance with blood cultures. Additionally, the multiplex nested PCR detected dual candidaemia in three patients. Conclusions: Our proposed PCR method may represent an effective tool for the detection and identification of Candida species in the context of candidaemia diagnosis in children, showing highly sensitive detection and the ability to identify the major species involved in this infection. PMID:25047415
A multiplex nested PCR for the detection and identification of Candida species in blood samples of critically ill paediatric patients.

PubMed

Taira, Cleison Ledesma; Okay, Thelma Suely; Delgado, Artur Figueiredo; Ceccon, Maria Esther Jurfest Rivero; de Almeida, Margarete Teresa Gottardo; Del Negro, Gilda Maria Barbaro

2014-07-21

Nosocomial candidaemia is associated with high mortality rates in critically ill paediatric patients; thus, the early detection and identification of the infectious agent is crucial for successful medical intervention. The PCR-based techniques have significantly increased the detection of Candida species in bloodstream infections. In this study, a multiplex nested PCR approach was developed for candidaemia detection in neonatal and paediatric intensive care patients. DNA samples from the blood of 54 neonates and children hospitalised in intensive care units with suspected candidaemia were evaluated by multiplex nested PCR with specific primers designed to identify seven Candida species, and the results were compared with those obtained from blood cultures. The multiplex nested PCR had a detection limit of four Candida genomes/mL of blood for all Candida species. Blood cultures were positive in 14.8% of patients, whereas the multiplex nested PCR was positive in 24.0% of patients, including all culture-positive patients. The results obtained with the molecular technique were available within 24 hours, and the assay was able to identify Candida species with 100% concordance with blood cultures. Additionally, the multiplex nested PCR detected dual candidaemia in three patients. Our proposed PCR method may represent an effective tool for the detection and identification of Candida species in the context of candidaemia diagnosis in children, showing highly sensitive detection and the ability to identify the major species involved in this infection.
Evidence for suppression of immunity as a driver for genomic introgressions and host range expansion in races of Albugo candida, a generalist parasite

PubMed Central

McMullan, Mark; Gardiner, Anastasia; Bailey, Kate; Kemen, Eric; Ward, Ben J; Cevik, Volkan; Robert-Seilaniantz, Alexandre; Schultz-Larsen, Torsten; Balmuth, Alexi; Holub, Eric; van Oosterhout, Cock; Jones, Jonathan DG

2015-01-01

How generalist parasites with wide host ranges can evolve is a central question in parasite evolution. Albugo candida is an obligate biotrophic parasite that consists of many physiological races that each specialize on distinct Brassicaceae host species. By analyzing genome sequence assemblies of five isolates, we show they represent three races that are genetically diverged by ∼1%. Despite this divergence, their genomes are mosaic-like, with ∼25% being introgressed from other races. Sequential infection experiments show that infection by adapted races enables subsequent infection of hosts by normally non-infecting races. This facilitates introgression and the exchange of effector repertoires, and may enable the evolution of novel races that can undergo clonal population expansion on new hosts. We discuss recent studies on hybridization in other eukaryotes such as yeast, Heliconius butterflies, Darwin's finches, sunflowers and cichlid fishes, and the implications of introgression for pathogen evolution in an agro-ecological environment. DOI: http://dx.doi.org/10.7554/eLife.04550.001 PMID:25723966
Multiple Origins of the Pathogenic Yeast Candida orthopsilosis by Separate Hybridizations between Two Parental Species.

PubMed

Schröder, Markus S; Martinez de San Vicente, Kontxi; Prandini, Tâmara H R; Hammel, Stephen; Higgins, Desmond G; Bagagli, Eduardo; Wolfe, Kenneth H; Butler, Geraldine

2016-11-01

Mating between different species produces hybrids that are usually asexual and stuck as diploids, but can also lead to the formation of new species. Here, we report the genome sequences of 27 isolates of the pathogenic yeast Candida orthopsilosis. We find that most isolates are diploid hybrids, products of mating between two unknown parental species (A and B) that are 5% divergent in sequence. Isolates vary greatly in the extent of homogenization between A and B, making their genomes a mosaic of highly heterozygous regions interspersed with homozygous regions. Separate phylogenetic analyses of SNPs in the A- and B-derived portions of the genome produce almost identical trees of the isolates with four major clades. However, the presence of two mutually exclusive genotype combinations at the mating type locus, and recombinant mitochondrial genomes diagnostic of inter-clade mating, shows that the species C. orthopsilosis does not have a single evolutionary origin but was created at least four times by separate interspecies hybridizations between parents A and B. Older hybrids have lost more heterozygosity. We also identify two isolates with homozygous genomes derived exclusively from parent A, which are pure non-hybrid strains. The parallel emergence of the same hybrid species from multiple independent hybridization events is common in plant evolution, but is much less documented in pathogenic fungi.
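The mosaic of heterozygous and homozygous regions described in the abstract above is the kind of pattern a windowed heterozygosity scan makes visible. The sketch below is an assumption-laden illustration and not the authors' pipeline: the function names, the window size, and the density threshold are all hypothetical choices, and the input is taken to be a list of 0-based positions of heterozygous SNPs on one chromosome.

```python
def window_het_density(het_positions, chrom_length, window=10_000):
    """Return heterozygous-SNP density (SNPs per bp) for consecutive windows."""
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in het_positions:
        counts[pos // window] += 1
    densities = []
    for i, count in enumerate(counts):
        # The last window may be shorter than the others.
        size = min(window, chrom_length - i * window)
        densities.append(count / size)
    return densities


def classify_windows(densities, threshold=0.01):
    """Label each window 'het' or 'hom' using a hypothetical density cut-off."""
    return ["het" if d >= threshold else "hom" for d in densities]


# Toy example: four heterozygous sites on a 250-bp "chromosome", 100-bp windows.
toy_densities = window_het_density([0, 10, 20, 150], chrom_length=250, window=100)
toy_labels = classify_windows(toy_densities, threshold=0.02)
```

With parental sequences about 5% divergent, heterozygous blocks would be expected to show densities far above those of homogenized blocks, so runs of 'hom' windows mark candidate regions of lost heterozygosity; any real threshold would need to be tuned to the data.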
The immune response against Candida spp. and Sporothrix schenckii.

PubMed

Martínez-Álvarez, José A; Pérez-García, Luis A; Flores-Carreón, Arturo; Mora-Montes, Héctor M

2014-01-01

Candida albicans is the main causative agent of systemic candidiasis, a condition with high mortality rates. The interaction between C. albicans and immune system components has been thoroughly studied, and nowadays there is a model for the anti-C. albicans immune response; however, little is known about the sensing of other pathogenic species of the Candida genus. Sporothrix schenckii is the causative agent of sporotrichosis, a subcutaneous mycosis, and thus far there is limited information about its interaction with the immune system. In this paper, we review the most recent information about the immune sensing of species from the genus Candida and S. schenckii. Thorough searches in scientific journal databases were performed, looking for papers addressing either Candida- or Sporothrix-immune system interactions. There is a significant advance in the knowledge of non-C. albicans species of Candida and Sporothrix immune sensing; however, there are still relevant points to address, such as the specific contribution of pathogen-associated molecular patterns (PAMPs) for sensing by different immune cells and the immune receptors involved in such interactions. This manuscript is part of the series of works presented at the "V International Workshop: Molecular genetic approaches to the study of human pathogenic fungi" (Oaxaca, Mexico, 2012). Copyright © 2013 Revista Iberoamericana de Micología. Published by Elsevier España. All rights reserved.
Candida glabrata's Genome Plasticity Confers a Unique Pattern of Expressed Cell Wall Proteins.

PubMed

López-Fuentes, Eunice; Gutiérrez-Escobedo, Guadalupe; Timmermans, Bea; Van Dijck, Patrick; De Las Peñas, Alejandro; Castaño, Irene

2018-06-05

Candida glabrata is the second most common cause of candidemia, and its ability to adhere to different host cell types, to microorganisms, and to medical devices are important virulence factors. Here, we consider three characteristics that confer extraordinary advantages to C. glabrata within the host. (1) C. glabrata has a large number of genes encoding adhesins, most of which are localized at subtelomeric regions. The number and sequence of these genes varies substantially depending on the strain, indicating that C. glabrata can tolerate high genomic plasticity; (2) The largest family of CWPs (cell wall proteins) is the EPA (epithelial adhesin) family of adhesins. Epa1 is the major adhesin and mediates adherence to epithelial, endothelial and immune cells. Several layers of regulation like subtelomeric silencing, cis-acting regulatory regions, activators, nutritional signaling, and stress conditions tightly regulate the expression of many adhesin-encoding genes in C. glabrata, while many others are not expressed. Importantly, there is a connection between acquired resistance to xenobiotics and increased adherence; (3) Other subfamilies of adhesins mediate adherence to Candida albicans, allowing C. glabrata to efficiently invade the oral epithelium and form robust biofilms. It is noteworthy that every C. glabrata strain analyzed presents a unique pattern of CWPs at the cell surface.
An Integrated Molecular Database on Indian Insects.

PubMed

Pratheepa, Maria; Venkatesan, Thiruvengadam; Gracy, Gandhi; Jalali, Sushil Kumar; Rangheswaran, Rajagopal; Antony, Jomin Cruz; Rai, Anil

2018-01-01

MOlecular Database on Indian Insects (MODII) is an online database linking several databases like Insect Pest Info, Insect Barcode Information System (IBIn), Insect Whole Genome sequence, Other Genomic Resources of National Bureau of Agricultural Insect Resources (NBAIR), Whole Genome sequencing of Honey bee viruses, Insecticide resistance gene database and Genomic tools. This database was developed with a holistic approach for collecting phenomic and genomic information on agriculturally important insects. This insect resource database is available online for free at http://cib.res.in/.
Gene flow contributes to diversification of the major fungal pathogen Candida albicans.

PubMed

Ropars, Jeanne; Maufrais, Corinne; Diogo, Dorothée; Marcet-Houben, Marina; Perin, Aurélie; Sertour, Natacha; Mosca, Kevin; Permal, Emmanuelle; Laval, Guillaume; Bouchier, Christiane; Ma, Laurence; Schwartz, Katja; Voelz, Kerstin; May, Robin C; Poulain, Julie; Battail, Christophe; Wincker, Patrick; Borman, Andrew M; Chowdhary, Anuradha; Fan, Shangrong; Kim, Soo Hyun; Le Pape, Patrice; Romeo, Orazio; Shin, Jong Hee; Gabaldon, Toni; Sherlock, Gavin; Bougnoux, Marie-Elisabeth; d'Enfert, Christophe

2018-06-08

Elucidating population structure and levels of genetic diversity and recombination is necessary to understand the evolution and adaptation of species. Candida albicans is the second most frequent agent of human fungal infections worldwide, causing high mortality rates. Here we present the genomic sequences of 182 C. albicans isolates collected worldwide, including commensal isolates, as well as ones responsible for superficial and invasive infections, constituting the largest dataset to date for this major fungal pathogen. Although C. albicans shows a predominantly clonal population structure, we find evidence of gene flow between previously known and newly identified genetic clusters, supporting the occurrence of (para)sexuality in nature. A highly clonal lineage, which experimentally shows reduced fitness, has undergone pseudogenization in genes required for virulence and morphogenesis, which may explain its niche restriction. Candida albicans thus takes advantage of both clonality and gene flow to diversify.
Coriandrum sativum L. (Coriander) Essential Oil: Antifungal Activity and Mode of Action on Candida spp., and Molecular Targets Affected in Human Whole-Genome Expression

PubMed Central

Freires, Irlan de Almeida; Murata, Ramiro Mendonça; Furletti, Vivian Fernandes; Sartoratto, Adilson; de Alencar, Severino Matias; Figueira, Glyn Mara; de Oliveira Rodrigues, Janaina Aparecida; Duarte, Marta Cristina Teixeira; Rosalen, Pedro Luiz

2014-01-01

Oral candidiasis is an opportunistic fungal infection of the oral cavity with increasing worldwide prevalence and incidence rates. Novel specifically-targeted strategies to manage this ailment have been proposed using essential oils (EO) known to have antifungal properties. In this study, we aim to investigate the antifungal activity and mode of action of the EO from Coriandrum sativum L. (coriander) leaves on Candida spp. In addition, we detected the molecular targets affected in whole-genome expression in human cells. The EO phytochemical profile indicates monoterpenes and sesquiterpenes as major components, which are likely to negatively impact the viability of yeast cells. There seems to be a synergistic activity of the EO chemical compounds as their isolation into fractions led to a decreased antimicrobial effect. C. sativum EO may bind to membrane ergosterol, increasing ionic permeability and causing membrane damage leading to cell death, but it does not act on cell wall biosynthesis-related pathways. This mode of action is illustrated by photomicrographs showing disruption in biofilm integrity caused by the EO at varied concentrations. The EO also inhibited Candida biofilm adherence to a polystyrene substrate at low concentrations, and decreased the proteolytic activity of Candida albicans at minimum inhibitory concentration. Finally, the EO and its selected active fraction had low cytotoxicity on human cells, with putative mechanisms affecting gene expression in pathways involving chemokines and MAP-kinase (proliferation/apoptosis), as well as adhesion proteins. These findings highlight the potential antifungal activity of the EO from C. sativum leaves and suggest avenues for future translational toxicological research. PMID:24901768
The Sequenced Angiosperm Genomes and Genome Databases.

PubMed

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and planet Earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in the database reconstruction in the last few years, here we provide a comprehensive review for three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular among specific groups of researchers, while a timely-updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote the collaborative studies of important questions in plant biology.
The Sequenced Angiosperm Genomes and Genome Databases

PubMed Central

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and planet Earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in the database reconstruction in the last few years, here we provide a comprehensive review for three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular among specific groups of researchers, while a timely-updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote the collaborative studies of important questions in plant biology. PMID:29706973
MIPS: a database for genomes and protein sequences.

PubMed Central

Mewes, H W; Heumann, K; Kaps, A; Mayer, K; Pfeiffer, F; Stocker, S; Frishman, D

1999-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of analysis tools to sequences of complete genomes, the systematic classification of protein sequences and the active support of sequence analysis and functional genomics projects. This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). MIPS provides access through its WWW server (http://www.mips.biochem.mpg.de) to a spectrum of generic databases, including the above mentioned as well as a database of protein families (PROTFAM), the MITOP database, and the all-against-all FASTA database. PMID:9847138
Genome Comparison of Candida orthopsilosis Clinical Strains Reveals the Existence of Hybrids between Two Distinct Subspecies

PubMed Central

Pryszcz, Leszek P.; Németh, Tibor; Gácser, Attila; Gabaldón, Toni

2014-01-01

The Candida parapsilosis species complex comprises a group of emerging human pathogens of varying virulence. This complex was recently subdivided into three different species: C. parapsilosis sensu stricto, C. metapsilosis, and C. orthopsilosis. Within the latter, at least two clearly distinct subspecies seem to be present among clinical isolates (Type 1 and Type 2). To gain insight into the genomic differences between these subspecies, we undertook the sequencing of a clinical isolate classified as Type 1 and compared it with the available sequence of a Type 2 clinical strain. Unexpectedly, the analysis of the newly sequenced strain revealed a highly heterozygous genome, which we show to be the consequence of a hybridization event between both identified subspecies. This implicitly suggests that C. orthopsilosis is able to mate, a so-far unanswered question. The resulting hybrid shows a chimeric genome that maintains a similar gene dosage from both parental lineages and displays ongoing loss of heterozygosity. Several of the differences found between the gene content in both strains relate to virulence-related families, with the hybrid strain presenting a higher copy number of genes coding for efflux pumps or secreted lipases. Remarkably, two clinical strains isolated from distant geographical locations (Texas and Singapore) are descendants of the same hybrid line, raising the intriguing possibility of a relationship between the hybridization event and the global spread of a virulent clone. PMID:24747362
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes.

PubMed

Nakagawa, So; Takahashi, Mahoko Ueda

2016-01-01

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp. © The Author(s) 2016. Published by Oxford University Press.
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes

PubMed Central

Nakagawa, So; Takahashi, Mahoko Ueda

2016-01-01

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp PMID:27242033
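As a rough illustration of what identifying ORFs in a genomic sequence involves, the following Python sketch scans the three forward reading frames of a DNA string for stretches that begin with ATG and end at the first in-frame stop codon. It is not the gEVE pipeline: the function name `find_orfs`, the `min_codons` threshold and the example sequence are all assumptions made for illustration, and the reverse strand, viral-origin filtering and the non-overlap criterion mentioned in the abstract are not handled.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int, str]]:
    """Return (start, end, orf) tuples for forward-strand ORFs.

    An ORF here is ATG ... first in-frame stop codon, with `end` exclusive.
    `min_codons` counts every codon, including the start and the stop.
    (Hypothetical helper for illustration only.)
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, seq[start:end]))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    # Made-up sequence containing a single short ORF.
    print(find_orfs("CCATGAAATAGGG"))
```

A database such as the one described would store, for each hit, the coordinates and the nucleotide and translated sequences; the tuple returned here is a minimal stand-in for such a record.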
Candida Infections and Human Defensins.

PubMed

Polesello, Vania; Segat, Ludovica; Crovella, Sergio; Zupin, Luisa

2017-01-01

Candida species infections are an important worldwide health issue since they affect not only immunocompromised patients but also healthy individuals. The host has developed different mechanisms of protection against Candida infections; specifically, the immune system and the innate immune response are the first line of defence. Defensins are a group of antimicrobial peptides, components of the innate immunity, produced at the mucosal level and known to be active against bacteria, viruses and fungi. The aim of the current work was to review all previous studies in the literature that analysed defensins in the context of Candida spp. infections, in order to investigate and clarify the exact mechanisms of defensins' antifungal action. Several studies were identified from 1985 to 2017 (9 works from 1985 to 1999, 44 works from 2000 to 2009 and 35 from 2010 to 2017) by searching two electronic databases (PubMed and Google Scholar). The main key words used for the search were "Candida", "Defensins", "Innate immune system", "fungi". The findings of the reviewed studies highlight the pivotal role of defensin antimicrobial peptides in the immune response against Candida infections, since they are able to discriminate host cells from fungi: defensins are able to recognize the pathogen's cell wall (different in composition from the human one), and to disrupt it through membrane permeabilization. However, further research is needed to fully explain defensins' mechanisms of action against C. albicans (and other Candida spp.) infections, as the available information is fragmentary and only partly elucidated. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Tomato functional genomics database (TFGD): a comprehensive collection and analysis package for tomato functional genomics

USDA-ARS's Scientific Manuscript database

Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...
MIPS: a database for genomes and protein sequences

PubMed Central

Mewes, H. W.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Mayer, K.; Mokrejs, M.; Morgenstern, B.; Münsterkötter, M.; Rudd, S.; Weil, B.

2002-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz–Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91–93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155–158; Barker et al. (2001) Nucleic Acids Res., 29, 29–32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de). PMID:11752246
MIPS: a database for genomes and protein sequences.

PubMed

Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

2002-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).
Mitochondrial Telomeres as Molecular Markers for Identification of the Opportunistic Yeast Pathogen Candida parapsilosis

PubMed Central

Nosek, Jozef; Tomáška, L'ubomír; Ryčovská, Adriana; Fukuhara, Hiroshi

2002-01-01

Recent studies have demonstrated that a large number of organisms carry linear mitochondrial DNA molecules possessing specialized telomeric structures at their ends. Based on this specific structural feature of linear mitochondrial genomes, we have developed an approach for identification of the opportunistic yeast pathogen Candida parapsilosis. The strategy for identification of C. parapsilosis strains is based on PCR amplification of specific DNA sequences derived from the mitochondrial telomere region. This assay is complemented by immunodetection of a protein component of mitochondrial telomeres. The results demonstrate that mitochondrial telomeres represent specific molecular markers with potential applications in yeast diagnostics and taxonomy. PMID:11923346
MIPS: analysis and annotation of proteins from whole genomes

PubMed Central

Mewes, H. W.; Amid, C.; Arnold, R.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Münsterkötter, M.; Pagel, P.; Strack, N.; Stümpflen, V.; Warfsmann, J.; Ruepp, A.

2004-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein–protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354
MIPS: analysis and annotation of proteins from whole genomes.

PubMed

Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

2004-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).
Interleukin-2 and other cytokines in candidiasis: expression, clinical significance, and future therapeutic targets.

PubMed

Rodríguez-Cerdeira, Carmen; Carnero-Gregorio, Miguel; López-Barcenas, Adriana; Fabbrocini, Gabriella; Sanchez-Blanco, Elena; Alba-Menendez, Alfonso; Guzmán, Roberto Arenas

2018-06-01

Susceptibility to Candida spp. infection is largely determined by the status of host immunity, whether immunocompromised/immunodeficient or immunocompetent. Interleukin-2 (IL-2), a potent lymphoid cell growth factor, is a four-α-helix bundle cytokine induced by activated T cells with two important roles: the activation and maintenance of immune responses, and lymphocyte production and differentiation. We reviewed the roles of cytokines as immune stimulators and suppressors of Candida spp. infections as an update on this continuously evolving field. We performed a comprehensive search of the Cochrane Central Register of Controlled Trials, Medline (PubMed), and Embase databases for articles published from March 2010 to March 2016 using the following search terms: interleukins, interleukin-2, Candida spp., and immunosuppression. Data from our own studies were also reviewed. Here, we provide an overview focusing on the ability of IL-2 to induce a large panel of trafficking receptors in skin inflammation and control T helper (Th)2 cytokine production in response to contact with Candida spp. Immunocompromised patients have reduced capacity to secrete Th1-related cytokines such as IL-2. This prevents an efficient Th1 immune response to Candida spp. antigens, making immunocompromised patients more susceptible to candidal infections.
A web-based genomic sequence database for the Streptomycetaceae: a tool for systematics and genome mining

USDA-ARS's Scientific Manuscript database

The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...
Unlimited Thirst for Genome Sequencing, Data Interpretation, and Database Usage in Genomic Era: The Road towards Fast-Track Crop Plant Improvement

PubMed Central

Govindaraj, Mahalingam

2015-01-01

The number of sequenced crop genomes and associated genomic resources is growing rapidly with the advent of inexpensive next generation sequencing methods. Databases have become an integral part of all aspects of science research, including basic and applied plant and animal sciences. The importance of databases keeps increasing as the volume of datasets from direct and indirect genomics, as well as other omics approaches, has kept expanding in recent years. The databases and associated web portals provide at a minimum a uniform set of tools and automated analysis across a wide range of crop plant genomes. This paper reviews some basic terms and considerations in dealing with the utilization of crop plant databases in the advancing genomic era. The utilization of databases for variation analysis with other comparative genomics tools, and data interpretation platforms are well described. The major focus of this review is to provide knowledge on platforms and databases for genome-based investigations of agriculturally important crop plants. The utilization of these databases in applied crop improvement programs is still being achieved widely; otherwise, the end for sequencing is not far away. PMID:25874133
Identification and susceptibility of clinical isolates of Candida spp. to killer toxins.

PubMed

Robledo-Leal, E; Rivera-Morales, L G; Sangorrín, M P; González, G M; Ramos-Alfano, G; Adame-Rodriguez, J M; Alcocer-Gonzalez, J M; Arechiga-Carvajal, E T; Rodriguez-Padilla, C

2018-02-01

Although invasive infections and mortality caused by Candida species are increasing among compromised patients, resistance to common antifungal agents is also an increasing problem. We analyzed 60 yeasts isolated from patients with invasive candidiasis using a PCR/RFLP strategy based on the internal transcribed spacer (ITS2) region to identify different Candida pathogenic species. PCR analysis was performed from genomic DNA with a primer pair of the ITS2-5.8S rDNA region. PCR-positive samples were characterized by RFLP. Restriction resulted in 23 isolates identified as C. albicans using AlwI, 24 isolates as C. parapsilosis using RsaI, and 13 as C. tropicalis using XmaI. Then, a group of all isolates were evaluated for their susceptibility to a panel of previously described killer yeasts, resulting in 75% being susceptible to at least one killer yeast while the remaining were not inhibited by any strain. C. albicans was the most susceptible group while C. tropicalis had the fewest inhibitions. No species-specific pattern of inhibition was obtained with this panel of killer yeasts. Metschnikowia pulcherrima, Pichia kluyveri and Wickerhamomyces anomalus were the strains that inhibited the most isolates of Candida spp.
Use of restriction fragment length polymorphism to identify Candida species, related to onychomycosis

PubMed Central

Mohammadi, Rasoul; Badiee, Parisa; Badali, Hamid; Abastabar, Mahdi; Safa, Ahmad Hosseini; Hadipour, Mahboubeh; Yazdani, Hajar; Heshmat, Farnaz

2015-01-01

Background: Onychomycosis is one of the most common clinical forms of fungal infection, caused by both filamentous fungi and yeasts. The genus Candida is one of the most prominent causes of onychomycosis worldwide. Although Candida albicans is still the most frequent cause of nail infections, use of broad-spectrum antifungal agents has led to a shift in etiology from C. albicans to non-albicans species. The aim of the present study is rapid and precise identification of Candida species isolated from nail infections using the PCR-RFLP technique. Materials and Methods: A total of 360 clinical yeast strains were collected from nail infections in Iran. Genomic DNA was extracted using FTA cards. The ITS1-5.8SrDNA-ITS2 region was amplified using universal primers and the products were subsequently digested with the restriction enzyme MspI. For identification of newly described species (C. parapsilosis complex), the SADH gene was amplified, followed by digestion with Nla III restriction enzyme. Results: Candida albicans was the most commonly isolated species (41.1%), followed by C. parapsilosis (21.4%), C. tropicalis (12.8%), C. kefyr (9.4%), C. krusei (5.5%), C. orthopsilosis (4.1%), C. glabrata (2.8%), C. guilliermondii (1.4%), C. rugosa (0.8%), and C. lusitaniae (0.5%). Patients in the age groups of 51-60 and 81-90 years had the highest and lowest distribution of positive specimens, respectively. Conclusion: Rapid and precise identification of Candida species from clinical specimens leads to appropriate therapeutic plans. PMID:26015921
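The RFLP step in the record above can be pictured as an in-silico digestion: cut an amplicon at every occurrence of the enzyme's recognition site and compare the resulting fragment lengths between species. The Python sketch below does this for MspI, which recognizes CCGG and cuts after the first C. The function name `digest`, its parameters and the example sequence are hypothetical; the sketch treats the amplicon as a linear molecule, scans only the given strand (sufficient for a palindromic site), and does not reproduce the study's actual primers or species patterns.

```python
from typing import List


def digest(seq: str, site: str = "CCGG", cut_offset: int = 1) -> List[int]:
    """Return fragment lengths after cutting a linear sequence at each site.

    `cut_offset` is the number of bases into the site at which the cut
    falls (1 for MspI, C^CGG). Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    # Fragment lengths are the gaps between consecutive cut positions.
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Made-up amplicon with two MspI sites.
    print(digest("AAACCGGTTTTCCGGAA"))
```

Two isolates whose amplicons yield different fragment-length lists would show different banding patterns on a gel, which is the basis of the species assignment.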
Standards for Clinical Grade Genomic Databases.

PubMed

Yohe, Sophia L; Carter, Alexis B; Pfeifer, John D; Crawford, James M; Cushman-Vokoun, Allison; Caughron, Samuel; Leonard, Debra G B

2015-11-01

Next-generation sequencing performed in a clinical environment must meet clinical standards, which requires reproducibility of all aspects of the testing. Clinical-grade genomic databases (CGGDs) are required to classify a variant and to assist in the professional interpretation of clinical next-generation sequencing. Applying quality laboratory standards to the reference databases used for sequence-variant interpretation presents a new challenge for validation and curation. The objective was to define CGGD and the categories of information contained in CGGDs and to frame recommendations for the structure and use of these databases in clinical patient care. Members of the College of American Pathologists Personalized Health Care Committee reviewed the literature and existing state of genomic databases and developed a framework for guiding CGGD development in the future. Clinical-grade genomic databases may provide different types of information. This work group defined 3 layers of information in CGGDs: clinical genomic variant repositories, genomic medical data repositories, and genomic medicine evidence databases. The layers are differentiated by the types of genomic and medical information contained and the utility in assisting with clinical interpretation of genomic variants. Clinical-grade genomic databases must meet specific standards regarding submission, curation, and retrieval of data, as well as the maintenance of privacy and security. These organizing principles for CGGDs should serve as a foundation for future development of specific standards that support the use of such databases for patient care.
Private and Efficient Query Processing on Outsourced Genomic Databases.

PubMed

Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian

2017-09-01

Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing genomic sequence is a time consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations, and thus, not available for public usage. Cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases in a centralized cloud server to ease the access of their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 Single Nucleotide Polymorphisms (SNPs) in a database of 20 000 records takes around 100 and 150 s, respectively.
Private and Efficient Query Processing on Outsourced Genomic Databases

PubMed Central

Ghasemi, Reza; Al Aziz, Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian

2017-01-01

Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing genomic sequence is a time-consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations and thus not available for public usage. Cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases in a centralized cloud server to ease the access of their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 SNPs in a database of 20,000 records takes around 100 and 150 seconds, respectively. PMID:27834660

Assembly of a phased diploid Candida albicans genome facilitates allele-specific measurements and provides a simple model for repeat and indel structure

PubMed Central

2013-01-01

Background: Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. With rare and transient exceptions the yeast is diploid, yet despite its clinical relevance the respective sequences of its two homologous chromosomes have not been completely resolved. Results: We construct a phased diploid genome assembly by deep sequencing a standard laboratory wild-type strain and a panel of strains homozygous for particular chromosomes. The assembly has 700-fold coverage on average, allowing extensive revision and expansion of the number of known SNPs and indels. This phased genome significantly enhances the sensitivity and specificity of allele-specific expression measurements by enabling pooling and cross-validation of signal across multiple polymorphic sites. Additionally, the diploid assembly reveals pervasive and unexpected patterns in allelic differences between homologous chromosomes. Firstly, we see striking clustering of indels, concentrated primarily in the repeat sequences in promoters. Secondly, both indels and their repeat-sequence substrate are enriched near replication origins. Finally, we reveal an intimate link between repeat sequences and indels, which argues that repeat length is under selective pressure for most eukaryotes. This connection is described by a concise one-parameter model that explains repeat-sequence abundance in C. albicans as a function of the indel rate, and provides a general framework to interpret repeat abundance in species ranging from bacteria to humans. Conclusions: The phased genome assembly and insights into repeat plasticity will be valuable for better understanding allele-specific phenomena and genome evolution. PMID:24025428
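The idea of pooling allele-specific signal across multiple polymorphic sites can be sketched in a few lines of Python. Once a genome is phased, reads covering different heterozygous sites in the same gene can be attributed to the same homolog and summed before an allelic ratio is computed. The function name `pooled_allelic_ratio` and the read counts below are invented for illustration; the authors' actual statistical procedure, including cross-validation between sites, is not described in the abstract and is not reproduced here.

```python
from typing import Iterable, Tuple


def pooled_allelic_ratio(site_counts: Iterable[Tuple[int, int]]) -> float:
    """Fraction of reads assigned to homolog A, pooled over phased sites.

    `site_counts` holds (reads_homolog_A, reads_homolog_B) for each
    heterozygous site in one gene. Hypothetical helper for illustration.
    """
    total_a = 0
    total_b = 0
    for a, b in site_counts:
        total_a += a
        total_b += b
    total = total_a + total_b
    if total == 0:
        raise ValueError("no reads cover the given sites")
    return total_a / total


if __name__ == "__main__":
    # Made-up counts at three phased SNPs within one gene.
    print(pooled_allelic_ratio([(30, 10), (25, 15), (35, 5)]))
```

Pooling in this way uses more reads per gene than any single site would, which is one plausible reading of how phasing improves sensitivity; without phase information the counts at different sites could not be safely added together.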
ReprDB and panDB: minimalist databases with maximal microbial representation.

PubMed

Zhou, Wei; Gay, Nicole; Oh, Julia

2018-01-18

Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.
The integrated web service and genome database for agricultural plants with biotechnology information.

PubMed

Kim, Changkug; Park, Dongsuk; Seol, Youngjoo; Hahn, Jangho

2011-01-01

The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage.
Candida albicans Iff11, a secreted protein required for cell wall structure and virulence.

PubMed

Bates, Steven; de la Rosa, José M; MacCallum, Donna M; Brown, Alistair J P; Gow, Neil A R; Odds, Frank C

2007-06-01

The Candida albicans cell wall is the immediate point of contact with the host and is implicated in the host-fungal interaction and virulence. To date, a number of cell wall proteins have been identified and associated with virulence. Analysis of the C. albicans genome has identified the IFF gene family as encoding the largest family of cell wall-related proteins. This family is also conserved in a range of other Candida species. Iff11 differs from other family members in lacking a GPI anchor, and we have demonstrated it to be O glycosylated and secreted in C. albicans. A null mutant lacking IFF11 was hypersensitive to cell wall-damaging agents, suggesting a role in cell wall organization. In a murine model of systemic infection the null mutant was highly attenuated in virulence, and survival-standardized infections suggest it is required to establish an infection. This work provides the first evidence of the importance of this gene family in the host-fungal interaction and virulence.
Candida utilis and Cyberlindnera (Pichia) jadinii: yeast relatives with expanding applications.

PubMed

Buerth, Christoph; Tielker, Denis; Ernst, Joachim F

2016-08-01

The yeast Candida utilis is used as a food additive and as a host for heterologous gene expression to produce various metabolites and proteins. Reliable protocols for intracellular production of recombinant proteins are available for C. utilis and have now been expanded to secrete proteins into the growth medium or to achieve surface display by linkage to a cell wall protein. A recombinant C. utilis strain was recently shown to induce oral tolerance in a mouse model of multiple sclerosis suggesting future applications in autoimmune therapy. Whole genome sequencing of C. utilis and its presumed parent Cyberlindnera (Pichia) jadinii demonstrated different ploidy but high sequence identity, consistent with identical recombinant technologies for both yeasts. C. jadinii was recently described as an antagonist to the important human fungal pathogen Candida albicans suggesting its use as a probiotic agent. The review summarizes the status of recombinant protein production in C. utilis, as well as current and future biotechnological and medical applications of C. utilis and C. jadinii.
The integrated web service and genome database for agricultural plants with biotechnology information

PubMed Central

Kim, ChangKug; Park, DongSuk; Seol, YoungJoo; Hahn, JangHo

2011-01-01

The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage. PMID:21887015
MALDI-TOF MS as a tool to identify foodborne yeasts and yeast-like fungi.

PubMed

Quintilla, Raquel; Kolecka, Anna; Casaregola, Serge; Daniel, Heide M; Houbraken, Jos; Kostrzewa, Markus; Boekhout, Teun; Groenewald, Marizeth

2018-02-02

Since food spoilage by yeasts causes high economic losses, fast and accurate identification of yeasts associated with food and food-related products is important for the food industry. In this study the efficiency of matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) in identifying food-related yeasts was evaluated. A CBS in-house MALDI-TOF MS database was created and later challenged with a blinded test set of 146 yeast strains obtained from food and food-related products. Ninety-eight percent of the strains were correctly identified with log score values >1.7. One strain, Mrakia frigida, gained a correct identification with a score value <1.7. Two strains could not be identified at first as they represented a mix of two different species. These mixes were Rhodotorula babjevae with Meyerozyma caribbica and Clavispora lusitaniae with Debaryomyces hansenii. After separation, all four species could be correctly identified with scores >1.7. Ambiguous identifications were observed due to two incorrect reference mass spectra found in the commercial database BDAL v.4.0, namely Candida sake DSM 70763, which was re-identified as Candida oleophila, and Candida inconspicua DSM 70631, which was re-identified as Pichia membranifaciens. MALDI-TOF MS can distinguish between most of the species, but for some species complexes, such as the Kazachstania telluris and Mrakia frigida complexes, MALDI-TOF MS showed limited resolution and identification of sibling species was sometimes problematic. Despite this, we showed that MALDI-TOF MS is applicable for routine identification and validation of foodborne yeasts, but a further update of the commercial reference databases is needed. Copyright © 2017 Elsevier B.V. All rights reserved.
The Elusive Anti-Candida Vaccine: Lessons From the Past and Opportunities for the Future

PubMed Central

Tso, Gloria Hoi Wan; Reales-Calderon, Jose Antonio; Pavelka, Norman

2018-01-01

Candidemia is a bloodstream fungal infection caused by Candida species and is most commonly observed in hospitalized patients. Even with proper antifungal drug treatment, mortality rates remain high at 40–50%. Therefore, prophylactic or preemptive antifungal medications are currently recommended in order to prevent infections in high-risk patients. Moreover, the majority of women experience at least one episode of vulvovaginal candidiasis (VVC) throughout their lifetime and many of them suffer from recurrent VVC (RVVC) with frequent relapses for the rest of their lives. While there currently exists no definitive cure, the only available treatment for RVVC is again represented by antifungal drug therapy. However, due to the limited number of existing antifungal drugs, their associated side effects and the increasing occurrence of drug resistance, other approaches are greatly needed. An obvious prevention measure for candidemia or RVVC relapse would be to immunize at-risk patients with a vaccine effective against Candida infections. However, in spite of the advanced and proven techniques successfully applied to the development of antibacterial or antiviral vaccines, no antifungal vaccine is yet available on the market. In this review, we first summarize various efforts to date in the development of anti-Candida vaccines, highlighting advantages and disadvantages of each strategy. We next unfold and discuss general hurdles encountered along these efforts, such as the existence of large genomic variation and phenotypic plasticity across Candida strains and species, and the difficulty in mounting protective immune responses in immunocompromised or immunosuppressed patients. Lastly, we review the concept of “trained immunity” and discuss how induction of this rapid and nonspecific immune response may potentially open new and alternative preventive strategies against opportunistic infections by Candida species and potentially other pathogens. PMID:29755472
Molecular identification and distribution profile of Candida species isolated from Iranian patients.

PubMed

Mohammadi, Rasoul; Mirhendi, Hossein; Rezaei-Matehkolaei, Ali; Ghahri, Mohammad; Shidfar, Mohammad Reza; Jalalizand, Nilufar; Makimura, Koichi

2013-08-01

A total of 855 yeast strains isolated from different clinical specimens, mainly nail (42%) and vulva-vagina (25%), were identified by a set of polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) assays. Genomic DNA was extracted from fresh colonies using Whatman FTA Card technology. PCR assays were performed on the complete ribosomal DNA internal transcribed spacer (rDNA-ITS) region for all isolates, and species identification was carried out through their specific electrophoretic profiles after digestion with the enzyme MspI. Those isolates suspected as Candida parapsilosis group were then subjected to amplification of the secondary alcohol dehydrogenase (SADH) gene and restriction digestion with NlaIII enzyme. In total, 71.1% of the strains were obtained from females and 28.9% from males. The age group of 31-40 years had the highest frequency of patients with candidiasis. Candida albicans was the predominant species (58.6%) followed by C. parapsilosis (11.0%), C. glabrata (8.3%), C. tropicalis (7.0%), C. kefyr (5.8%), C. krusei (4.4%), C. orthopsilosis (2.1%), and C. guilliermondii (0.6%). A few strains of C. lusitaniae, C. rugosa, C. intermedia, C. inconspicua, C. neoformans and S. cerevisiae were isolated. We could not identify 8 (0.9%) isolates. Candida albicans remains the most frequently isolated species from Iranian patients; however, the number of non-C. albicans Candida species appears to be increasing. The simple and reliable PCR-RFLP system used in the study has the potential to identify most clinically isolated yeasts.
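The restriction step described in this abstract can be mimicked in silico. The following is a minimal Python sketch, assuming a linear amplicon and the MspI recognition site CCGG cut after the first C (C^CGG). The function names and the two toy amplicons are hypothetical illustrations; they are not ITS sequences or code from the study.

```python
import re

MSPI_SITE = "CCGG"    # MspI recognition sequence (palindromic)
MSPI_CUT_OFFSET = 1   # MspI cuts after the first base: C^CGG


def digest(seq, site=MSPI_SITE, cut_offset=MSPI_CUT_OFFSET):
    """Return fragment lengths, in 5'->3' order, for a linear amplicon."""
    seq = seq.upper()
    # A lookahead pattern reports every occurrence, including overlapping ones.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def rflp_profile(seq):
    """Fragment lengths sorted largest first, as bands would be read on a gel."""
    return sorted(digest(seq), reverse=True)


# Toy amplicons (hypothetical): one MspI site versus two MspI sites.
amplicon_a = "AT" * 10 + "CCGG" + "GC" * 15
amplicon_b = "AT" * 5 + "CCGG" + "TA" * 8 + "CCGG" + "GC" * 6

if __name__ == "__main__":
    print(rflp_profile(amplicon_a))
    print(rflp_profile(amplicon_b))
```

Two amplicons with different numbers or positions of sites give different sorted fragment lists, which is the property a PCR-RFLP scheme relies on to separate species.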
PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.

PubMed

Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X

2017-01-01

Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade now. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality with/over many plant genome databases is provided within the transPLANT project.
Cloning and characterization of a Candida albicans maltase gene involved in sucrose utilization.

PubMed Central

Geber, A; Williamson, P R; Rex, J H; Sweeney, E C; Bennett, J E

1992-01-01

In order to isolate the structural gene involved in sucrose utilization, we screened a sucrose-induced Candida albicans cDNA library for clones expressing alpha-glucosidase activity. The C. albicans maltase structural gene (CAMAL2) was isolated. No other clones expressing alpha-glucosidase activity were detected. A genomic CAMAL2 clone was obtained by screening a size-selected genomic library with the cDNA clone. DNA sequence analysis reveals that CAMAL2 encodes a 570-amino-acid protein which shares 50% identity with the maltase structural gene (MAL62) of Saccharomyces carlsbergensis. The substrate specificity of the recombinant protein purified from Escherichia coli identifies the enzyme as a maltase. Northern (RNA) analysis reveals that transcription of CAMAL2 is induced by maltose and sucrose and repressed by glucose. These results suggest that assimilation of sucrose in C. albicans relies on an inducible maltase enzyme. The family of genes controlling sucrose utilization in C. albicans shares similarities with the MAL gene family of Saccharomyces cerevisiae and provides a model system for studying gene regulation in this pathogenic yeast. PMID:1400249
Recent updates and developments to plant genome size databases

PubMed Central

Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel; Bennett, Michael D.

2014-01-01

Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols. PMID:24288377
Global Analysis of the Fungal Microbiome in Cystic Fibrosis Patients Reveals Loss of Function of the Transcriptional Repressor Nrg1 as a Mechanism of Pathogen Adaptation

PubMed Central

Kim, Sang Hu; Clark, Shawn T.; Surendra, Anuradha; Copeland, Julia K.; Wang, Pauline W.; Ammar, Ron; Collins, Cathy; Tullis, D. Elizabeth; Nislow, Corey; Hwang, David M.; Guttman, David S.; Cowen, Leah E.

2015-01-01

The microbiome shapes diverse facets of human biology and disease, with the importance of fungi only beginning to be appreciated. Microbial communities infiltrate diverse anatomical sites as with the respiratory tract of healthy humans and those with diseases such as cystic fibrosis, where chronic colonization and infection lead to clinical decline. Although fungi are frequently recovered from cystic fibrosis patient sputum samples and have been associated with deterioration of lung function, understanding of species and population dynamics remains in its infancy. Here, we coupled high-throughput sequencing of the ribosomal RNA internal transcribed spacer 1 (ITS1) with phenotypic and genotypic analyses of fungi from 89 sputum samples from 28 cystic fibrosis patients. Fungal communities defined by sequencing were concordant with those defined by culture-based analyses of 1,603 isolates from the same samples. Different patients harbored distinct fungal communities. There were detectable trends, however, including colonization with Candida and Aspergillus species, which was not perturbed by clinical exacerbation or treatment. We identified considerable inter- and intra-species phenotypic variation in traits important for host adaptation, including antifungal drug resistance and morphogenesis. While variation in drug resistance was largely between species, striking variation in morphogenesis emerged within Candida species. Filamentation was uncoupled from inducing cues in 28 Candida isolates recovered from six patients. The filamentous isolates were resistant to the filamentation-repressive effects of Pseudomonas aeruginosa, implicating inter-kingdom interactions as the selective force. Genome sequencing revealed that all but one of the filamentous isolates harbored mutations in the transcriptional repressor NRG1; such mutations were necessary and sufficient for the filamentous phenotype. 
Six independent nrg1 mutations arose in Candida isolates from different patients, providing a poignant example of parallel evolution. Together, this combined clinical-genomic approach provides a high-resolution portrait of the fungal microbiome of cystic fibrosis patient lungs and identifies a genetic basis of pathogen adaptation. PMID:26588216
Global Analysis of the Fungal Microbiome in Cystic Fibrosis Patients Reveals Loss of Function of the Transcriptional Repressor Nrg1 as a Mechanism of Pathogen Adaptation.

PubMed

Kim, Sang Hu; Clark, Shawn T; Surendra, Anuradha; Copeland, Julia K; Wang, Pauline W; Ammar, Ron; Collins, Cathy; Tullis, D Elizabeth; Nislow, Corey; Hwang, David M; Guttman, David S; Cowen, Leah E

2015-11-01

The microbiome shapes diverse facets of human biology and disease, with the importance of fungi only beginning to be appreciated. Microbial communities infiltrate diverse anatomical sites as with the respiratory tract of healthy humans and those with diseases such as cystic fibrosis, where chronic colonization and infection lead to clinical decline. Although fungi are frequently recovered from cystic fibrosis patient sputum samples and have been associated with deterioration of lung function, understanding of species and population dynamics remains in its infancy. Here, we coupled high-throughput sequencing of the ribosomal RNA internal transcribed spacer 1 (ITS1) with phenotypic and genotypic analyses of fungi from 89 sputum samples from 28 cystic fibrosis patients. Fungal communities defined by sequencing were concordant with those defined by culture-based analyses of 1,603 isolates from the same samples. Different patients harbored distinct fungal communities. There were detectable trends, however, including colonization with Candida and Aspergillus species, which was not perturbed by clinical exacerbation or treatment. We identified considerable inter- and intra-species phenotypic variation in traits important for host adaptation, including antifungal drug resistance and morphogenesis. While variation in drug resistance was largely between species, striking variation in morphogenesis emerged within Candida species. Filamentation was uncoupled from inducing cues in 28 Candida isolates recovered from six patients. The filamentous isolates were resistant to the filamentation-repressive effects of Pseudomonas aeruginosa, implicating inter-kingdom interactions as the selective force. Genome sequencing revealed that all but one of the filamentous isolates harbored mutations in the transcriptional repressor NRG1; such mutations were necessary and sufficient for the filamentous phenotype. 
Six independent nrg1 mutations arose in Candida isolates from different patients, providing a poignant example of parallel evolution. Together, this combined clinical-genomic approach provides a high-resolution portrait of the fungal microbiome of cystic fibrosis patient lungs and identifies a genetic basis of pathogen adaptation.
MIPS: a database for protein sequences and complete genomes.

PubMed Central

Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D

1998-01-01

The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de ) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information was also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795
The Effects of Signal Erosion and Core Genome Reduction on the Identification of Diagnostic Markers

DTIC Science & Technology

2016-09-20

diagnostics for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the...diverse B. pseudomallei/mallei strains were sequenced, assembled, and deposited in public databases (Supplemental Table 1); these genomes were...combined with 160 B. pseudomallei/mallei genome assemblies already in public databases. Most of the genomes (n=779) in this study were
Molecular cloning of a gene encoding translation initiation factor (TIF) from Candida albicans.

PubMed

Mirbod, F; Nakashima, S; Kitajima, Y; Ghannoum, M A; Cannon, R D; Nozawa, Y

1996-01-01

The differential display technique was applied to compare mRNAs from two clinical isolates of Candida albicans with different virulence: high (potent strain, 16240) and low (weak strain, 18084) extracellular phospholipase activities. Complementary DNA fragments corresponding to several apparently differentially expressed mRNAs were recovered and sequenced. A complementary DNA fragment seen distinctly in the potent phospholipase-producing strain was highly homologous to the yeast translation initiation factor (TIF). The selected DNA fragment was then used as a probe to isolate its corresponding complementary DNA clone from a library of C. albicans genomic DNA. The sequence of the isolated gene revealed an open reading frame of 1194 nucleotides with the potential to encode a protein of 397 amino acids with a predicted molecular weight of 43 kDa. Over its entire length, the amino acid sequence showed strong homology (78-89%) to Saccharomyces cerevisiae TIF and (63-80%) to mouse eIF-4A proteins. Therefore, our C. albicans gene was identified as TIF (Ca TIF). Northern blot analysis in the two strains of C. albicans revealed that Ca TIF expression is 1.5-fold higher in the potent phospholipase-producing strain. The restriction endonuclease digestion of genomic DNA from this potent strain revealed at least two hybridized bands in Southern blot analysis, suggesting two or more closely related sequences in the C. albicans genome.
Comparative study on fermentation performance in the genome shuffled Candida versatilis and wild-type salt tolerant yeast strain.

PubMed

Qi, Wei; Guo, Hong-Lian; Wang, Chun-Ling; Hou, Li-Hua; Cao, Xiao-Hong; Liu, Jin-Fu; Lu, Fu-Ping

2017-01-01

The fermentation performance of a genome-shuffled strain of Candida versatilis S3-5, isolated for improved tolerance to salt, and the wild-type (WT) strain was analysed. The fermentation parameters, such as growth, reducing sugar, ethanol, organic acids and volatile compounds, were measured during the soy sauce fermentation process. The results showed that ethanol produced by the genome-shuffled strain S3-5 increased at a faster rate and to a greater extent than in WT. At the end of the fermentation, malic acid, citric acid and succinic acid formed in the tricarboxylic acid cycle after S3-5 treatment were elevated by 39.20%, 6.85% and 17.09% compared to WT, respectively. Moreover, flavour compounds such as phenethyl acetate, ethyl vanillate, ethyl acetate, isoamyl acetate, ethyl myristate, ethyl pentadecanoate, ethyl palmitate and phenylacetaldehyde produced by S3-5 were 2.26, 2.12, 2.87, 34.41, 6.32, 13.64, 2.23 and 78.85 times those of WT, respectively. S3-5 exhibited enhanced metabolic ability compared to the wild-type strain, with improved conversion of sugars to ethanol, metabolism of organic acids and formation of volatile compounds, especially esters. Moreover, S3-5 might be an ester-flavour type salt-tolerant yeast. © 2016 Society of Chemical Industry.
Brassica ASTRA: an integrated database for Brassica genomic research.

PubMed

Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David

2005-01-01

Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
The Ruby UCSC API: accessing the UCSC genome database using Ruby.

PubMed

Mishima, Hiroyuki; Aerts, Jan; Katayama, Toshiaki; Bonnal, Raoul J P; Yoshiura, Koh-ichiro

2012-09-21

The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/.
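The abstract mentions that interval queries use the bin index but does not describe it. As a rough illustration, the sketch below computes a bin number under the standard UCSC binning scheme (five levels, 128 kb smallest bins, each level eight times larger, valid for coordinates below 512 Mb). It is written in Python for illustration only and is not taken from the Ruby UCSC API; the function name `bin_from_range` is an assumption.

```python
# Offsets of the first bin at each level, from the smallest bins (128 kb)
# up to the single bin that spans 512 Mb.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 bases = 128 kb, the smallest bin size
BIN_NEXT_SHIFT = 3    # each coarser level holds 8 bins of the finer level


def bin_from_range(start, end):
    """Return the smallest bin that fully contains [start, end).

    Coordinates are 0-based and half-open, as in UCSC database tables.
    """
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Move to the next coarser level and try again.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("interval does not fit the standard 512 Mb scheme")


if __name__ == "__main__":
    print(bin_from_range(0, 1000))     # fits inside one 128 kb bin
    print(bin_from_range(0, 200000))   # crosses a 128 kb boundary
```

A feature stored with its bin number can then be retrieved by testing only the few bins that could overlap a query interval, rather than scanning every row for the chromosome.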

The Ruby UCSC API: accessing the UCSC genome database using Ruby

PubMed Central

2012-01-01

Background The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. Results The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index—if available—when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Conclusions Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/. PMID:22994508
The salinity tolerant poplar database (STPD): a comprehensive database for studying tree salt-tolerant adaption and poplar genomics.

PubMed

Ma, Yazhen; Xu, Ting; Wan, Dongshi; Ma, Tao; Shi, Sheng; Liu, Jianquan; Hu, Quanjun

2015-03-17

Soil salinity is a significant factor that impairs plant growth and agricultural productivity, and numerous efforts are underway to enhance salt tolerance of economically important plants. Populus species are widely cultivated for diverse uses. In particular, they grow in different habitats, from salty soil to mesophytic environments, and are therefore used as a model genus for elucidating physiological and molecular mechanisms of stress tolerance in woody plants. The Salinity Tolerant Poplar Database (STPD) is an integrative database for salt-tolerant poplar genome biology. Currently the STPD contains the Populus euphratica genome and its related genetic resources. P. euphratica, with its preference for salty habitats, has become a valuable genetic resource for the exploitation of tolerance characteristics in trees. This database contains curated data including genomic sequence, genes and gene functional information, non-coding RNA sequences, transposable elements, simple sequence repeats and single nucleotide polymorphism information for P. euphratica, gene expression data between P. euphratica and Populus tomentosa, and whole-genome alignments between Populus trichocarpa, P. euphratica and Salix suchowensis. The STPD provides useful searching and data mining tools, including the GBrowse genome browser, BLAST servers and a genome alignment viewer, which can be used to browse genome regions, identify similar sequences and visualize genome alignments. Datasets within the STPD can also be downloaded to perform local searches. A new Salinity Tolerant Poplar Database has been developed to assist studies of salt tolerance in trees and poplar genomics. The database will be continuously updated to incorporate new genome-wide data of related poplar species. This database will serve as an infrastructure for research on the molecular function of genes, comparative genomics, and evolution in closely related species, and will promote advances in molecular breeding within Populus. The STPD can be accessed at http://me.lzu.edu.cn/stpd/ .
CyanoBase: the cyanobacteria genome database update 2010.

PubMed

Nakao, Mitsuteru; Okamoto, Shinobu; Kohara, Mitsuyo; Fujishiro, Tsunakazu; Fujisawa, Takatomo; Sato, Shusei; Tabata, Satoshi; Kaneko, Takakazu; Nakamura, Yasukazu

2010-01-01

CyanoBase (http://genome.kazusa.or.jp/cyanobase) is the genome database for cyanobacteria, which are model organisms for photosynthesis. The database houses cyanobacteria species information, complete genome sequences, genome-scale experiment data, gene information, gene annotations and mutant information. In this version, we updated these datasets and improved the navigation and the visual display of the data views. In addition, a web service API now enables users to retrieve the data in various formats with other tools, seamlessly.
GenomeHubs: simple containerized setup of a custom Ensembl database and web server for any species

PubMed Central

Kumar, Sujai; Stevens, Lewis; Blaxter, Mark

2017-01-01

As the generation and use of genomic datasets is becoming increasingly common in all areas of biology, the need for resources to collate, analyse and present data from one or more genome projects is becoming more pressing. The Ensembl platform is a powerful tool to make genome data and cross-species analyses easily accessible through a web interface and a comprehensive application programming interface. Here we introduce GenomeHubs, which provide a containerized environment to facilitate the setup and hosting of custom Ensembl genome browsers. This simplifies mirroring of existing content and import of new genomic data into the Ensembl database schema. GenomeHubs also provide a set of analysis containers to decorate imported genomes with results of standard analyses and functional annotations and support export to flat files, including EMBL format for submission of assemblies and annotations to International Nucleotide Sequence Database Collaboration. Database URL: http://GenomeHubs.org PMID:28605774
DNA transformations of Candida tropicalis with replicating and integrative vectors.

PubMed

Sanglard, D; Fiechter, A

1992-12-01

The alkane-assimilating yeast Candida tropicalis was used as a host for DNA transformations. A stable ade2 mutant (Ha900) obtained by UV-mutagenesis was used as a recipient for different vectors carrying selectable markers. A first vector, pMK16, that was developed for the transformation of C. albicans and carries an ADE2 gene marker and a Candida autonomously replicating sequence (CARS) element promoting autonomous replication, was compatible for transforming Ha900. Two transformant types were observed: (i) pink transformants, which easily lose pMK16 under non-selective growth conditions; (ii) white transformants, in which the same plasmid exhibited a higher mitotic stability. In both cases pMK16 could be rescued from these cells in Escherichia coli. A second vector, pADE2, containing the isolated C. tropicalis ADE2 gene, was used to transform Ha900. This vector integrated in the yeast genome at homologous sites of the ade2 locus. Different integration types were observed at one or both ade2 alleles in single or in tandem repeats.
Identification of fungi in shotgun metagenomics datasets

PubMed Central

Donovan, Paul D.; Gonzalez, Gabriel; Higgins, Desmond G.

2018-01-01

Metagenomics uses nucleic acid sequencing to characterize species diversity in different niches such as environmental biomes or the human microbiome. Most studies have used 16S rRNA amplicon sequencing to identify bacteria. However, the decreasing cost of sequencing has resulted in a gradual shift away from amplicon analyses and towards shotgun metagenomic sequencing. Shotgun metagenomic data can be used to identify a wide range of species, but have rarely been applied to fungal identification. Here, we develop a sequence classification pipeline, FindFungi, and use it to identify fungal sequences in public metagenome datasets. We focus primarily on animal metagenomes, especially those from pig and mouse microbiomes. We identified fungi in 39 of 70 datasets comprising 71 fungal species. At least 11 pathogenic species with zoonotic potential were identified, including Candida tropicalis. We identified Pseudogymnoascus species from 13 Antarctic soil samples initially analyzed for the presence of bacteria capable of degrading diesel oil. We also show that Candida tropicalis and Candida loboi are likely the same species. In addition, we identify several examples where contaminating DNA was erroneously included in fungal genome assemblies. PMID:29444186
dBBQs: dataBase of Bacterial Quality scores.

PubMed

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-12-28

It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA and tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, and the results can be used for further analysis. In addition, the search results are shown in interactive JavaScript charts built with DC.js. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data for the four measurements that were combined to make the quality score for each genome are available and can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.
Association Between Fungal Contamination and Eye Bank-Prepared Endothelial Keratoplasty Tissue: Temperature-Dependent Risk Factors and Antifungal Supplementation of Optisol-Gentamicin and Streptomycin.

PubMed

Brothers, Kimberly M; Shanks, Robert M Q; Hurlbert, Susan; Kowalski, Regis P; Tu, Elmer Y

2017-11-01

Fungal contamination and infection from donor tissues processed for endothelial keratoplasty is a growing concern, prompting analysis of donor tissues after processing. The objectives were to determine whether eyebank-processed endothelial keratoplasty tissue is at higher risk of contamination than unprocessed tissue and to model the effect of room temperature exposure during eyebank processing on Candida growth in optisol-gentamicin and streptomycin (GS) with and without antifungal supplementation. An examination of the 2013 Eversight Eyebank Study follow-up database for risk factors associated with post-keratoplasty infection identified an increased risk of positive fungal rim culture results in tissue processed for endothelial keratoplasty vs unprocessed tissue. Processing steps at room temperature were hypothesized as a potential risk factor for promotion of fungal growth between these 2 processes. Candida albicans, Candida glabrata, and Candida parapsilosis endophthalmitis isolates were each inoculated into optisol-GS and subjected to 2 different room temperature incubation regimens reflective of current corneal tissue handling protocols. Eversight Eyebank Study outcomes and measures were follow-up inquiries from 6592 corneal transplants. Efficacy study outcomes and measures were fungal colony-forming units from inoculated vials of optisol-GS taken at 2 different processing temperatures. Donor rim culture results were 3 times more likely to be positive for fungi in endothelial keratoplasty-processed eyes (1.14%) than for other uses (0.37%) (difference, 0.77%; 95% CI, 0.17-1.37) (P = .009). In vitro, increased room temperature incubation of optisol-GS increased growth of Candida species over time. The addition of caspofungin and voriconazole decreased growth of Candida in a species-dependent manner.
Detectable Candida growth in donor rim cultures, associated with a higher rate of post keratoplasty infection, is seen in endothelial keratoplasty tissue vs other uses at the time of transplantation, likely owing in part to eyebank preparation processes extending the time of tissue warming. Reduced room temperature incubation and the addition of antifungal agents decreased growth of Candida species in optisol-GS and should be further explored to reduce the risk of infection.
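The "3 times more likely" statement can be checked against the two percentages quoted in the abstract. The short Python sketch below is only a worked arithmetic check; the variable names are illustrative, and it does not reproduce the study's confidence interval or P value, which depend on counts not given here.

```python
# Percentages of fungal-positive donor rim cultures quoted in the abstract.
processed_pct = 1.14  # endothelial keratoplasty-processed tissue
other_pct = 0.37      # tissue prepared for other uses

# Absolute difference in percentage points, rounded to match the abstract.
difference = round(processed_pct - other_pct, 2)

# Relative likelihood of a positive culture (processed vs other uses).
ratio = processed_pct / other_pct

print(f"difference = {difference} percentage points, ratio = {ratio:.2f}")
```

The ratio comes out slightly above 3, consistent with the abstract's wording.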
Plant Genome Resources at the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Smith-White, Brian; Chetvernin, Vyacheslav; Resenchuk, Sergei; Dombrowski, Susan M.; Pechous, Steven W.; Tatusova, Tatiana; Ostell, James

2005-01-01

The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez. A core Entrez database, Entrez Nucleotide, includes GenBank and is tightly linked to the NCBI Taxonomy database, the Entrez Protein database, and the scientific literature in PubMed. A suite of more specialized databases for genomes, genes, gene families, gene expression, gene variation, and protein domains dovetails with the core databases to make Entrez a powerful system for genomic research. Linked to the full range of Entrez databases is the NCBI Map Viewer, which displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes covered by the Map Viewer to be searched in tandem to produce a display of aligned maps from several species. PlantBLAST searches against the sequences shown in the Map Viewer allow BLAST alignments to be viewed within a genomic context. In addition, precomputed sequence similarities, such as those for proteins offered by BLAST Link, enable fluid navigation from unannotated to annotated sequences, quickening the pace of discovery. NCBI Web pages for plants, such as Plant Genome Central, complete the system by providing centralized access to NCBI's genomic resources as well as links to organism-specific Web pages beyond NCBI. PMID:16010002
CyanoBase: the cyanobacteria genome database update 2010

PubMed Central

Nakao, Mitsuteru; Okamoto, Shinobu; Kohara, Mitsuyo; Fujishiro, Tsunakazu; Fujisawa, Takatomo; Sato, Shusei; Tabata, Satoshi; Kaneko, Takakazu; Nakamura, Yasukazu

2010-01-01

CyanoBase (http://genome.kazusa.or.jp/cyanobase) is the genome database for cyanobacteria, which are model organisms for photosynthesis. The database houses cyanobacteria species information, complete genome sequences, genome-scale experiment data, gene information, gene annotations and mutant information. In this version, we updated these datasets and improved the navigation and the visual display of the data views. In addition, a web service API now enables users to retrieve the data in various formats with other tools, seamlessly. PMID:19880388
Identification and screening of potent antimicrobial peptides in arthropod genomes.

PubMed

Duwadi, Deepesh; Shrestha, Anishma; Yilma, Binyam; Kozlovski, Itamar; Sa-Eed, Munaya; Dahal, Nikesh; Jukosky, James

2018-05-01

Using tBLASTn and BLASTp searches, we queried recently sequenced arthropod genomes and expressed sequence tags (ESTs) using a database of known arthropod cecropins, defensins, and attacins. We identified and synthesized 6 potential AMPs and screened them for antimicrobial activity. Using radial diffusion assays and microtiter antimicrobial assays, we assessed the in vitro antimicrobial effects of these peptides against several human pathogens including Gram-positive and Gram-negative bacteria and fungi. We also conducted hemolysis assays to examine the cytotoxicity of these peptides to mammalian cells. Four of the six peptides identified showed antimicrobial effects in these assays. We also created truncated versions of these four peptides to assay their antimicrobial activity. Two cecropins derived from the monarch butterfly genome (Danaus plexippus), DAN1 and DAN2, showed minimum inhibitory concentrations (MICs) in the range of 2-16 μg/ml when screened against Gram-negative bacteria. HOLO1 and LOUDEF1, two defensin-like peptides derived from red flour beetle (Tribolium castaneum) and human body louse (Pediculus humanus humanus), respectively, exhibited MICs in the range of 13-25 μg/ml against Gram-positive bacteria. Furthermore, HOLO1 showed an MIC less than 5 μg/ml against the fungal species Candida albicans. These peptides exhibited no hemolytic activity at concentrations up to 200 μg/ml. The truncated peptides derived from DAN2 and HOLO1 showed very little antimicrobial activity. Our experiments show that the peptides DAN1, DAN2, HOLO1, and LOUDEF1 have potent antimicrobial activity in vitro against common human pathogens and do not lyse mammalian red blood cells, indicating their potential as templates for novel therapeutic agents against microbial infection. Copyright © 2018 Elsevier Inc. All rights reserved.
CottonDB: A resource for cotton genome research

USDA-ARS's Scientific Manuscript database

CottonDB (http://cottondb.org/) is a database and web resource for cotton genomic and genetic research. Created in 1995, CottonDB was among the first plant genome databases established by the USDA-ARS. Accessed through a website interface, the database aims to be a convenient, inclusive medium of ...
The Giardia genome project database.

PubMed

McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L

2000-08-15

The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.
Candida endocarditis: systematic literature review from 1997 to 2014 and analysis of 29 cases from the Italian Study of Endocarditis.

PubMed

Giuliano, Simone; Guastalegname, Maurizio; Russo, Alessandro; Falcone, Marco; Ravasio, Veronica; Rizzi, Marco; Bassetti, Matteo; Viale, Pierluigi; Pasticci, Maria Bruna; Durante-Mangoni, Emanuele; Venditti, Mario

2017-09-01

Candida endocarditis (CE) is a deadly disease. It is of paramount importance to assess risk factors for acquisition of both Candida native (NVE) and prosthetic (PVE) valve endocarditis and to relate clinical features and treatment strategies to the outcome of the disease. Areas covered: We searched the literature using the PubMed database. Cases of CE from the Italian Study on Endocarditis (SEI) were also included. Overall, 140 cases of CE were analyzed. Patients with a history of abdominal surgery and antibiotic exposure had a higher probability of developing NVE than PVE. In the PVE group, time to onset of CE was significantly shorter for biological prostheses than for mechanical prostheses. In the whole population, greater age and longer time to diagnosis were associated with increased likelihood of death. Patients with effective anti-biofilm (EAB) treatment, patients who underwent cardiac surgery and patients who were administered chronic suppressive antifungal treatment showed increased survival. For PVE, moderately active and highly active anti-biofilm treatment were associated with lower mortality. Expert commentary: Both NVE and PVE could be considered biofilm-related diseases, pathogenetically characterized by Candida intestinal translocation and initial transient candidemia. Cardiac surgery, EAB treatment and chronic suppressive therapy might be crucial in increasing patient survival.
MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level.

PubMed

Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem

2008-11-27

The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
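The distinction between backbone and variable segments can be illustrated with a small Python sketch. It is not MOSAIC's alignment method. It assumes, hypothetically, that each compared strain has already been reduced to a sorted list of non-overlapping half-open intervals on a reference genome where an alignment exists. Backbone segments are then the positions covered in every strain, and variable segments are the remainder. The function names and input layout are illustrative assumptions.

```python
def intersect(a, b):
    """Intersect two sorted lists of non-overlapping half-open intervals."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def backbone_and_variable(genome_length, aligned_by_strain):
    """Split the reference [0, genome_length) into backbone and variable parts.

    aligned_by_strain maps a strain name to the reference intervals that
    align to that strain (assumed non-overlapping within one strain).
    """
    backbone = [(0, genome_length)]
    for blocks in aligned_by_strain.values():
        backbone = intersect(backbone, sorted(blocks))
    # Variable segments are the gaps between backbone segments.
    variable, cursor = [], 0
    for start, end in backbone:
        if cursor < start:
            variable.append((cursor, start))
        cursor = end
    if cursor < genome_length:
        variable.append((cursor, genome_length))
    return backbone, variable


# Toy example with made-up coordinates for two compared strains.
backbone, variable = backbone_and_variable(
    100, {"strain_B": [(0, 40), (60, 100)], "strain_C": [(10, 70)]}
)
print(backbone)  # [(10, 40), (60, 70)]
print(variable)  # [(0, 10), (40, 60), (70, 100)]
```

Working at the level of coordinates rather than genes is what lets both coding and non-coding regions be classified, as the abstract notes.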
Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

PubMed

Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

2015-01-01

The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released since its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles, Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences, are also provided. Furthermore, genome and annotation information has been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. © The Author(s) 2015. Published by Oxford University Press.
The COG database: a tool for genome-scale analysis of protein functions and evolution

PubMed Central

Tatusov, Roman L.; Galperin, Michael Y.; Natale, Darren A.; Koonin, Eugene V.

2000-01-01

Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56–83% of the gene products from each of the complete bacterial and archaeal genomes and ~35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program that is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes. PMID:10592175
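One simple reading of "consistency of genome-specific best hits" is sketched below in Python. It looks for triples of proteins from three different genomes in which each protein is the best hit of the other two in its own genome. This is a minimal illustration under stated assumptions, not the COG construction procedure itself: the input layout (a `best_hit` dictionary keyed by query protein and target genome), the function name and the toy identifiers are all hypothetical.

```python
from itertools import combinations


def consistent_triangles(best_hit, genome_of):
    """Find triples of proteins with mutually consistent best hits.

    best_hit[(query, genome)] is the best-scoring protein for `query`
    in `genome`; genome_of[protein] is the genome encoding `protein`.
    """
    triangles = []
    for a, b, c in combinations(sorted(genome_of), 3):
        # The three proteins must come from three different genomes.
        if len({genome_of[a], genome_of[b], genome_of[c]}) < 3:
            continue
        pairs = [(a, b), (b, a), (a, c), (c, a), (b, c), (c, b)]
        # Every protein must be the best hit of the other two.
        if all(best_hit.get((q, genome_of[t])) == t for q, t in pairs):
            triangles.append(frozenset((a, b, c)))
    return triangles


# Toy data: a2 is a second protein in genome A whose best hits are b1 and c1,
# but b1 and c1 both prefer a1, so only a1-b1-c1 is consistent.
genome_of = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
best_hit = {
    ("a1", "B"): "b1", ("a1", "C"): "c1",
    ("a2", "B"): "b1", ("a2", "C"): "c1",
    ("b1", "A"): "a1", ("b1", "C"): "c1",
    ("c1", "A"): "a1", ("c1", "B"): "b1",
}
triangles = consistent_triangles(best_hit, genome_of)
print(triangles)
```

The brute-force loop over all triples is only practical for toy inputs; a genome-scale version would start from each protein's best hits instead of enumerating combinations.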
Genomics and the making of yeast biodiversity.

PubMed

Hittinger, Chris Todd; Rokas, Antonis; Bai, Feng-Yan; Boekhout, Teun; Gonçalves, Paula; Jeffries, Thomas W; Kominek, Jacek; Lachance, Marc-André; Libkind, Diego; Rosa, Carlos A; Sampaio, José Paulo; Kurtzman, Cletus P

2015-12-01

Yeasts are unicellular fungi that do not form fruiting bodies. Although the yeast lifestyle has evolved multiple times, most known species belong to the subphylum Saccharomycotina (syn. Hemiascomycota, hereafter yeasts). This diverse group includes the premier eukaryotic model system, Saccharomyces cerevisiae; the common human commensal and opportunistic pathogen, Candida albicans; and over 1000 other known species (with more continuing to be discovered). Yeasts are found in every biome and continent and are more genetically diverse than angiosperms or chordates. Ease of culture, simple life cycles, and small genomes (∼10-20Mbp) have made yeasts exceptional models for molecular genetics, biotechnology, and evolutionary genomics. Here we discuss recent developments in understanding the genomic underpinnings of the making of yeast biodiversity, comparing and contrasting natural and human-associated evolutionary processes. Only a tiny fraction of yeast biodiversity and metabolic capabilities has been tapped by industry and science. Expanding the taxonomic breadth of deep genomic investigations will further illuminate how genome function evolves to encode their diverse metabolisms and ecologies. Copyright © 2015 Elsevier Ltd. All rights reserved.
Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

PubMed

Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin

2018-04-12

Ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Given the importance of this plant to humans, an integrated omics resource is indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database (http://ginsengdb.snu.ac.kr/), the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. The Ginseng Genome Database can be accessed at http://ginsengdb.snu.ac.kr/.
MIPS PlantsDB: a database framework for comparative plant genome research

PubMed Central

Nussbaumer, Thomas; Martis, Mihaela M.; Roessner, Stephan K.; Pfeifer, Matthias; Bader, Kai C.; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel

2013-01-01

The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms, and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB–plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834–D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of Triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB. PMID:23203886
GenColors-based comparative genome databases for small eukaryotic genomes.

PubMed

Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

2013-01-01

Many sequence data repositories can give a quick and easily accessible overview of genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources provide a single entry point to related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.
MIPS plant genome information resources.

PubMed

Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

2007-01-01

The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.
Systematic Phenotyping of a Large-Scale Candida glabrata Deletion Collection Reveals Novel Antifungal Tolerance Genes

PubMed Central

Hiller, Ekkehard; Istel, Fabian; Tscherner, Michael; Brunke, Sascha; Ames, Lauren; Firon, Arnaud; Green, Brian; Cabral, Vitor; Marcet-Houben, Marina; Jacobsen, Ilse D.; Quintin, Jessica; Seider, Katja; Frohner, Ingrid; Glaser, Walter; Jungwirth, Helmut; Bachellier-Bassi, Sophie; Chauvel, Murielle; Zeidler, Ute; Ferrandon, Dominique; Gabaldón, Toni; Hube, Bernhard; d'Enfert, Christophe; Rupp, Steffen; Cormack, Brendan; Haynes, Ken; Kuchler, Karl

2014-01-01

The opportunistic fungal pathogen Candida glabrata is a frequent cause of candidiasis, causing infections ranging from superficial to life-threatening disseminated disease. The inherent tolerance of C. glabrata to azole drugs makes this pathogen a serious clinical threat. To identify novel genes implicated in antifungal drug tolerance, we have constructed a large-scale C. glabrata deletion library consisting of 619 unique, individually bar-coded mutant strains, each lacking one specific gene, altogether representing almost 12% of the genome. Functional analysis of this library in a series of phenotypic and fitness assays identified numerous genes required for growth of C. glabrata under normal or specific stress conditions, as well as a number of novel genes involved in tolerance to clinically important antifungal drugs such as azoles and echinocandins. We identified 38 deletion strains displaying strongly increased susceptibility to caspofungin, 28 of which lack genes encoding proteins that have not previously been linked to echinocandin tolerance. Our results demonstrate the potential of the C. glabrata mutant collection as a valuable resource in functional genomics studies of this important fungal pathogen of humans, and to facilitate the identification of putative novel antifungal drug target and virulence genes. PMID:24945925
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

PubMed Central

Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

2016-01-01

We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564
The Importance of Biological Databases in Biological Discovery.

PubMed

Baxevanis, Andreas D; Bateman, Alex

2015-06-19

Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.
CyanoClust: comparative genome resources of cyanobacteria and plastids.

PubMed

Sasaki, Naobumi V; Sato, Naoki

2010-01-01

Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Groups (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.
Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists

PubMed Central

Wiley, Laura K.; Sivley, R. Michael; Bush, William S.

2013-01-01

Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185
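The idea behind a nested containment list can be sketched in a few lines of Python. This is an in-memory illustration of the data structure only, not MyNCList's MySQL implementation; the class name, method names and half-open interval convention are assumptions made for the example. Intervals contained in another interval are moved into that interval's sublist, so within any one list both starts and ends increase, which allows a binary search for the first candidate followed by a short scan.

```python
class NCList:
    """Minimal nested containment list over half-open [start, end) intervals."""

    def __init__(self, intervals):
        # Sort by start ascending, then end descending, so that a containing
        # interval always precedes the intervals it contains.
        ordered = sorted(intervals, key=lambda iv: (iv[0], -iv[1]))
        self.root = []  # each node is (start, end, children)
        stack = []      # chain of currently open containing intervals
        for start, end in ordered:
            node = (start, end, [])
            # Pop until the top of the stack contains the new interval.
            while stack and end > stack[-1][1]:
                stack.pop()
            (stack[-1][2] if stack else self.root).append(node)
            stack.append(node)

    @staticmethod
    def _first_ending_after(nodes, pos):
        # Binary search: ends are increasing within one sibling list.
        lo, hi = 0, len(nodes)
        while lo < hi:
            mid = (lo + hi) // 2
            if nodes[mid][1] > pos:
                hi = mid
            else:
                lo = mid + 1
        return lo

    def query(self, qstart, qend):
        """Return all stored intervals overlapping [qstart, qend), sorted."""
        hits, pending = [], [self.root]
        while pending:
            nodes = pending.pop()
            i = self._first_ending_after(nodes, qstart)
            while i < len(nodes) and nodes[i][0] < qend:
                start, end, children = nodes[i]
                hits.append((start, end))
                if children:
                    pending.append(children)  # only descend into overlapping parents
                i += 1
        return sorted(hits)


# Toy annotations with made-up coordinates.
ncl = NCList([(1, 10), (2, 5), (3, 4), (6, 9), (12, 20), (15, 18)])
print(ncl.query(4, 7))  # [(1, 10), (2, 5), (6, 9)]
```

Sublists are searched only when their parent overlaps the query, which keeps the work proportional to the number of hits plus a logarithmic search per list visited.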
The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide

PubMed Central

Liolios, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Kyrpides, Nikos C.

2006-01-01

The Genomes On Line Database (GOLD) is a web resource for comprehensive access to information regarding complete and ongoing genome sequencing projects worldwide. The database currently incorporates information on over 1500 sequencing projects, of which 294 have been completed and the data deposited in the public databases. GOLD v.2 has been expanded to provide information related to organism properties such as phenotype, ecotype and disease. Furthermore, project relevance and availability information is now included. GOLD is available online and is also mirrored at the Institute of Molecular Biology and Biotechnology, Crete, Greece. PMID:16381880
The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

PubMed Central

Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.

2015-01-01

The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402
Pseudomonas Genome Database: facilitating user-friendly, comprehensive comparisons of microbial genomes.

PubMed

Winsor, Geoffrey L; Van Rossum, Thea; Lo, Raymond; Khaira, Bhavjinder; Whiteside, Matthew D; Hancock, Robert E W; Brinkman, Fiona S L

2009-01-01

Pseudomonas aeruginosa is a well-studied opportunistic pathogen that is particularly known for its intrinsic antimicrobial resistance, diverse metabolic capacity, and its ability to cause life threatening infections in cystic fibrosis patients. The Pseudomonas Genome Database (http://www.pseudomonas.com) was originally developed as a resource for peer-reviewed, continually updated annotation for the Pseudomonas aeruginosa PAO1 reference strain genome. In order to facilitate cross-strain and cross-species genome comparisons with other Pseudomonas species of importance, we have now expanded the database capabilities to include all Pseudomonas species, and have developed or incorporated methods to facilitate high quality comparative genomics. The database contains robust assessment of orthologs, a novel ortholog clustering method, and incorporates five views of the data at the sequence and annotation levels (Gbrowse, Mauve and custom views) to facilitate genome comparisons. A choice of simple and more flexible user-friendly Boolean search features allows researchers to search and compare annotations or sequences within or between genomes. Other features include more accurate protein subcellular localization predictions and a user-friendly, Boolean searchable log file of updates for the reference strain PAO1. This database aims to continue to provide a high quality, annotated genome resource for the research community and is available under an open source license.
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from A Systematic Annotation Package (ASAP), a database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
The Génolevures database.

PubMed

Martin, Tiphaine; Sherman, David J; Durrens, Pascal

2011-01-01

The Génolevures online database (URL: http://www.genolevures.org) stores and provides the data and results obtained by the Génolevures Consortium through several campaigns of genome annotation of the yeasts in the Saccharomycotina subphylum (hemiascomycetes). This database is dedicated to large-scale comparison of these genomes, storing not only the different chromosomal elements detected in the sequences, but also the logical relations between them. The database is divided into a public part, accessible to anyone through Internet, and a private part where the Consortium members make genome annotations with our Magus annotation system; this system is used to annotate several related genomes in parallel. The public database is widely consulted and offers structured data, organized using a REST web site architecture that allows for automated requests. The implementation of the database, as well as its associated tools and methods, is evolving to cope with the influx of genome sequences produced by Next Generation Sequencing (NGS). Copyright © 2011 Académie des sciences. Published by Elsevier SAS. All rights reserved.
The contribution of the S-phase checkpoint genes MEC1 and SGS1 to genome stability maintenance in Candida albicans

PubMed Central

Legrand, Melanie; Chan, Christine L.; Jauert, Peter A.; Kirkpatrick, David T.

2011-01-01

Genome rearrangements, a common feature of Candida albicans isolates, are often associated with the acquisition of antifungal drug resistance. In Saccharomyces cerevisiae, perturbations in the S-phase checkpoints result in the same sort of Gross Chromosomal Rearrangements (GCRs) observed in C. albicans. Several proteins are involved in the S. cerevisiae cell cycle checkpoints, including Mec1p, a protein kinase of the PIKK (phosphatidyl inositol 3-kinase-like kinase) family and the central player in the DNA damage checkpoint. Sgs1p, the ortholog of BLM, the Bloom’s syndrome gene, is a RecQ-related DNA helicase; cells from BLM patients are characterized by an increase in genome instability. C. albicans strains bearing deletions in MEC1 or SGS1 are viable (in contrast to the inviability seen with loss of MEC1 in S. cerevisiae), but the different deletion mutants have significantly different phenotypes. The mec1Δ/Δ colonies have a wild-type colony morphology, while the sgs1Δ/Δ mutants are slow-growing, producing wrinkled colonies with pseudohyphal-like cells. The mec1Δ/Δ mutants are sensitive only to ethylmethane sulfonate (EMS), methylmethane sulfonate (MMS), and hydroxyurea (HU), but the sgs1Δ/Δ mutants exhibit a high sensitivity to all DNA-damaging agents tested. In an assay for chromosome 1 integrity, the mec1Δ/Δ mutants exhibit an increase in genome instability; no change was observed in the sgs1Δ/Δ mutants. Finally, loss of MEC1 does not affect sensitivity to the antifungal drug fluconazole, while loss of SGS1 leads to an increased susceptibility to fluconazole. Neither deletion elevated the level of antifungal drug resistance acquisition. PMID:21511048
Candida infective endocarditis: an observational cohort study with a focus on therapy.

PubMed

Arnold, Christopher J; Johnson, Melissa; Bayer, Arnold S; Bradley, Suzanne; Giannitsioti, Efthymia; Miró, José M; Tornos, Pilar; Tattevin, Pierre; Strahilevitz, Jacob; Spelman, Denis; Athan, Eugene; Nacinovich, Francisco; Fortes, Claudio Q; Lamas, Cristiane; Barsic, Bruno; Fernández-Hidalgo, Nuria; Muñoz, Patricia; Chu, Vivian H

2015-04-01

Candida infective endocarditis is a rare disease with a high mortality rate. Our understanding of this infection is derived from case series, case reports, and small prospective cohorts. The purpose of this study was to evaluate the clinical features and use of different antifungal treatment regimens for Candida infective endocarditis. This prospective cohort study was based on 70 cases of Candida infective endocarditis from the International Collaboration on Endocarditis (ICE)-Prospective Cohort Study and ICE-Plus databases collected between 2000 and 2010. The majority of infections were acquired nosocomially (67%). Congestive heart failure (24%), prosthetic heart valve (46%), and previous infective endocarditis (26%) were common comorbidities. Overall mortality was high, with 36% mortality in the hospital and 59% at 1 year. On univariate analysis, older age, heart failure at baseline, persistent candidemia, nosocomial acquisition, heart failure as a complication, and intracardiac abscess were associated with higher mortality. Mortality was not affected by use of surgical therapy or choice of antifungal agent. A subgroup analysis was performed on 33 patients for whom specific antifungal therapy information was available. In this subgroup, 11 patients received amphotericin B-based therapy and 14 received echinocandin-based therapy. Despite a higher percentage of older patients and nosocomial infection in the echinocandin group, mortality rates were similar between the two groups. In conclusion, Candida infective endocarditis is associated with a high mortality rate that was not impacted by choice of antifungal therapy or by adjunctive surgical intervention. Additionally, echinocandin therapy was as effective as amphotericin B-based therapy in the small subgroup analysis. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Candida Infective Endocarditis: an Observational Cohort Study with a Focus on Therapy

PubMed Central

Johnson, Melissa; Bayer, Arnold S.; Bradley, Suzanne; Giannitsioti, Efthymia; Miró, José M.; Tornos, Pilar; Tattevin, Pierre; Strahilevitz, Jacob; Spelman, Denis; Athan, Eugene; Nacinovich, Francisco; Fortes, Claudio Q.; Lamas, Cristiane; Barsic, Bruno; Fernández-Hidalgo, Nuria; Muñoz, Patricia; Chu, Vivian H.

2015-01-01

Candida infective endocarditis is a rare disease with a high mortality rate. Our understanding of this infection is derived from case series, case reports, and small prospective cohorts. The purpose of this study was to evaluate the clinical features and use of different antifungal treatment regimens for Candida infective endocarditis. This prospective cohort study was based on 70 cases of Candida infective endocarditis from the International Collaboration on Endocarditis (ICE)-Prospective Cohort Study and ICE-Plus databases collected between 2000 and 2010. The majority of infections were acquired nosocomially (67%). Congestive heart failure (24%), prosthetic heart valve (46%), and previous infective endocarditis (26%) were common comorbidities. Overall mortality was high, with 36% mortality in the hospital and 59% at 1 year. On univariate analysis, older age, heart failure at baseline, persistent candidemia, nosocomial acquisition, heart failure as a complication, and intracardiac abscess were associated with higher mortality. Mortality was not affected by use of surgical therapy or choice of antifungal agent. A subgroup analysis was performed on 33 patients for whom specific antifungal therapy information was available. In this subgroup, 11 patients received amphotericin B-based therapy and 14 received echinocandin-based therapy. Despite a higher percentage of older patients and nosocomial infection in the echinocandin group, mortality rates were similar between the two groups. In conclusion, Candida infective endocarditis is associated with a high mortality rate that was not impacted by choice of antifungal therapy or by adjunctive surgical intervention. Additionally, echinocandin therapy was as effective as amphotericin B-based therapy in the small subgroup analysis. PMID:25645855
Genetic and phenotypic intra-species variation in Candida albicans

PubMed Central

Hirakawa, Matthew P.; Martinez, Diego A.; Sakthikumar, Sharadha; Anderson, Matthew Z.; Berlin, Aaron; Gujja, Sharvari; Zeng, Qiandong; Zisson, Ethan; Wang, Joshua M.; Greenberg, Joshua M.; Berman, Judith

2015-01-01

Candida albicans is a commensal fungus of the human gastrointestinal tract and a prevalent opportunistic pathogen. To examine diversity within this species, extensive genomic and phenotypic analyses were performed on 21 clinical C. albicans isolates. Genomic variation was evident in the form of polymorphisms, copy number variations, chromosomal inversions, subtelomeric hypervariation, loss of heterozygosity (LOH), and whole or partial chromosome aneuploidies. All 21 strains were diploid, although karyotypic changes were present in eight of the 21 isolates, with multiple strains being trisomic for Chromosome 4 or Chromosome 7. Aneuploid strains exhibited a general fitness defect relative to euploid strains when grown under replete conditions. All strains were also heterozygous, yet multiple, distinct LOH tracts were present in each isolate. Higher overall levels of genome heterozygosity correlated with faster growth rates, consistent with increased overall fitness. Genes with the highest rates of amino acid substitutions included many cell wall proteins, implicating fast evolving changes in cell adhesion and host interactions. One clinical isolate, P94015, presented several striking properties including a novel cellular phenotype, an inability to filament, drug resistance, and decreased virulence. Several of these properties were shown to be due to a homozygous nonsense mutation in the EFG1 gene. Furthermore, loss of EFG1 function resulted in increased fitness of P94015 in a commensal model of infection. Our analysis therefore reveals intra-species genetic and phenotypic differences in C. albicans and delineates a natural mutation that alters the balance between commensalism and pathogenicity. PMID:25504520
Importation, Mitigation, and Genomic Epidemiology of Candida auris at a Large Teaching Hospital.

PubMed

Lesho, Emil P; Bronstein, Melissa Z; McGann, Patrick; Stam, Jason; Kwak, Yoon; Maybank, Rosslyn; McNamara, Jodi; Callahan, Megan; Campbell, Jean; Hinkle, Mary K; Walsh, Edward E

2018-01-01

OBJECTIVE Candida auris (CA) is an emerging multidrug-resistant pathogen associated with increased mortality. The environment may play a role, but transmission dynamics remain poorly understood. We sought to limit environmental and patient CA contamination following a sustained unsuspected exposure. DESIGN Quasi-experimental observation. SETTING A 528-bed teaching hospital. PATIENTS The index case patient and 17 collocated ward mates. INTERVENTION Immediately after confirmation of CA in the bloodstream and urine of a patient admitted 6 days previously, active surveillance, enhanced transmission-based precautions, environmental cleaning with peracetic acid-hydrogen peroxide and ultraviolet light, and patient relocation were undertaken. Pre-existing agreements and foundational relationships among internal multidisciplinary teams and external partners were leveraged to bolster detection and mitigation efforts and to provide genomic epidemiology. RESULTS Candida auris was isolated from 3 of 132 surface samples on days 8, 9, and 15 of ward occupancy, and from no patient samples (0 of 48). Environmental and patient isolates were genetically identical (4-8 single-nucleotide polymorphisms [SNPs]) and most closely related to the 2013 India CA-6684 strain (~200 SNPs), supporting the epidemiological hypothesis that the source of environmental contamination was the index case patient, who probably acquired the South Asian strain from another New York hospital. All isolates contained a mutation associated with azole resistance (K163R) found in the India 2105 VPCI strain but not in CA-6684. The index patient remained colonized until death. No surfaces were CA-positive 1 month later. CONCLUSION Compared to previous descriptions, CA dissemination was minimal. Immediate access to rapid CA diagnostics facilitates early containment strategies and outbreak investigations. Infect Control Hosp Epidemiol 2018;39:53-57.
BGD: a database of bat genomes.

PubMed

Fang, Jianfei; Wang, Xuan; Mu, Shuo; Zhang, Shuyi; Dong, Dong

2015-01-01

Bats account for ~20% of mammalian species and are the only mammals capable of true powered flight. Because of their specialized phenotypic traits, much research has been devoted to examining the evolution of bats. Whole genome sequences of several bats have now been assembled and annotated; however, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with the bat genomes accessible to the general biological community, we established the Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data on bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using an efficient search engine, and the portal offers browsable tracks of bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool is provided to facilitate online phylogenetic study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of bat evolution and benefit the analysis of bat sequences. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.

The UCSC Genome Browser database: extensions and updates 2013.

PubMed

Meyer, Laurence R; Zweig, Ann S; Hinrichs, Angie S; Karolchik, Donna; Kuhn, Robert M; Wong, Matthew; Sloan, Cricket A; Rosenbloom, Kate R; Roe, Greg; Rhead, Brooke; Raney, Brian J; Pohl, Andy; Malladi, Venkat S; Li, Chin H; Lee, Brian T; Learned, Katrina; Kirkup, Vanessa; Hsu, Fan; Heitner, Steve; Harte, Rachel A; Haeussler, Maximilian; Guruvadoo, Luvina; Goldman, Mary; Giardine, Belinda M; Fujita, Pauline A; Dreszer, Timothy R; Diekhans, Mark; Cline, Melissa S; Clawson, Hiram; Barber, Galt P; Haussler, David; Kent, W James

2013-01-01

The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation 'tracks' are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.
The Yak genome database: an integrative database for studying yak biology and high-altitude adaption

PubMed Central

2012-01-01

Background The yak (Bos grunniens) is a long-haired bovine that lives at high altitudes and is an important source of milk, meat, fiber and fuel. The recent sequencing, assembly and annotation of its genome are expected to further our understanding of the means by which it has adapted to life at high altitudes and its ecologically important traits. Description The Yak Genome Database (YGD) is an internet-based resource that provides access to genomic sequence data and predicted functional information concerning the genes and proteins of Bos grunniens. The curated data stored in the YGD includes genome sequences, predicted genes and associated annotations, non-coding RNA sequences, transposable elements, single nucleotide variants, and three-way whole-genome alignments between human, cattle and yak. YGD offers useful searching and data mining tools, including the ability to search for genes by name or using function keywords as well as GBrowse genome browsers and/or BLAST servers, which can be used to visualize genome regions and identify similar sequences. Sequence data from the YGD can also be downloaded to perform local searches. Conclusions A new yak genome database (YGD) has been developed to facilitate studies on high-altitude adaption and bovine genomics. The database will be continuously updated to incorporate new information such as transcriptome data and population resequencing data. The YGD can be accessed at http://me.lzu.edu.cn/yak. PMID:23134687
Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

PubMed

Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

2015-01-01

Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes relatively quickly and easily, at a substantially reduced cost per nucleotide and hence per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, it is challenging to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors: it compares the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and the genome reports, or if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation of large-scale genome data prior to downstream analyses.
Assembly: a resource for assembled genomes at NCBI

PubMed Central

Kitts, Paul A.; Church, Deanna M.; Thibaud-Nissen, Françoise; Choi, Jinna; Hem, Vichet; Sapojnikov, Victor; Smith, Robert G.; Tatusova, Tatiana; Xiang, Charlie; Zherikov, Andrey; DiCuccio, Michael; Murphy, Terence D.; Pruitt, Kim D.; Kimchi, Avi

2016-01-01

The NCBI Assembly database (www.ncbi.nlm.nih.gov/assembly/) provides stable accessioning and data tracking for genome assembly data. The model underlying the database can accommodate a range of assembly structures, including sets of unordered contig or scaffold sequences, bacterial genomes consisting of a single complete chromosome, or complex structures such as a human genome with modeled allelic variation. The database provides an assembly accession and version to unambiguously identify the set of sequences that make up a particular version of an assembly, and tracks changes to updated genome assemblies. The Assembly database reports metadata such as assembly names, simple statistical reports of the assembly (number of contigs and scaffolds, contiguity metrics such as contig N50, total sequence length and total gap length) as well as the assembly update history. The Assembly database also tracks the relationship between an assembly submitted to the International Nucleotide Sequence Database Consortium (INSDC) and the assembly represented in the NCBI RefSeq project. Users can find assemblies of interest by querying the Assembly Resource directly or by browsing available assemblies for a particular organism. Links in the Assembly Resource allow users to easily download sequence and annotations for current versions of genome assemblies from the NCBI genomes FTP site. PMID:26578580
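The contig N50 statistic mentioned in the abstract above can be computed with a few lines of code. The sketch below is not NCBI's implementation; the function name n50 and the sample lengths are assumptions. It uses the common definition: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the total assembly length.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths; the total is 300, so half is 150.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
example_n50 = n50(example_lengths)  # 80 + 70 = 150 reaches half, so N50 is 70
```

The same function applies to scaffold lengths to obtain a scaffold N50, since only the list of lengths changes.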
WheatGenome.info: an integrated database and portal for wheat genome information.

PubMed

Lai, Kaitao; Berkman, Paul J; Lorenc, Michal Tadeusz; Duran, Chris; Smits, Lars; Manoli, Sahana; Stiller, Jiri; Edwards, David

2012-02-01

Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
Genome databases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Courteau, J.

1991-10-11

Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development, including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.
HOWDY: an integrated database system for human genome research

PubMed Central

Hirakawa, Mika

2002-01-01

HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search. PMID:11752279
Nencki Genomics Database—Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs

PubMed Central

Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

2013-01-01

We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface. Database URL: http://www.nencki-genomics.org. PMID:24089456
Specialized microbial databases for inductive exploration of microbial genome sequences

PubMed Central

Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine

2005-01-01

Background. The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods. The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subqueries, have been implemented. Results. Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion. This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList for Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList for Staphylococcus epidermidis) associated with related organisms for comparison. PMID:15698474
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.

PubMed

Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi

2018-01-01

We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
The transcriptome of Candida albicans mitochondria and the evolution of organellar transcription units in yeasts.

PubMed

Kolondra, Adam; Labedzka-Dmoch, Karolina; Wenda, Joanna M; Drzewicka, Katarzyna; Golik, Pawel

2015-10-21

Yeasts show remarkable variation in the organization of their mitochondrial genomes, yet there is little experimental data on organellar gene expression outside a few model species. Candida albicans is interesting as a human pathogen, and as a representative of a clade that is distant from the model yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe. Unlike them, it encodes seven Complex I subunits in its mtDNA. No experimental data regarding organellar expression were available prior to this study. We used high-throughput RNA sequencing and traditional RNA biology techniques to study the mitochondrial transcriptome of C. albicans strains BWP17 and SN148. The 14 protein-coding genes, two ribosomal RNA genes, and 24 tRNA genes are expressed as eight primary polycistronic transcription units. We also found transcriptional activity in the noncoding regions, and antisense transcripts that could be a part of a regulatory mechanism. The promoter sequence is a variant of the nonanucleotide identified in other yeast mtDNAs, but some of the active promoters show significant departures from the consensus. The primary transcripts are processed by a tRNA punctuation mechanism into the monocistronic and bicistronic mature RNAs. The steady state levels of various mature transcripts exhibit large differences that are a result of posttranscriptional regulation. Transcriptome analysis allowed us to precisely annotate the positions of introns in the RNL (2), COB (2) and COX1 (4) genes, as well as to refine the annotation of tRNAs and rRNAs. Comparative study of the mitochondrial genome organization in various Candida species indicates that they undergo shuffling in blocks usually containing 2-3 genes, and that their arrangement in primary transcripts is not conserved. tRNA genes with their associated promoters, as well as GC-rich sequence elements, play an important role in these evolutionary events.
The main evolutionary force shaping the mitochondrial genomes of yeasts is frequent recombination, constantly breaking apart and joining genes into novel primary transcription units. The mitochondrial transcription units are constantly rearranged in evolution, shaping the features of gene expression, such as the presence of secondary promoter sites that are inactive or act as "booster" promoters, simplified transcriptional regulation and reliance on posttranscriptional mechanisms.
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

PubMed

Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

2016-01-04

We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Comparative genomics of xylose-fermenting fungi for enhanced biofuel production

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wohlbach, Dana J.; Kuo, Alan; Sato, Trey K.

Cellulosic biomass is an abundant and underused substrate for biofuel production. The inability of many microbes to metabolize the pentose sugars abundant within hemicellulose creates specific challenges for microbial biofuel production from cellulosic material. Although engineered strains of Saccharomyces cerevisiae can use the pentose xylose, the fermentative capacity pales in comparison with glucose, limiting the economic feasibility of industrial fermentations. To better understand xylose utilization for subsequent microbial engineering, we sequenced the genomes of two xylose-fermenting, beetle-associated fungi, Spathaspora passalidarum and Candida tenuis. To identify genes involved in xylose metabolism, we applied a comparative genomic approach across 14 Ascomycete genomes, mapping phenotypes and genotypes onto the fungal phylogeny, and measured genomic expression across five Hemiascomycete species with different xylose-consumption phenotypes. This approach implicated many genes and processes involved in xylose assimilation. Several of these genes significantly improved xylose utilization when engineered into S. cerevisiae, demonstrating the power of comparative methods in rapidly identifying genes for biomass conversion while reflecting on fungal ecology.
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

PubMed

Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

2016-01-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from the National Center for Biotechnology Information (NCBI), annotated using the RAST server and then stored in a MySQL database. The protein-coding genes were further analyzed to obtain information such as GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house bioinformatics tools implemented in NeisseriaBase were developed using Python, Perl, BioPerl and R. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes.
NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my.
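The Methods above mention computing GC content (%) for protein-coding genes with in-house Perl scripts, which are not shown in the abstract. As a minimal sketch of that kind of calculation, written in Python rather than Perl and with a hypothetical function name, one could write the following. Ignoring ambiguous bases such as N is an assumption of this example, not something the abstract states.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are skipped; an empty or fully ambiguous
    sequence yields 0.0 rather than raising an error.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy sequence for illustration only, not taken from NeisseriaBase.
    print(gc_content("ATGGCGTTAGCC"))
```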
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

PubMed Central

Zheng, Wenning; Mutha, Naresh V.R.; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S.; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah

2016-01-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from the National Center for Biotechnology Information (NCBI), annotated using the RAST server and then stored in a MySQL database. The protein-coding genes were further analyzed to obtain information such as GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house bioinformatics tools implemented in NeisseriaBase were developed using Python, Perl, BioPerl and R. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes.
NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950
Evaluation of Micronuclei, Nuclear Anomalies and the Nuclear/Cytoplasmic Ratio of Exfoliated Cervical Epithelial Cells in Genital Candidiasis.

PubMed

Safi Oz, Zehra; Dogan Gun, Banu; Ozdamar, Sukru Oguz

2015-01-01

Candida is the most common cause of fungal infections. The aim of this study was to fill the gaps in the current knowledge on the frequencies of micronuclei and nuclear anomalies, and the nucleus/cytoplasmic ratio in genital candidiasis. A total of 88 Papanicolaou- stained cervical smears, which comprised Candida spp. (n = 44) and control cases with no infectious agent (n = 44), were studied. In each smear, cells with micronuclei and nuclear anomalies were counted in 1,000 epithelial cells and also nuclear and cellular areas were evaluated using image analysis software at a magnification of ×400. The frequencies of micronucleated and binucleated cells and cells with perinuclear halos, and the nucleus/cytoplasmic ratio of epithelial cells were higher in the Candida-infected group compared with the control group (p < 0.05). Genital candidiasis is able to induce changes in the size and shape of epithelial cells. The nuclear/cytoplasmic ratio and the frequency of micronuclei may reflect the DNA damage in the cervical epithelium. Micronucleus scoring could be used to screen the genomic damage profile of epithelial cells in candidiasis. © 2015 S. Karger AG, Basel.
CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data.

PubMed

Hallin, Peter F; Ussery, David W

2004-12-12

Currently, new bacterial genomes are being published on a monthly basis. With the growing amount of genome sequence data, there is a demand for a flexible and easy-to-maintain structure for storing sequence data and results from bioinformatic analysis. More than 150 sequenced bacterial genomes are now available, and comparisons of properties for taxonomically similar organisms are not readily available to many biologists. In addition to the most basic information, such as AT content, chromosome length, tRNA count and rRNA count, a large number of more complex calculations are needed to perform detailed comparative genomics. DNA structural calculations like curvature and stacking energy, DNA compositions like base skews, oligo skews and repeats at the local and global level are just a few of the analyses that are presented on the CBS Genome Atlas Web page. Complex analysis, changing methods and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these results amount to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present dynamic web content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues. A web-based user interface which is dynamically linked to the Genome Atlas Database can be accessed via www.cbs.dtu.dk/services/GenomeAtlas/.
This paper has a supplemental information page which links to the examples presented: www.cbs.dtu.dk/services/GenomeAtlas/suppl/bioinfdatabase.
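The abstract above lists AT content and base skews among the per-genome properties stored in the database. The sketch below shows one common way to compute such values in Python. It is not the CBS pipeline (which the abstract describes as built on Make, csh, Perl and MySQL), and the function names, the skew definition (G - C) / (G + C) and the non-overlapping windows are assumptions chosen for illustration.

```python
from typing import List


def at_content(seq: str) -> float:
    """Fraction of A and T among unambiguous bases (0.0 for empty input)."""
    s = seq.upper()
    total = sum(s.count(b) for b in "ACGT")
    if total == 0:
        return 0.0
    return (s.count("A") + s.count("T")) / total


def gc_skew(seq: str) -> float:
    """GC skew, defined here as (G - C) / (G + C); 0.0 if no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq: str, window: int, step: int) -> List[float]:
    """GC skew for each full window along the sequence (local level)."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    toy = "GGGCATATGCCC"  # toy sequence, not real genome data
    print(at_content(toy), gc_skew(toy), windowed_gc_skew(toy, 4, 4))
```

Computing the skew both for the whole sequence and per window mirrors the "local and global level" distinction made in the abstract.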
The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reddy, Tatiparthi B. K.; Thomas, Alex D.; Stamatis, Dimitri

The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.
MagnaportheDB: a federated solution for integrating physical and genetic map data with BAC end derived sequences for the rice blast fungus Magnaporthe grisea.

PubMed

Martin, Stanton L; Blackmon, Barbara P; Rajagopalan, Ravi; Houfek, Thomas D; Sceeles, Robert G; Denn, Sheila O; Mitchell, Thomas K; Brown, Douglas E; Wing, Rod A; Dean, Ralph A

2002-01-01

We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. It also gives researchers mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.
Retrovirus Integration Database (RID): a public database for retroviral insertion sites into host genomes.

PubMed

Shao, Wei; Shan, Jigui; Kearney, Mary F; Wu, Xiaolin; Maldarelli, Frank; Mellors, John W; Luke, Brian; Coffin, John M; Hughes, Stephen H

2016-07-04

The NCI Retrovirus Integration Database is a MySQL-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to UCSC genome browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. Available at https://rid.ncifcrf.gov or http://home.ncifcrf.gov/hivdrp/resources.htm.

Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

PubMed

Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

2016-01-04

The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

PubMed

Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

2015-01-01

The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Updates to the Cool Season Food Legume Genome Database: Resources for pea, lentil, faba bean and chickpea genetics, genomics and breeding

USDA-ARS's Scientific Manuscript database

The Cool Season Food Legume Genome database (CSFL, www.coolseasonfoodlegume.org) is an online resource for genomics, genetics, and breeding research for chickpea, lentil, pea, and faba bean. The user-friendly and curated website allows for all publicly available map, marker, trait, gene, transcript, ger...
Integrated Database And Knowledge Base For Genomic Prospective Cohort Study In Tohoku Medical Megabank Toward Personalized Prevention And Medicine.

PubMed

Ogishima, Soichi; Takai, Takako; Shimokawa, Kazuro; Nagaie, Satoshi; Tanaka, Hiroshi; Nakaya, Jun

2015-01-01

The Tohoku Medical Megabank project is a national project to revitalize the area of the Tohoku region struck by the Great East Japan Earthquake, and has conducted a large-scale prospective genome-cohort study. Along with the prospective genome-cohort study, we have developed an integrated database and knowledge base, which will be a key database for realizing personalized prevention and medicine.
Haemophilus influenzae Genome Database (HIGDB): a single point web resource for Haemophilus influenzae.

PubMed

Swetha, Rayapadi G; Kala Sekar, Dinesh Kumar; Ramaiah, Sudha; Anbarasu, Anand; Sekar, Kanagaraj

2014-12-01

Haemophilus influenzae (H. influenzae) is the causative agent of pneumonia, bacteraemia and meningitis. The organism is responsible for a large number of deaths in both developed and developing countries. Even though the first bacterial genome to be sequenced was that of H. influenzae, there is no exclusive database dedicated to H. influenzae. This prompted us to develop the Haemophilus influenzae Genome Database (HIGDB). All data of HIGDB are stored and managed in a MySQL database. The HIGDB is hosted on a Solaris server and developed using Perl modules. Ajax and JavaScript are used for the interface development. The HIGDB contains detailed information on 42,741 proteins, 18,077 genes including 10 whole genome sequences and also 284 three-dimensional structures of proteins of H. influenzae. In addition, the database provides "Motif search" and "GBrowse". The HIGDB is freely accessible through the URL: http://bioserver1.physics.iisc.ernet.in/HIGDB/. The HIGDB will be a single point of access for bacteriological, clinical, genomic and proteomic information on H. influenzae. The database can also be used to identify DNA motifs within H. influenzae genomes and to compare gene or protein sequences of a particular strain with other strains of H. influenzae. Copyright © 2014 Elsevier Ltd. All rights reserved.
Candida-induced prosthetic joint infection. A literature review including 72 cases and a case report.

PubMed

Cobo, Fernando; Rodríguez-Granger, Javier; López, Enrique M; Jiménez, Gemma; Sampedro, Antonio; Aliaga-Martínez, Luis; Navarro-Marí, José María

2017-02-01

The clinical and microbiological characteristics of prosthetic joint infection (PJI) caused by Candida species are described, including 72 cases in the literature and a case of Candida glabrata infection handled at the present centre. We describe one patient and, using the key words 'fungal prosthetic joint infection' and 'candida prosthetic joint infection', searched MEDLINE (National Library of Medicine, Bethesda, MD), Web of Science, CINAHL and Cochrane systematic review databases for case reports of this condition. Out of the 73 patients, 38 were female; mean age at diagnosis was 65.7 (± SD 18) yrs; 50 had risk factors for candidal infection such as systemic disease (e.g. rheumatoid arthritis, Sjogren's syndrome, systemic lupus erythematosus) and/or immunosuppressive therapy in 18 (24.6%) cases, diabetes mellitus in 14 (19.1%), immunosuppression due to malignant or chronic disease in 24 (32.8%) and long-term antibiotic use in four (5.4%) patients. Infection site was the knee in 36 patients and hip in 35; pain was present in 43 patients and swelling in 23 and the mean surgery-diagnosis interval was 32 months. The most frequent species was C. albicans, followed by C. parapsilosis. The diagnosis was obtained from joint fluid aspirate in 33 cases and intra-operative samples in 16. Susceptibility to antifungals was tested in only 21 isolates. The most frequently used antifungals were fluconazole and amphotericin B. Two-stage exchange arthroplasty was performed in 30 patients and resection arthroplasty in 31; 56 patients were cured with a combination of medical and surgical treatment; one patient died from the infection. PJI caused by Candida requires a high index of suspicion; surgery with long-term antifungal therapy is recommended.
MaizeGDB, the maize model organism database

USDA-ARS's Scientific Manuscript database

MaizeGDB is the maize research community's database for maize genetic and genomic information. In this seminar I will outline our current endeavors including a full website redesign, the status of maize genome assembly and annotation projects, and work toward genome functional annotation. Mechanis...
Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome

PubMed Central

Kim, Woonsu; Park, Hyesun; Seo, Seongwon

2016-01-01

The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline for amalgamating the genomic annotations of an organism from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle.
PMID:26992093
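The gene counts reported in the cattle abstract above can be checked with a few lines of arithmetic. The following is only a sketch: the dictionary keys and function names are illustrative labels chosen here, not identifiers from the study's pipeline.

```python
# Per-source gene counts as reported in the abstract above.
# The key names are illustrative labels, not identifiers from the study.
GENE_SOURCES = {
    "consensus (NCBI and Ensembl)": 19_123,
    "NCBI only": 8_410,
    "Ensembl only": 5_493,
    "NCBI scaffolds": 266,
}


def total_genes(sources):
    """Sum the per-source gene counts."""
    return sum(sources.values())


def share(sources):
    """Return each source's share of the total, in percent, rounded to 0.1."""
    total = total_genes(sources)
    return {name: round(100 * n / total, 1) for name, n in sources.items()}


if __name__ == "__main__":
    print("total genes:", total_genes(GENE_SOURCES))
    for name, pct in share(GENE_SOURCES).items():
        print(f"{name}: {pct}%")
```

The four categories add up to the stated total of 33,292 genes, so the breakdown in the abstract is internally consistent.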
Human Mitochondrial Protein Database

National Institute of Standards and Technology Data Gateway

SRD 131 Human Mitochondrial Protein Database (Web, free access) The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool not only to aid in studying the mitochondrion but in studying the associated diseases.
Design and implementation of the cacao genome database

USDA-ARS's Scientific Manuscript database

The Cacao Genome Database (CGD, www.cacaogenomedb.org) is being developed to provide a comprehensive data mining resource of genomic, genetic and breeding data for Theobroma cacao. Designed using Chado and a collection of Drupal modules, known as Tripal, CGD currently contains the genetically anchor...
Uniform standards for genome databases in forest and fruit trees

USDA-ARS's Scientific Manuscript database

TreeGenes and tfGDR serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for the submission and recovery of this information in order to enable comparative genomics research. Large-scale genotype...
SoyBase, The USDA-ARS Soybean Genetics and Genomics Database

USDA-ARS's Scientific Manuscript database

SoyBase, the USDA-ARS soybean genetic database, is a comprehensive repository for professionally curated genetics, genomics and related data resources for soybean. SoyBase contains the most current genetic, physical and genomic sequence maps integrated with qualitative and quantitative traits. The...
Genome-wide association as a means to understanding the mammary gland

USDA-ARS's Scientific Manuscript database

Next-generation sequencing and related technologies have facilitated the creation of enormous public databases that catalogue genomic variation. These databases have facilitated a variety of approaches to discover new genes that regulate normal biology as well as disease. Genome wide association (...
Genetic and phenotypic intra-species variation in Candida albicans.

PubMed

Hirakawa, Matthew P; Martinez, Diego A; Sakthikumar, Sharadha; Anderson, Matthew Z; Berlin, Aaron; Gujja, Sharvari; Zeng, Qiandong; Zisson, Ethan; Wang, Joshua M; Greenberg, Joshua M; Berman, Judith; Bennett, Richard J; Cuomo, Christina A

2015-03-01

Candida albicans is a commensal fungus of the human gastrointestinal tract and a prevalent opportunistic pathogen. To examine diversity within this species, extensive genomic and phenotypic analyses were performed on 21 clinical C. albicans isolates. Genomic variation was evident in the form of polymorphisms, copy number variations, chromosomal inversions, subtelomeric hypervariation, loss of heterozygosity (LOH), and whole or partial chromosome aneuploidies. All 21 strains were diploid, although karyotypic changes were present in eight of the 21 isolates, with multiple strains being trisomic for Chromosome 4 or Chromosome 7. Aneuploid strains exhibited a general fitness defect relative to euploid strains when grown under replete conditions. All strains were also heterozygous, yet multiple, distinct LOH tracts were present in each isolate. Higher overall levels of genome heterozygosity correlated with faster growth rates, consistent with increased overall fitness. Genes with the highest rates of amino acid substitutions included many cell wall proteins, implicating fast evolving changes in cell adhesion and host interactions. One clinical isolate, P94015, presented several striking properties including a novel cellular phenotype, an inability to filament, drug resistance, and decreased virulence. Several of these properties were shown to be due to a homozygous nonsense mutation in the EFG1 gene. Furthermore, loss of EFG1 function resulted in increased fitness of P94015 in a commensal model of infection. Our analysis therefore reveals intra-species genetic and phenotypic differences in C. albicans and delineates a natural mutation that alters the balance between commensalism and pathogenicity. © 2015 Hirakawa et al.; Published by Cold Spring Harbor Laboratory Press.
MPD: a pathogen genome and metagenome database

PubMed Central

Zhang, Tingting; Miao, Jiaojiao; Han, Na; Qiang, Yujun; Zhang, Wen

2018-01-01

Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, which has been accompanied by a challenge for the storage and management of such huge datasets. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which provides access to users for searching, downloading, storing and sharing bacterial genomics data. The MPD represents the first pathogenic database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among different organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of big sequencing data. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacteria genomes and metagenomes. Database URL: http://data.mypathogen.org PMID:29917040
MIPS: a database for protein sequences, homology data and yeast genome information.

PubMed Central

Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

1997-01-01

The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
MIPS: analysis and annotation of proteins from whole genomes in 2005

PubMed Central

Mewes, H. W.; Frishman, D.; Mayer, K. F. X.; Münsterkötter, M.; Noubibou, O.; Pagel, P.; Rattei, T.; Oesterheld, M.; Ruepp, A.; Stümpflen, V.

2006-01-01

The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system is maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene-centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines: (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information, enabling the representation of complex information such as functional modules or networks (Genome Research Environment System), (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix, and the CABiNet network analysis framework, and (iii) the compilation and manual annotation of information related to interactions such as protein–protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de). PMID:16381839
MIPS: analysis and annotation of proteins from whole genomes in 2005.

PubMed

Mewes, H W; Frishman, D; Mayer, K F X; Münsterkötter, M; Noubibou, O; Pagel, P; Rattei, T; Oesterheld, M; Ruepp, A; Stümpflen, V

2006-01-01

The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system is maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene-centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines: (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information, enabling the representation of complex information such as functional modules or networks (Genome Research Environment System), (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix, and the CABiNet network analysis framework, and (iii) the compilation and manual annotation of information related to interactions such as protein-protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de).
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

PubMed

Hiscock, D; Upton, C

2000-05-01

The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html.
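Several of the per-gene parameters listed in the VGDB abstract above (A + T%, nucleotide frequency, dinucleotide frequency, codon use) are simple functions of the DNA sequence. The sketch below shows one plausible way to compute them; the function names are hypothetical, and the handling of ambiguous bases (ignored here) is an assumption, since the abstract does not say how VGDB treats them.

```python
from collections import Counter

BASES = "ACGT"


def at_percent(seq):
    """Percentage of A and T among unambiguous bases (others are ignored)."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    return 100.0 * (counts["A"] + counts["T"]) / total if total else 0.0


def nucleotide_frequency(seq):
    """Relative frequency of each of A, C, G and T."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    return {b: counts[b] / total for b in BASES} if total else {}


def dinucleotide_frequency(seq):
    """Relative frequency of overlapping dinucleotides (unambiguous bases only)."""
    seq = seq.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(p for p in pairs if set(p) <= set(BASES))
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()} if total else {}


def codon_usage(cds):
    """Count codons read in frame from position 0; a trailing partial codon is dropped."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))
```

For example, `at_percent("ATGC")` evaluates to `50.0`, and `codon_usage("ATGAAATAA")` counts one occurrence each of `ATG`, `AAA` and `TAA`.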
Choosing a genome browser for a Model Organism Database: surveying the Maize community

PubMed Central

Sen, Taner Z.; Harper, Lisa C.; Schaeffer, Mary L.; Andorf, Carson M.; Seigfried, Trent E.; Campbell, Darwin A.; Lawrence, Carolyn J.

2010-01-01

As the B73 maize genome sequencing project neared completion, MaizeGDB began to integrate a graphical genome browser with its existing web interface and database. To ensure that maize researchers would optimally benefit from the potential addition of a genome browser to the existing MaizeGDB resource, personnel at MaizeGDB surveyed researchers’ needs. Collected data indicate that existing genome browsers for maize were inadequate and suggest implementation of a browser with quick interface and intuitive tools would meet most researchers’ needs. Here, we document the survey’s outcomes, review functionalities of available genome browser software platforms and offer our rationale for choosing the GBrowse software suite for MaizeGDB. Because the genome as represented within the MaizeGDB Genome Browser is tied to detailed phenotypic data, molecular marker information, available stocks, etc., the MaizeGDB Genome Browser represents a novel mechanism by which the researchers can leverage maize sequence information toward crop improvement directly. Database URL: http://gbrowse.maizegdb.org/ PMID:20627860

Nencki Genomics Database--Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs.

PubMed

Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

2013-01-01

We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
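The abstract above mentions procedures for computing intersections between regulatory features. As a rough illustration of that idea only, the sketch below intersects two lists of genomic intervals in plain Python. It is not the database's own stored procedure; the `Feature` type, the function name and the 1-based inclusive coordinate convention are all assumptions made for this example.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    """A genomic interval; coordinates are assumed 1-based and inclusive."""
    chrom: str
    start: int
    end: int
    name: str


def intersect(a: List[Feature], b: List[Feature]) -> List[Tuple[str, str, int, int]]:
    """Return (name_a, name_b, start, end) for each overlapping pair on the same chromosome."""
    by_chrom = {}
    for feat in b:
        by_chrom.setdefault(feat.chrom, []).append(feat)
    for feats in by_chrom.values():
        feats.sort(key=lambda f: f.start)

    hits = []
    for fa in a:
        for fb in by_chrom.get(fa.chrom, []):
            if fb.start > fa.end:
                break  # b is sorted by start, so no later feature can overlap fa
            if fb.end >= fa.start:
                hits.append((fa.name, fb.name,
                             max(fa.start, fb.start), min(fa.end, fb.end)))
    return hits


if __name__ == "__main__":
    motifs = [Feature("chr1", 100, 110, "m1"), Feature("chr1", 500, 510, "m2")]
    enhancers = [Feature("chr1", 90, 105, "e1"), Feature("chr1", 300, 400, "e2")]
    print(intersect(motifs, enhancers))
```

A database would typically use an index on chromosome and start position to avoid scanning every feature; the early `break` on the sorted list plays a similar role here.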
Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

PubMed

Mackey, Aaron J; Pearson, William R

2004-10-01

Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
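To make the idea of generating a sequence library subset from a relational database concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The two-table schema, the toy records and the function names are hypothetical and are not the schema of seqdb_demo, which the abstract does not describe.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a toy, hypothetical schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE taxon (
            taxon_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL
        );
        CREATE TABLE protein (
            acc      TEXT PRIMARY KEY,
            taxon_id INTEGER NOT NULL REFERENCES taxon(taxon_id),
            descr    TEXT,
            seq      TEXT NOT NULL
        );
    """)
    con.executemany("INSERT INTO taxon VALUES (?, ?)",
                    [(1, "Homo sapiens"), (2, "Escherichia coli")])
    con.executemany("INSERT INTO protein VALUES (?, ?, ?, ?)", [
        ("P1", 1, "toy human protein", "MKTAYIAK"),
        ("P2", 2, "toy E. coli protein", "MSEQNNTE"),
        ("P3", 2, "another toy E. coli protein", "MALWMRLL"),
    ])
    return con


def fasta_subset(con, taxon_name):
    """Return a FASTA-formatted library restricted to one taxon."""
    rows = con.execute(
        """
        SELECT p.acc, p.descr, p.seq
        FROM protein AS p
        JOIN taxon AS t ON t.taxon_id = p.taxon_id
        WHERE t.name = ?
        ORDER BY p.acc
        """,
        (taxon_name,),
    )
    return "".join(f">{acc} {descr}\n{seq}\n" for acc, descr, seq in rows)


if __name__ == "__main__":
    print(fasta_subset(build_demo_db(), "Escherichia coli"), end="")
```

Searching a smaller, targeted library reduces the number of comparisons made, which is how a subset can improve the statistical significance of true homologs, as the abstract notes.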
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model.

PubMed

Saccone, Scott F; Quan, Jiaxi; Jones, Peter L

2012-04-15

Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. BioQ is freely available to the public at http://bioq.saclab.net.
TabSQL: a MySQL tool to facilitate mapping user data to public databases.

PubMed

Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng

2010-06-23

With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
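The workflow described above (load tab-delimited tables, then query user data together with public annotations) can be sketched in a few lines. This example deliberately swaps in Python's built-in `sqlite3` for MySQL so that it is self-contained; it does not use TabSQL itself, and the table names, column names and toy rows are invented for illustration.

```python
import csv
import io
import sqlite3


def load_tsv(con, table, text):
    """Create `table` from tab-delimited text whose first row is the header."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    con.execute(f'CREATE TABLE "{table}" ({columns})')
    marks = ", ".join("?" for _ in header)
    con.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


# Toy inputs: a user's result table and a downloaded annotation table.
USER_DATA = "gene_id\tfold_change\nG1\t2.5\nG2\t0.4\nG3\t1.1\n"
ANNOTATION = "gene_id\tgo_term\nG1\tGO:0006412\nG2\tGO:0008152\n"


def annotate():
    """Join user rows to annotations, keeping user rows that have no annotation."""
    con = sqlite3.connect(":memory:")
    load_tsv(con, "user_data", USER_DATA)
    load_tsv(con, "annotation", ANNOTATION)
    return con.execute(
        """
        SELECT u.gene_id, u.fold_change, a.go_term
        FROM user_data AS u
        LEFT JOIN annotation AS a ON a.gene_id = u.gene_id
        ORDER BY u.gene_id
        """
    ).fetchall()


if __name__ == "__main__":
    for row in annotate():
        print(row)
```

A `LEFT JOIN` is used so that genes without a matching annotation still appear in the result, with an empty annotation column.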
TabSQL: a MySQL tool to facilitate mapping user data to public databases

PubMed Central

2010-01-01

Background: With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. Results: We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. Conclusions: TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data. PMID:20573251
Orthology for comparative genomics in the mouse genome database.

PubMed

Dolan, Mary E; Baldarelli, Richard M; Bello, Susan M; Ni, Li; McAndrews, Monica S; Bult, Carol J; Kadin, James A; Richardson, Joel E; Ringwald, Martin; Eppig, Janan T; Blake, Judith A

2015-08-01

The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
Cloning and characterization of a Candida albicans gene homologous to fructose-1,6-bisphosphatase genes.

PubMed

De la Rosa, J M; Ruíz, T; Rodríguez, L

2000-12-01

By sequencing of the DNA adjacent to the Candida albicans SEC61 gene, an open reading frame encoding a polypeptide of 331 amino acids was found. The predicted protein showed a strong homology with the fructose-1,6-bisphosphatase [FbPase] from other organisms, and conserved regions included the catalytic motif found in all known FbPases. Although the cloned gene did not complement the growth failure of a Saccharomyces cerevisiae fbp1 mutant in media with gluconeogenic carbon sources, it was transcribed in the transformants in a fashion that indicates a partial repression by glucose. A similar control on the transcription of this gene and on FbPase activity was found in wild-type C. albicans, where the cloned gene (CaFBP1) was shown to be localized in a single chromosomal locus in the genome.
Xylitol dehydrogenase from Candida tropicalis: molecular cloning of the gene and structural analysis of the protein.

PubMed

Lima, Luanne Helena Augusto; Pinheiro, Cristiano Guimarães do Amaral; de Moraes, Lídia Maria Pepe; de Freitas, Sonia Maria; Torres, Fernando Araripe Gonçalves

2006-12-01

Yeasts can metabolize xylose by the action of two key enzymes: xylose reductase and xylitol dehydrogenase. In this work, we present data concerning the cloning of the XYL2 gene encoding xylitol dehydrogenase from the yeast Candida tropicalis. The gene is present as a single copy in the genome and is controlled at the transcriptional level by the presence of the inducer xylose. XYL2 was functionally tested by heterologous expression in Saccharomyces cerevisiae to develop a yeast strain capable of producing ethanol from xylose. Structural analysis of C. tropicalis xylitol dehydrogenase, Xyl2, suggests that it is a member of the medium-chain dehydrogenase (MDR) family. This is supported by the presence of the amino acid signature [GHE]xx[G]xxxxx[G]xx[V] in its primary sequence and a typical alcohol dehydrogenase Rossmann fold pattern composed of NAD(+) and zinc ion binding domains.
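A signature such as [GHE]xx[G]xxxxx[G]xx[V] can be searched for with a regular expression. The sketch below is hedged in two ways. First, it assumes that `x` stands for any residue, as in common motif notation. Second, the abstract does not say whether `[GHE]` means "one of G, H or E" or the tripeptide G-H-E, so both readings are provided as separate, hypothetical constants. The test sequence is a made-up string, not the Xyl2 sequence.

```python
import re

# Reading 1 (assumption): [GHE] is a character class, i.e. one residue of G, H or E.
SIGNATURE_CLASS = r"[GHE]..G.....G..V"
# Reading 2 (assumption): [GHE] denotes the tripeptide G-H-E.
SIGNATURE_TRIPEPTIDE = r"GHE..G.....G..V"


def find_signature(protein, pattern):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead matches at every position without consuming characters,
    # so overlapping occurrences are all reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", protein.upper())]


if __name__ == "__main__":
    toy = "AAGHEAAGAAAAAGAAVAA"  # made-up sequence for illustration only
    print(find_signature(toy, SIGNATURE_CLASS))
    print(find_signature(toy, SIGNATURE_TRIPEPTIDE))
```

On the toy sequence the two readings report different start positions, which shows why the interpretation of the bracket notation matters when scanning real sequences.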
Yeast diversity isolated from grape musts during spontaneous fermentation from a Brazilian winery.

PubMed

Bezerra-Bussoli, Carolina; Baffi, Milla Alves; Gomes, Eleni; Da-Silva, Roberto

2013-09-01

Saccharomyces and non-Saccharomyces yeast species from a winery located in Brazil were identified by ribosomal gene-sequencing analysis. A total of 130 yeast strains were isolated from grape surfaces and musts during alcoholic fermentation from Isabel, Bordeaux, and Cabernet Sauvignon varieties. Samples were submitted to PCR-RFLP analysis and genomic sequencing. Thirteen species were identified: Candida quercitrusa, Candida stellata, Cryptococcus flavescens, Cryptococcus laurentii, Hanseniaspora uvarum, Issatchenkia occidentalis, Issatchenkia orientalis, Issatchenkia terricola, Pichia kluyveri, Pichia guilliermondii, Pichia sp., Saccharomyces cerevisiae, and Sporidiobolus pararoseus. A sequential substitution of species during the different stages of fermentation, with a dominance of non-Saccharomyces yeasts at the beginning, and a successive replacement of species by S. cerevisiae strains at the final steps were observed. This is the first report about the yeast distribution present throughout the alcoholic fermentation in a Brazilian winery, providing supportive information for future studies on their contribution to wine quality.
The MaizeGDB Genome Browser tutorial: one example of database outreach to biologists via video.

PubMed

Harper, Lisa C; Schaeffer, Mary L; Thistle, Jordan; Gardiner, Jack M; Andorf, Carson M; Campbell, Darwin A; Cannon, Ethalinda K S; Braun, Bremen L; Birkett, Scott M; Lawrence, Carolyn J; Sen, Taner Z

2011-01-01

Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At MaizeGDB, we have developed a number of video tutorials that demonstrate how to use various tools and explicitly outline the caveats researchers should know to interpret the information available to them. One such popular video currently available is 'Using the MaizeGDB Genome Browser', which describes how the maize genome was sequenced and assembled as well as how the sequence can be visualized and interacted with via the MaizeGDB Genome Browser.
A Ruby API to query the Ensembl database for genomic features.

PubMed

Strozzi, Francesco; Aerts, Jan

2011-04-01

The Ensembl database makes genomic features available via its Genome Browser. It is also possible to access the underlying data through a Perl API for advanced querying. We have developed a full-featured Ruby API to the Ensembl databases, providing the same functionality as the Perl interface with additional features. A single Ruby API is used to access different releases of the Ensembl databases and is also able to query multi-species databases. Most functionality of the API is provided using the ActiveRecord pattern. The library depends on introspection to make it release independent. The API is available through the Rubygem system and can be installed with the command gem install ruby-ensembl-api.
Mycobacteriophage genome database.

PubMed

Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

2011-01-01

The Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No.1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.
Multicenter study evaluating the Vitek MS system for identification of medically important yeasts.

PubMed

Westblade, Lars F; Jennemann, Rebecca; Branda, John A; Bythrow, Maureen; Ferraro, Mary Jane; Garner, Omai B; Ginocchio, Christine C; Lewinski, Michael A; Manji, Ryhana; Mochon, A Brian; Procop, Gary W; Richter, Sandra S; Rychert, Jenna A; Sercia, Linda; Burnham, Carey-Ann D

2013-07-01

The optimal management of fungal infections is correlated with timely organism identification. Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry (MS) is revolutionizing the identification of yeasts isolated from clinical specimens. We present a multicenter study assessing the performance of the Vitek MS system (bioMérieux) in identifying medically important yeasts. A collection of 852 isolates was tested, including 20 Candida species (626 isolates, including 58 C. albicans, 62 C. glabrata, and 53 C. krusei isolates), 35 Cryptococcus neoformans isolates, and 191 other clinically relevant yeast isolates; in total, 31 different species were evaluated. Isolates were directly applied to a target plate, followed by a formic acid overlay. Mass spectra were acquired using the Vitek MS system and were analyzed using the Vitek MS v2.0 database. The gold standard for identification was sequence analysis of the D2 region of the 26S rRNA gene. In total, 823 isolates (96.6%) were identified to the genus level and 819 isolates (96.1%) were identified to the species level. Twenty-four isolates (2.8%) were not identified, and five isolates (0.6%) were misidentified. Misidentified isolates included one isolate of C. albicans (n = 58) identified as Candida dubliniensis, one isolate of Candida parapsilosis (n = 73) identified as Candida pelliculosa, and three isolates of Geotrichum klebahnii (n = 6) identified as Geotrichum candidum. The identification of clinically relevant yeasts using MS is superior to the phenotypic identification systems currently employed in clinical microbiology laboratories.
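The percentages in the Vitek MS abstract above follow directly from the reported isolate counts. The short sketch below recomputes them; the variable and function names are illustrative only.

```python
# Isolate counts as reported in the abstract above.
TOTAL_ISOLATES = 852
COMPOSITION = {"Candida": 626, "Cryptococcus neoformans": 35, "other yeasts": 191}
OUTCOMES = {
    "identified to genus level": 823,
    "identified to species level": 819,
    "not identified": 24,
    "misidentified": 5,
}


def rate(count, total=TOTAL_ISOLATES):
    """Percentage of isolates, rounded to one decimal place."""
    return round(100 * count / total, 1)


def genus_only(outcomes):
    """Isolates identified to genus level but not to species level."""
    return outcomes["identified to genus level"] - outcomes["identified to species level"]


if __name__ == "__main__":
    for label, count in OUTCOMES.items():
        print(f"{label}: {count} ({rate(count)}%)")
    print("genus only:", genus_only(OUTCOMES))
```

The species-level, genus-only, unidentified and misidentified isolates together account for all 852 isolates tested, so the reported figures are internally consistent.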
CycADS: an annotation database system to ease the development and update of BioCyc databases

PubMed Central

Vellozo, Augusto F.; Véron, Amélie S.; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E.; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano

2011-01-01

In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway Tools software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including, for example, KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum), whose genome was recently sequenced. The AcypiCyc database webpage also includes, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Owing to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms.
Database URL: http://www.cycadsys.org PMID:21474551
DroSpeGe: rapid access database for new Drosophila species genomes.

PubMed

Gilbert, Donald G

2007-01-01

The Drosophila species comparative genome database DroSpeGe (http://insects.eugenes.org/DroSpeGe/) has provided genome researchers with rapid, usable access to 12 new and old Drosophila genomes since its inception in 2004. Scientists can use, with minimal computing expertise, the wealth of new genome information for developing new insights into insect evolution. New genome assemblies provided by several sequencing centers have been annotated with known model organism gene homologies and gene predictions to provide basic comparative data. TeraGrid supplies the shared cyberinfrastructure for the primary computations. This genome database includes homologies to Drosophila melanogaster and eight other eukaryote model genomes, and gene predictions from several groups. BLAST searches of the newest assemblies are integrated with genome maps. GBrowse maps provide detailed views of cross-species aligned genomes. BioMart provides for data mining of annotations and sequences. Common chromosome maps identify major synteny among species. Potential gain and loss of genes is suggested by Gene Ontology groupings for genes of the new species. Summaries of essential genome statistics include sizes, genes found and predicted, homology among genomes, phylogenetic trees of species and comparisons of several gene predictions for sensitivity and specificity in finding new and known genes.
CottonGen: a genomics, genetics and breeding database for cotton research

USDA-ARS's Scientific Manuscript database

CottonGen (http://www.cottongen.org) is a curated and integrated web-based relational database providing access to publicly available genomic, genetic and breeding data for cotton. CottonGen supersedes CottonDB and the Cotton Marker Database, with enhanced tools for easier data sharing, mining, vis...
Use of Genomic Databases for Inquiry-Based Learning about Influenza

ERIC Educational Resources Information Center

Ledley, Fred; Ndung'u, Eric

2011-01-01

The genome projects of the past decades have created extensive databases of biological information with applications in both research and education. We describe an inquiry-based exercise that uses one such database, the National Center for Biotechnology Information Influenza Virus Resource, to advance learning about influenza. This database…
The porcine translational research database: A manually curated, genomics and proteomics-based research resource

USDA-ARS's Scientific Manuscript database

The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic- and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation to study pig models that are...
dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

PubMed

Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

2016-01-01

The recent advancement of next generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spread across the entire human genome, making whole-genome sequencing a growing trend in the study of disease-causing genetic variants. Nevertheless, there is still no repository that collects predictions of functionally damaging effects of human genetic variants, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.
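The figure of nearly 8.58 billion possible variants follows from simple counting: each unambiguous reference base can change to three other bases. The Python sketch below enumerates the possible SNVs of a toy sequence and then repeats the arithmetic at genome scale. The value of about 2.86 billion unambiguous reference positions is an assumption chosen because it is consistent with the reported total; the abstract does not state it.

```python
def enumerate_snvs(reference):
    """Yield (position, ref_base, alt_base) for every possible single-base change."""
    bases = "ACGT"
    for pos, ref in enumerate(reference.upper(), start=1):
        if ref not in bases:
            continue  # skip ambiguous positions such as N
        for alt in bases:
            if alt != ref:
                yield pos, ref, alt

# Toy reference: four unambiguous bases and one N.
snvs = list(enumerate_snvs("ACGNT"))
print(len(snvs))  # 4 unambiguous bases x 3 alternative bases each

# Assumed count of unambiguous reference positions (not given in the abstract).
ASSUMED_POSITIONS = 2_860_000_000
print(f"{ASSUMED_POSITIONS * 3 / 1e9:.2f} billion possible SNVs")
```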
Exploration of the Chemical Space of Public Genomic Databases

EPA Science Inventory

The current project aims to chemically index the content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information.

Genomics Community Resources | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

To facilitate genomic research and the dissemination of its products, National Human Genome Research Institute (NHGRI) supports genomic resources that are crucial for basic research, disease studies, model organism studies, and other biomedical research. Awards under this FOA will support the development and distribution of genomic resources that will be valuable for the broad research community, using cost-effective approaches. Such resources include (but are not limited to) databases and informatics resources (such as human and model organism databases, ontologies, and analysi
Resolving the problem of multiple accessions of the same transcript deposited across various public databases.

PubMed

Weirick, Tyler; John, David; Uchida, Shizuka

2017-03-01

Maintaining the consistency of genomic annotations is an increasingly complex task because of the iterative and dynamic nature of assembly and annotation, growing numbers of biological databases and insufficient integration of annotations across databases. As information exchange among databases is poor, a 'novel' sequence from one reference annotation could be annotated in another. Furthermore, relationships to nearby or overlapping annotated transcripts are even more complicated when using different genome assemblies. To better understand these problems, we surveyed current and previous versions of genomic assemblies and annotations across a number of public databases containing long noncoding RNA. We identified numerous discrepancies of transcripts regarding their genomic locations, transcript lengths and identifiers. Further investigation showed that the positional differences between reference annotations of essentially the same transcript could lead to differences in its measured expression at the RNA level. To aid in resolving these problems, we present the algorithm 'Universal Genomic Accession Hash (UGAHash)' and created an open source web tool to encourage the usage of the UGAHash algorithm. The UGAHash web tool (http://ugahash.uni-frankfurt.de) can be accessed freely without registration. The web tool allows researchers to generate Universal Genomic Accessions for genomic features or to explore annotations deposited in the public databases of the past and present versions. We anticipate that the UGAHash web tool will be a valuable tool to check for the existence of transcripts before judging the newly discovered transcripts as novel. © The Author 2016. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
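The idea of an accession derived from genomic location rather than from a database-specific identifier can be sketched as follows. This is not the published UGAHash algorithm, whose details the abstract does not give. The key layout, the choice of SHA-256, the 16-character truncation and the function name `location_accession` are all assumptions made for illustration.

```python
import hashlib

def location_accession(assembly, chrom, strand, exons):
    """Return a deterministic identifier computed only from genomic location.

    exons is an iterable of (start, end) pairs; their order does not matter.
    """
    ordered = sorted((int(start), int(end)) for start, end in exons)
    parts = [assembly, chrom, strand] + [f"{start}-{end}" for start, end in ordered]
    key = "|".join(parts)
    # Truncating the digest keeps the accession short; this is a design assumption.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]

# The same transcript described by two databases, with exons listed in different order.
a = location_accession("GRCh38", "chr1", "+", [(100, 200), (300, 400)])
b = location_accession("GRCh38", "chr1", "+", [(300, 400), (100, 200)])
# The same coordinates on a different assembly describe a different feature.
c = location_accession("GRCh37", "chr1", "+", [(100, 200), (300, 400)])
print(a == b, a == c)
```

Because the identifier depends only on assembly, chromosome, strand and exon coordinates, two databases that annotate the same feature under different names would compute the same value, which is the kind of check the abstract describes for testing whether a transcript is really novel.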
WGE: a CRISPR database for genome engineering.

PubMed

Hodgkins, Alex; Farne, Anna; Perera, Sajith; Grego, Tiago; Parry-Smith, David J; Skarnes, William C; Iyer, Vivek

2015-09-15

The rapid development of CRISPR-Cas9 mediated genome editing techniques has given rise to a number of online and stand-alone tools to find and score CRISPR sites for whole genomes. Here we describe the Wellcome Trust Sanger Institute Genome Editing database (WGE), which uses novel methods to compute, visualize and select optimal CRISPR sites in a genome browser environment. The WGE database currently stores single and paired CRISPR sites and pre-calculated off-target information for CRISPRs located in the mouse and human exomes. Scoring and display of off-target sites is simple and intuitive, and filters can be applied to identify high-quality CRISPR sites rapidly. WGE also provides a tool for the design and display of gene targeting vectors in the same genome browser, along with gene models, protein translation and variation tracks. WGE is open, extensible and can be set up to compute and present CRISPR sites for any genome. The WGE database is freely available at www.sanger.ac.uk/htgt/wge. Contact: vvi@sanger.ac.uk or skarnes@sanger.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
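As a rough illustration of what finding CRISPR sites involves, the Python sketch below scans both strands of a sequence for a 20-nt protospacer followed by an NGG PAM, the motif recognised by the commonly used S. pyogenes Cas9. It is not WGE's method: the abstract does not describe how sites are computed, and off-target scoring and site pairing are omitted. The function names and the tuple layout are assumptions.

```python
import re

def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq, guide_len=20):
    """List (strand, start, protospacer, pam) for every protospacer + NGG match.

    start is 0-based and counted along the strand that was scanned.
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # A lookahead is used so that overlapping sites are all reported.
        pattern = rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))"
        for m in re.finditer(pattern, s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites

example = "A" * 20 + "TGG"
print(find_sites(example))
```

The example sequence contains exactly one site on the forward strand and none on the reverse strand, since its reverse complement has no GG dinucleotide.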
VCGDB: a dynamic genome database of the Chinese population

PubMed Central

2014-01-01

Background: The data released by the 1000 Genomes Project contain an increasing number of genome sequences from different nations and populations with a large number of genetic variations. As a result, the focus of human genome studies is changing from single and static to complex and dynamic. The currently available human reference genome (GRCh37) is based on sequencing data from 13 anonymous Caucasian volunteers, which might limit the scope of genomics, transcriptomics, epigenetics, and genome wide association studies. Description: We used the massive amount of sequencing data published by the 1000 Genomes Project Consortium to construct the Virtual Chinese Genome Database (VCGDB), a dynamic genome database of the Chinese population based on the whole genome sequencing data of 194 individuals. VCGDB provides dynamic genomic information, which contains 35 million single nucleotide variations (SNVs), 0.5 million insertions/deletions (indels), and 29 million rare variations, together with genomic annotation information. VCGDB also provides a highly interactive user-friendly virtual Chinese genome browser (VCGBrowser) with functions like seamless zooming and real-time searching. In addition, we have established three population-specific consensus Chinese reference genomes that are compatible with mainstream alignment software. Conclusions: VCGDB offers a feasible strategy for processing big data to keep pace with the biological data explosion by providing a robust resource for genomics studies; in particular, studies aimed at finding regions of the genome associated with diseases. PMID:24708222
Public variant databases: liability?

PubMed

Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria

2017-07-01

Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016.
Purification, Reconstitution, and Inhibition of Cytochrome P-450 Sterol Δ22-Desaturase from the Pathogenic Fungus Candida glabrata

PubMed Central

Lamb, David C.; Maspahy, Segula; Kelly, Diane E.; Manning, Nigel J.; Geber, Antonia; Bennett, John E.; Kelly, Steven L.

1999-01-01

Sterol Δ22-desaturase has been purified from a strain of Candida glabrata with a disruption in the gene encoding sterol 14α-demethylase (cytochrome P-450₅₁; CYP51). The purified cytochrome P-450 exhibited sterol Δ22-desaturase activity in a reconstituted system with NADPH–cytochrome P-450 reductase in dilaurylphosphatidylcholine, with the enzyme kinetic studies revealing a Km for ergosta-5,7-dienol of 12.5 μM and a Vmax of 0.59 nmol of this substrate metabolized/min/nmol of P-450. This enzyme is encoded by CYP61 (ERG5) in Saccharomyces cerevisiae, and homologues have been shown in the Candida albicans and Schizosaccharomyces pombe genome projects. Ketoconazole, itraconazole, and fluconazole formed low-spin complexes with the ferric cytochrome and exhibited type II spectra, which are indicative of an interaction between the azole moiety and the cytochrome heme. The azole antifungal compounds inhibited reconstituted sterol Δ22-desaturase activity by binding to the cytochrome with a one-to-one stoichiometry, with total inhibition of enzyme activity occurring when equimolar amounts of azole and cytochrome P-450 were added. These results reveal the potential for sterol Δ22-desaturase to be an antifungal target and to contribute to the binding of drugs within the fungal cell. PMID:10390230
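To make the reported kinetic constants concrete, the sketch below evaluates the Michaelis-Menten rate law, v = Vmax[S] / (Km + [S]), with the Km and Vmax from the abstract. That the enzyme follows simple Michaelis-Menten kinetics is an assumption here (the abstract reports Km and Vmax but not the fitted model), and the substrate concentrations are arbitrary illustrative values.

```python
KM_UM = 12.5  # Km for ergosta-5,7-dienol in micromolar (from the abstract)
VMAX = 0.59   # nmol substrate metabolized/min/nmol P-450 (from the abstract)

def rate(substrate_um, km=KM_UM, vmax=VMAX):
    """Michaelis-Menten rate at a given substrate concentration (micromolar)."""
    return vmax * substrate_um / (km + substrate_um)

# Illustrative substrate concentrations; these are not taken from the paper.
for s in (5.0, 12.5, 50.0):
    print(f"[S] = {s:5.1f} uM -> v = {rate(s):.3f} nmol/min/nmol P-450")
```

At a substrate concentration equal to Km the rate is half of Vmax, which is the defining property of Km under this model.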
Extension modules for storage, visualization and querying of genomic, genetic and breeding data in Tripal databases

PubMed Central

Lee, Taein; Cheng, Chun-Huai; Ficklin, Stephen; Yu, Jing; Humann, Jodi; Main, Dorrie

2017-01-01

Tripal is an open-source database platform primarily used for development of genomic, genetic and breeding databases. We report here on the release of the Chado Loader, Chado Data Display and Chado Search modules to extend the functionality of the core Tripal modules. These new extension modules provide additional tools for (1) data loading, (2) customized visualization and (3) advanced search functions for supported data types such as organism, marker, QTL/Mendelian Trait Loci, germplasm, map, project, phenotype, genotype and their respective metadata. The Chado Loader module provides data collection templates in Excel with defined metadata and data loaders with front end forms. The Chado Data Display module contains tools to visualize each data type and the metadata which can be used as is or customized as desired. The Chado Search module provides search and download functionality for the supported data types. Also included are the tools to visualize map and species summary. The use of materialized views in the Chado Search module enables better performance as well as flexibility of data modeling in Chado, allowing existing Tripal databases with different metadata types to utilize the module. These Tripal Extension modules are implemented in the Genome Database for Rosaceae (rosaceae.org), CottonGen (cottongen.org), Citrus Genome Database (citrusgenomedb.org), Genome Database for Vaccinium (vaccinium.org) and the Cool Season Food Legume Database (coolseasonfoodlegume.org). Database URL: https://www.citrusgenomedb.org/, https://www.coolseasonfoodlegume.org/, https://www.cottongen.org/, https://www.rosaceae.org/, https://www.vaccinium.org/
PlantRGDB: A Database of Plant Retrocopied Genes.

PubMed

Wang, Yi

2017-01-01

RNA-based gene duplication, known as retrocopy, plays important roles in gene origination and genome evolution. The genomes of many plants have been sequenced, offering an opportunity to annotate and mine the retrocopies in plant genomes. However, comprehensive and unified annotation of retrocopies in these plants is still lacking. In this study I constructed the PlantRGDB (Plant Retrocopied Gene DataBase), the first database of plant retrocopies, to provide a putatively complete centralized list of retrocopies in plant genomes. The database is freely accessible at http://probes.pw.usda.gov/plantrgdb or http://aegilops.wheat.ucdavis.edu/plantrgdb. It currently integrates 49 plant species and 38,997 retrocopies along with characterization information. PlantRGDB provides a user-friendly web interface for searching, browsing and downloading the retrocopies in the database. PlantRGDB also offers graphical viewer-integrated sequence information for displaying the structure of each retrocopy. The attributes of the retrocopies of each species are reported using a browse function. In addition, useful tools, such as an advanced search and BLAST, are available to search the database more conveniently. In conclusion, the database will provide a web platform for obtaining valuable insight into the generation of retrocopies and will supplement research on gene duplication and genome evolution in plants. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Complete genome sequence of Lactobacillus plantarum LZ206, a potential probiotic strain with antimicrobial activity against food-borne pathogenic microorganisms.

PubMed

Li, Ping; Gu, Qing; Zhou, Qingqing

2016-11-20

Lactobacilli strains have been considered important candidates for manufacturing "natural food", due to their antimicrobial properties and generally recognized as safe (GRAS) status. Lactobacillus plantarum LZ206 is a potential probiotic strain isolated from raw cow milk, with antimicrobial activity against various pathogens, including Gram-positive bacteria (Staphylococcus aureus and Listeria monocytogenes), Gram-negative bacteria (Escherichia coli and Salmonella enterica), and the fungus Candida albicans. To better understand the molecular basis of its antimicrobial activity, the entire genome of LZ206 was sequenced. Sequencing revealed that the genome of LZ206 contains a circular 3,212,951-bp chromosome, two circular plasmids and one predicted linear plasmid. A plantaricin gene cluster, which is responsible for bacteriocin biosynthesis and could be associated with its broad-spectrum antimicrobial activity, was identified based on comparative genomic analysis. Whole genome sequencing of L. plantarum LZ206 might facilitate its application in protecting food products from contamination by pathogens in the dairy industry. Copyright © 2016 Elsevier B.V. All rights reserved.
Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics.

PubMed

Sakai, Hiroaki; Lee, Sung Shin; Tanaka, Tsuyoshi; Numa, Hisataka; Kim, Jungsok; Kawahara, Yoshihiro; Wakimoto, Hironobu; Yang, Ching-chia; Iwamoto, Masao; Abe, Takashi; Yamada, Yuko; Muto, Akira; Inokuchi, Hachiro; Ikemura, Toshimichi; Matsumoto, Takashi; Sasaki, Takuji; Itoh, Takeshi

2013-02-01

The Rice Annotation Project Database (RAP-DB, http://rapdb.dna.affrc.go.jp/) has been providing a comprehensive set of gene annotations for the genome sequence of rice, Oryza sativa (japonica group) cv. Nipponbare. Since the first release in 2005, RAP-DB has been updated several times along with the genome assembly updates. Here, we present our newest RAP-DB based on the latest genome assembly, Os-Nipponbare-Reference-IRGSP-1.0 (IRGSP-1.0), which was released in 2011. We detected 37,869 loci by mapping transcript and protein sequences of 150 monocot species. To provide plant researchers with highly reliable and up to date rice gene annotations, we have been incorporating literature-based manually curated data, and 1,626 loci currently incorporate literature-based annotation data, including commonly used gene names or gene symbols. Transcriptional activities are shown at the nucleotide level by mapping RNA-Seq reads derived from 27 samples. We also mapped the Illumina reads of a Japanese leading japonica cultivar, Koshihikari, and a Chinese indica cultivar, Guangluai-4, to the genome and show alignments together with the single nucleotide polymorphisms (SNPs) and gene functional annotations through a newly developed browser, Short-Read Assembly Browser (S-RAB). We have developed two satellite databases, Plant Gene Family Database (PGFD) and Integrative Database of Cereal Gene Phylogeny (IDCGP), which display gene family and homologous gene relationships among diverse plant species. RAP-DB and the satellite databases offer simple and user-friendly web interfaces, enabling plant and genome researchers to access the data easily and facilitating a broad range of plant research topics.
The MaizeGDB Genome Browser tutorial: one example of database outreach to biologists via video

PubMed Central

Harper, Lisa C.; Schaeffer, Mary L.; Thistle, Jordan; Gardiner, Jack M.; Andorf, Carson M.; Campbell, Darwin A.; Cannon, Ethalinda K.S.; Braun, Bremen L.; Birkett, Scott M.; Lawrence, Carolyn J.; Sen, Taner Z.

2011-01-01

Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At MaizeGDB, we have developed a number of video tutorials that demonstrate how to use various tools and explicitly outline the caveats researchers should know to interpret the information available to them. One such popular video currently available is ‘Using the MaizeGDB Genome Browser’, which describes how the maize genome was sequenced and assembled as well as how the sequence can be visualized and interacted with via the MaizeGDB Genome Browser. Database URL: http://www.maizegdb.org/ PMID:21565781
The new modern era of yeast genomics: community sequencing and the resulting annotation of multiple Saccharomyces cerevisiae strains at the Saccharomyces Genome Database

PubMed Central

Engel, Stacia R.; Cherry, J. Michael

2013-01-01

The first completed eukaryotic genome sequence was that of the yeast Saccharomyces cerevisiae, and the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the original model organism database. SGD remains the authoritative community resource for the S. cerevisiae reference genome sequence and its annotation, and continues to provide comprehensive biological information correlated with S. cerevisiae genes and their products. A diverse set of yeast strains have been sequenced to explore commercial and laboratory applications, and a brief history of those strains is provided. The publication of these new genomes has motivated the creation of new tools, and SGD will annotate and provide comparative analyses of these sequences, correlating changes with variations in strain phenotypes and protein function. We are entering a new era at SGD, as we incorporate these new sequences and make them accessible to the scientific community, all in an effort to continue in our mission of educating researchers and facilitating discovery. Database URL: http://www.yeastgenome.org/ PMID:23487186
Metabolome searcher: a high throughput tool for metabolite identification and metabolic pathway mapping directly from mass spectrometry and using genome restriction.

PubMed

Dhanasekaran, A Ranjitha; Pearson, Jon L; Ganesan, Balasubramanian; Weimer, Bart C

2015-02-25

Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism's genome as a database restricts metabolite identification to only those compounds that the organism can produce. To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment's MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis. Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. 
This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
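The genome-restricted mass matching described in the Metabolome Searcher abstract can be sketched in Python as a lookup of an observed m/z against a short compound list, allowing for adducts and a mass deviation in parts per million. This is not the tool's code. The compound list stands in for a genome-derived metabolite set and is hypothetical, the masses and adduct shifts are approximate monoisotopic values, and the 10 ppm default is an arbitrary choice.

```python
PROTON = 1.007276  # approximate mass of a proton in Da

# Hypothetical genome-restricted compound list: name -> approximate monoisotopic mass (Da).
COMPOUNDS = {"glucose": 180.06339, "pyruvate": 88.01604, "citrate": 192.02700}

# Approximate mass shifts for two positive-mode adducts.
ADDUCT_SHIFTS = {"[M+H]+": PROTON, "[M+Na]+": 22.989218}

def match_mass(observed_mz, ppm=10.0):
    """Return (compound, adduct, error_ppm) for every candidate within the tolerance."""
    hits = []
    for name, mass in COMPOUNDS.items():
        for adduct, shift in ADDUCT_SHIFTS.items():
            expected = mass + shift
            error_ppm = abs(observed_mz - expected) / expected * 1e6
            if error_ppm <= ppm:
                hits.append((name, adduct, round(error_ppm, 2)))
    return hits

hits = match_mass(181.0707)
print(hits)
```

Restricting `COMPOUNDS` to what one organism can make is what keeps the candidate list short; with a general compound database the same m/z would typically match many more entries.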
Exploring Genetic, Genomic, and Phenotypic Data at the Rat Genome Database

PubMed Central

Laulederkind, Stanley J. F.; Hayman, G. Thomas; Wang, Shur-Jen; Lowry, Timothy F.; Nigam, Rajni; Petri, Victoria; Smith, Jennifer R.; Dwinell, Melinda R.; Jacob, Howard J.; Shimoyama, Mary

2013-01-01

The laboratory rat, Rattus norvegicus, is an important model of human health and disease, and experimental findings in the rat have relevance to human physiology and disease. The Rat Genome Database (RGD, http://rgd.mcw.edu) is a model organism database that provides access to a wide variety of curated rat data including disease associations, phenotypes, pathways, molecular functions, biological processes and cellular components for genes, quantitative trait loci, and strains. We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat. PMID:23255149
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency.

PubMed

Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio

2015-01-01

Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them is the management of the massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. Finding an alternative to the commonly used relational database model therefore becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
Guided genome halving: hardness, heuristics and the history of the Hemiascomycetes.

PubMed

Zheng, Chunfang; Zhu, Qian; Adam, Zaky; Sankoff, David

2008-07-01

Some present day species have incurred a whole genome doubling event in their evolutionary history, and this is reflected today in patterns of duplicated segments scattered throughout their chromosomes. These duplications may be used as data to 'halve' the genome, i.e. to reconstruct the ancestral genome at the moment of doubling, but the solution is often highly nonunique. To resolve this problem, we take account of outgroups, external reference genomes, to guide and narrow down the search. We improve on a previous, computationally costly, 'brute force' method by adapting the genome halving algorithm of El-Mabrouk and Sankoff so that it rapidly and accurately constructs an ancestor close to the outgroups, prior to a local optimization heuristic. We apply this to reconstruct the predoubling ancestor of Saccharomyces cerevisiae and Candida glabrata, guided by the genomes of three other yeasts that diverged before the genome doubling event. We analyze the results in terms of (1) the minimum evolution criterion, (2) how close the genome halving result is to the final (local) minimum and (3) how close the final result is to an ancestor manually constructed by an expert with access to additional information. We also visualize the set of reconstructed ancestors using classic multidimensional scaling to see what aspects of the two doubled and three unduplicated genomes influence the differences among the reconstructions. The experimental software is available on request.
PoMaMo--a comprehensive database for potato genome data.

PubMed

Meyer, Svenja; Nagel, Axel; Gebhardt, Christiane

2005-01-01

A database for potato genome data (PoMaMo, Potato Maps and More) was established. The database contains molecular maps of all twelve potato chromosomes with about 1000 mapped elements, sequence data, putative gene functions, results from BLAST analysis, SNP and InDel information from different diploid and tetraploid potato genotypes, publication references, links to other public databases like GenBank (http://www.ncbi.nlm.nih.gov/) or SGN (Solanaceae Genomics Network, http://www.sgn.cornell.edu/), etc. Flexible search and data visualization interfaces enable easy access to the data via internet (https://gabi.rzpd.de/PoMaMo.html). The Java servlet tool YAMB (Yet Another Map Browser) was designed to interactively display chromosomal maps. Maps can be zoomed in and out, and detailed information about mapped elements can be obtained by clicking on an element of interest. The GreenCards interface allows a text-based data search by marker-, sequence- or genotype name, by sequence accession number, gene function, BLAST Hit or publication reference. The PoMaMo database is a comprehensive database for different potato genome data, and to date the only database containing SNP and InDel data from diploid and tetraploid potato genotypes.
PoMaMo—a comprehensive database for potato genome data

PubMed Central

Meyer, Svenja; Nagel, Axel; Gebhardt, Christiane

2005-01-01

A database for potato genome data (PoMaMo, Potato Maps and More) was established. The database contains molecular maps of all twelve potato chromosomes with about 1000 mapped elements, sequence data, putative gene functions, results from BLAST analysis, SNP and InDel information from different diploid and tetraploid potato genotypes, publication references, links to other public databases like GenBank (http://www.ncbi.nlm.nih.gov/) or SGN (Solanaceae Genomics Network, http://www.sgn.cornell.edu/), etc. Flexible search and data visualization interfaces enable easy access to the data via internet (https://gabi.rzpd.de/PoMaMo.html). The Java servlet tool YAMB (Yet Another Map Browser) was designed to interactively display chromosomal maps. Maps can be zoomed in and out, and detailed information about mapped elements can be obtained by clicking on an element of interest. The GreenCards interface allows a text-based data search by marker-, sequence- or genotype name, by sequence accession number, gene function, BLAST Hit or publication reference. The PoMaMo database is a comprehensive database for different potato genome data, and to date the only database containing SNP and InDel data from diploid and tetraploid potato genotypes. PMID:15608284
A searchable database for the genome of Phomopsis longicolla (isolate MSPL 10-6).

PubMed

Darwish, Omar; Li, Shuxian; May, Zane; Matthews, Benjamin; Alkharouf, Nadim W

2016-01-01

Phomopsis longicolla (syn. Diaporthe longicolla) is an important seed-borne fungal pathogen that primarily causes Phomopsis seed decay (PSD) in most soybean production areas worldwide. This disease severely decreases soybean seed quality by reducing seed viability and oil quality, altering seed composition, and increasing frequencies of moldy and/or split beans. To facilitate investigation of the genetic basis of fungal virulence factors and understand the mechanism of disease development, we designed and developed a database for P. longicolla isolate MSPL 10-6 that contains information about the genome assemblies (contigs), gene models, gene descriptions and GO functional ontologies. A web-based front end to the database was built using ASP.NET, which allows researchers to search and mine the genome of this important fungus. This database represents the first reported genome database for a seed-borne fungal pathogen in the Diaporthe-Phomopsis complex. The database will also be a valuable resource for research and agricultural communities. It will aid in the development of new control strategies for this pathogen. http://bioinformatics.towson.edu/Phomopsis_longicolla/HomePage.aspx.
A searchable database for the genome of Phomopsis longicolla (isolate MSPL 10-6)

PubMed Central

May, Zane; Matthews, Benjamin; Alkharouf, Nadim W.

2016-01-01

Phomopsis longicolla (syn. Diaporthe longicolla) is an important seed-borne fungal pathogen that primarily causes Phomopsis seed decay (PSD) in most soybean production areas worldwide. This disease severely decreases soybean seed quality by reducing seed viability and oil quality, altering seed composition, and increasing frequencies of moldy and/or split beans. To facilitate investigation of the genetic basis of fungal virulence factors and understand the mechanism of disease development, we designed and developed a database for P. longicolla isolate MSPL 10-6 that contains information about the genome assemblies (contigs), gene models, gene descriptions and GO functional ontologies. A web-based front end to the database was built using ASP.NET, which allows researchers to search and mine the genome of this important fungus. This database represents the first reported genome database for a seed-borne fungal pathogen in the Diaporthe–Phomopsis complex. The database will also be a valuable resource for research and agricultural communities. It will aid in the development of new control strategies for this pathogen. Availability: http://bioinformatics.towson.edu/Phomopsis_longicolla/HomePage.aspx PMID:28197060

THGS: a web-based database of Transmembrane Helices in Genome Sequences

PubMed Central

Fernando, S. A.; Selvarani, P.; Das, Soma; Kumar, Ch. Kiran; Mondal, Sukanta; Ramakumar, S.; Sekar, K.

2004-01-01

Transmembrane Helices in Genome Sequences (THGS) is an interactive web-based database, developed to search the transmembrane helices in the user-interested gene sequences available in the Genome Database (GDB). The proposed database has provision to search sequence motifs in transmembrane and globular proteins. In addition, the motif can be searched in the other sequence databases (Swiss-Prot and PIR) or in the macromolecular structure database, Protein Data Bank (PDB). Further, the 3D structure of the corresponding queried motif, if it is available in the solved protein structures deposited in the Protein Data Bank, can also be visualized using the widely used graphics package RASMOL. All the sequence databases used in the present work are updated frequently and hence the results produced are up to date. The database THGS is freely available via the world wide web and can be accessed at http://pranag.physics.iisc.ernet.in/thgs/ or http://144.16.71.10/thgs/. PMID:14681375
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.

PubMed

Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

2014-09-01

In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis) and development of first BuffSatDB

PubMed Central

2013-01-01

Background Although India has sequenced the water buffalo genome, its draft assembly is based on the cattle genome BTau 4.0, so de novo chromosome-wise assembly remains a major pending issue for the global community. The existing radiation hybrid of buffalo and the STRs reported here can be used in the final gap plugging and “finishing” expected in de novo genome assembly. QTL and gene mapping require mining of putative STRs from the buffalo genome at equal intervals on every chromosome. Such markers have a potential role in improving desirable characteristics such as high milk yield, disease resistance and high growth rate. STR mining from the whole genome and development of a user-friendly database had yet to be done to reap the benefit of the whole genome sequence. Description By in silico microsatellite mining of the whole genome, we have developed the first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database, http://cabindb.iasri.res.in/buffsatdb/), a web-based relational database of 910529 microsatellite markers developed using PHP and MySQL. Microsatellite markers were generated using the MIcroSAtellite tool. The database offers a simple and systematic web-based search for customised retrieval of chromosome-wise and genome-wide microsatellites. Search is enabled by chromosome, motif type (mono- to hexa-nucleotide), repeat motif and repeat kind (simple and composite). The search may be customised by limiting the location of STRs on a chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further appended with Primer3 for primer design of the selected markers, enabling researchers to select markers of choice at desired intervals over the chromosome. The unique add-on of degenerate bases further helps in resolving the presence of degenerate bases in the current buffalo assembly. Conclusion Being the first buffalo STR database in the world, this would not only pave the way to resolving the current assembly problem but should also be of immense use to the global community in QTL/gene mapping, which is critically required to increase knowledge in the endeavour to increase buffalo productivity, especially in third world countries where the rural economy is significantly dependent on buffalo productivity. PMID:23336431
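
The mining step described above (perfect repeats of mono- to hexa-nucleotide motifs) can be illustrated with a small regular-expression scan. This is only a sketch of the general idea, not the MIcroSAtellite tool or the BuffSatDb pipeline: the function name `find_ssrs` and the minimum repeat counts are assumptions chosen for illustration, and composite repeats are not handled.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect simple repeats.

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults below are illustrative thresholds, not values from the abstract.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for unit_len in sorted(min_repeats):
        n = min_repeats[unit_len]
        # A motif of unit_len bases followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AA"),
            # so a mononucleotide run is not also reported as a dinucleotide repeat.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

example = "GGC" + "AT" * 7 + "CGC" + "A" * 12 + "G"
print(find_ssrs(example))
```

Start positions are 0-based offsets into the input string. A real pipeline would also record the chromosome and handle ambiguous bases.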
In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis) and development of first BuffSatDB.

PubMed

Sarika; Arora, Vasu; Iquebal, Mir Asif; Rai, Anil; Kumar, Dinesh

2013-01-19

Although India has sequenced the water buffalo genome, its draft assembly is based on the cattle genome BTau 4.0, so de novo chromosome-wise assembly remains a major pending issue for the global community. The existing radiation hybrid of buffalo and the STRs reported here can be used in the final gap plugging and "finishing" expected in de novo genome assembly. QTL and gene mapping require mining of putative STRs from the buffalo genome at equal intervals on every chromosome. Such markers have a potential role in improving desirable characteristics such as high milk yield, disease resistance and high growth rate. STR mining from the whole genome and development of a user-friendly database had yet to be done to reap the benefit of the whole genome sequence. By in silico microsatellite mining of the whole genome, we have developed the first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database, http://cabindb.iasri.res.in/buffsatdb/), a web-based relational database of 910529 microsatellite markers developed using PHP and MySQL. Microsatellite markers were generated using the MIcroSAtellite tool. The database offers a simple and systematic web-based search for customised retrieval of chromosome-wise and genome-wide microsatellites. Search is enabled by chromosome, motif type (mono- to hexa-nucleotide), repeat motif and repeat kind (simple and composite). The search may be customised by limiting the location of STRs on a chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further appended with Primer3 for primer design of the selected markers, enabling researchers to select markers of choice at desired intervals over the chromosome. The unique add-on of degenerate bases further helps in resolving the presence of degenerate bases in the current buffalo assembly. Being the first buffalo STR database in the world, this would not only pave the way to resolving the current assembly problem but should also be of immense use to the global community in QTL/gene mapping, which is critically required to increase knowledge in the endeavour to increase buffalo productivity, especially in third world countries where the rural economy is significantly dependent on buffalo productivity.
The Innate Immune Database (IIDB)

PubMed Central

Korb, Martin; Rust, Alistair G; Thorsson, Vesteinn; Battail, Christophe; Li, Bin; Hwang, Daehee; Kennedy, Kathleen A; Roach, Jared C; Rosenberger, Carrie M; Gilchrist, Mark; Zak, Daniel; Johnson, Carrie; Marzolf, Bruz; Aderem, Alan; Shmulevich, Ilya; Bolouri, Hamid

2008-01-01

Background As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site . Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens. Description We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems biology oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data is available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results. 
Our database can be interrogated via a web interface. Genomic annotations and binding site predictions can be automatically viewed with a customized version of the Argo genome browser. Conclusion We present the Innate Immune Database (IIDB) as a community resource for immunologists interested in gene regulatory systems underlying innate responses to pathogens. The database website can be freely accessed at . PMID:18321385
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model

PubMed Central

Saccone, Scott F.; Quan, Jiaxi; Jones, Peter L.

2012-01-01

Motivation: Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. Results: We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. Availability and implementation: BioQ is freely available to the public at http://bioq.saclab.net Contact: ssaccone@wustl.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22426342
Cpf1-Database: web-based genome-wide guide RNA library design for gene knockout screens using CRISPR-Cpf1.

PubMed

Park, Jeongbin; Bae, Sangsu

2018-03-15

Following the type II CRISPR-Cas9 system, type V CRISPR-Cpf1 endonucleases have been found to be applicable for genome editing in various organisms in vivo. However, there are as yet no web-based tools capable of optimally selecting guide RNAs (gRNAs) among all possible genome-wide target sites. Here, we present Cpf1-Database, a genome-wide gRNA library design tool for LbCpf1 and AsCpf1, which have DNA recognition sequences of 5'-TTTN-3' at the 5' ends of target sites. Cpf1-Database provides a sophisticated but simple way to design gRNAs for AsCpf1 nucleases on the genome scale. One can easily access the data using a straightforward web interface, and using the powerful collections feature one can easily design gRNAs for thousands of genes in a short time. Free access at http://www.rgenome.net/cpf1-database/. Contact: sangsubae@hanyang.ac.kr.
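
To make the 5'-TTTN-3' recognition rule concrete, the sketch below lists candidate sites on both strands of a sequence, taking the bases immediately 3' of the PAM as the protospacer. It is a hypothetical illustration, not the Cpf1-Database method: the function names and the protospacer length of 23 nt are assumptions, and no scoring or off-target filtering is attempted.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_cpf1_sites(seq, spacer_len=23):
    """Return (strand, offset, pam, protospacer) for every TTTN PAM with room
    for a full protospacer. spacer_len=23 is an assumed, illustrative length.

    Offsets on the '-' strand are relative to the reverse-complemented sequence.
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # The PAM occupies 4 bases and sits at the 5' end of the target site.
        for i in range(len(s) - 4 - spacer_len + 1):
            if s[i:i + 3] == "TTT" and s[i + 3] in "ACGT":
                sites.append((strand, i, s[i:i + 4], s[i + 4:i + 4 + spacer_len]))
    return sites

example = "GG" + "TTTA" + "C" * 23 + "GG"
for site in find_cpf1_sites(example):
    print(site)
```

Overlapping PAMs (for example inside a run of T bases) are each reported separately. A library design tool would rank and filter such candidates.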
Public variant databases: liability?

PubMed Central

Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria

2017-01-01

Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016 PMID:27977006
Mutant power: using mutant allele collections for yeast functional genomics.

PubMed

Norman, Kaitlyn L; Kumar, Anuj

2016-03-01

The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology. © The Author 2015. Published by Oxford University Press.
Extraction of genomic DNA from yeasts for PCR-based applications.

PubMed

Lõoke, Marko; Kristjuhan, Kersti; Kristjuhan, Arnold

2011-05-01

We have developed a quick and low-cost genomic DNA extraction protocol from yeast cells for PCR-based applications. This method does not require any enzymes, hazardous chemicals, or extreme temperatures, and is especially powerful for simultaneous analysis of a large number of samples. DNA can be efficiently extracted from different yeast species (Kluyveromyces lactis, Hansenula polymorpha, Schizosaccharomyces pombe, Candida albicans, Pichia pastoris, and Saccharomyces cerevisiae). The protocol involves lysis of yeast colonies or cells from liquid culture in a lithium acetate (LiOAc)-SDS solution and subsequent precipitation of DNA with ethanol. Approximately 100 nanograms of total genomic DNA can be extracted from 1 × 10^7 cells. DNA extracted by this method is suitable for a variety of PCR-based applications (including colony PCR, real-time qPCR, and DNA sequencing) for amplification of DNA fragments of ≤ 3500 bp.
Reconstruction of metabolic pathways for the cattle genome

PubMed Central

Seo, Seongwon; Lewin, Harris A

2009-01-01

Background Metabolic reconstruction of microbial, plant and animal genomes is a necessary step toward understanding the evolutionary origins of metabolism and species-specific adaptive traits. The aims of this study were to reconstruct conserved metabolic pathways in the cattle genome and to identify metabolic pathways with missing genes and proteins. The MetaCyc database and PathwayTools software suite were chosen for this work because they are widely used and easy to implement. Results An amalgamated cattle genome database was created using the NCBI and Ensembl cattle genome databases (based on build 3.1) as data sources. PathwayTools was used to create a cattle-specific pathway genome database, which was followed by comprehensive manual curation for the reconstruction of metabolic pathways. The curated database, CattleCyc 1.0, consists of 217 metabolic pathways. A total of 64 mammalian-specific metabolic pathways were modified from the reference pathways in MetaCyc, and two pathways previously identified but missing from MetaCyc were added. Comparative analysis of metabolic pathways revealed the absence of mammalian genes for 22 metabolic enzymes whose activity was reported in the literature. We also identified six human metabolic protein-coding genes for which the cattle ortholog is missing from the sequence assembly. Conclusion CattleCyc is a powerful tool for understanding the biology of ruminants and other cetartiodactyl species. In addition, the approach used to develop CattleCyc provides a framework for the metabolic reconstruction of other newly sequenced mammalian genomes. It is clear that metabolic pathway analysis strongly reflects the quality of the underlying genome annotations. Thus, having well-annotated genomes from many mammalian species hosted in BioCyc will facilitate the comparative analysis of metabolic pathways among different species and a systems approach to comparative physiology. PMID:19284618
ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding.

PubMed

Guhlin, Joseph; Silverstein, Kevin A T; Zhou, Peng; Tiffin, Peter; Young, Nevin D

2017-08-10

Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and can rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user interface for configuring the data import and for querying the database. Queries can also be run from the command line and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated value format for user-provided annotations. ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database. 
ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or understudied species. For species for which more data are available, ODG can be used to conduct complex multi-omics, pattern-matching queries.
Gramene database in 2010: updates and extensions.

PubMed

Youens-Clark, Ken; Buckler, Ed; Casstevens, Terry; Chen, Charles; Declerck, Genevieve; Derwent, Paul; Dharmawardhana, Palitha; Jaiswal, Pankaj; Kersey, Paul; Karthikeyan, A S; Lu, Jerry; McCouch, Susan R; Ren, Liya; Spooner, William; Stein, Joshua C; Thomason, Jim; Wei, Sharon; Ware, Doreen

2011-01-01

Now in its 10th year, the Gramene database (http://www.gramene.org) has grown from its primary focus on rice, the first fully-sequenced grass genome, to become a resource for major model and crop plants including Arabidopsis, Brachypodium, maize, sorghum, poplar and grape in addition to several species of rice. Gramene began with the addition of an Ensembl genome browser and has expanded in the last decade to become a robust resource for plant genomics hosting a wide array of data sets including quantitative trait loci (QTL), metabolic pathways, genetic diversity, genes, proteins, germplasm, literature, ontologies and a fully-structured markers and sequences database integrated with genome browsers and maps from various published studies (genetic, physical, bin, etc.). In addition, Gramene now hosts a variety of web services including a Distributed Annotation Server (DAS), BLAST and a public MySQL database. Twice a year, Gramene releases a major build of the database and makes interim releases to correct errors or to make important updates to software and/or data.
Benchmarking database performance for genomic data.

PubMed

Khushi, Matloob

2015-06-01

Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations is a common research need. The data can be saved in a database system for easy management; however, there is at present no comprehensive built-in database algorithm to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although general searching capability of both databases was almost equivalent. In addition, using the algorithm, pair-wise overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1). © 2015 Wiley Periodicals, Inc.
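
As a rough illustration of what an SQL-based overlap query looks like, the sketch below joins two region tables on chromosome and applies the standard interval-overlap predicate. This is not the RegMap algorithm, whose details are not given in the abstract. SQLite stands in for PostgreSQL or MySQL, and the table names, column names and half-open coordinate convention are all assumptions.

```python
import sqlite3

def overlapping_pairs(regions_a, regions_b):
    """Return (chrom, a_start, a_end, b_start, b_end) for every overlapping pair.

    Regions are (chrom, start, end) tuples using half-open coordinates
    [start, end), so regions that merely touch are not counted as overlapping.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (chrom TEXT, start_pos INTEGER, end_pos INTEGER)")
    conn.execute("CREATE TABLE b (chrom TEXT, start_pos INTEGER, end_pos INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?, ?, ?)", regions_a)
    conn.executemany("INSERT INTO b VALUES (?, ?, ?)", regions_b)
    rows = conn.execute(
        """
        SELECT a.chrom, a.start_pos, a.end_pos, b.start_pos, b.end_pos
        FROM a JOIN b ON a.chrom = b.chrom
        WHERE a.start_pos < b.end_pos AND b.start_pos < a.end_pos
        ORDER BY a.chrom, a.start_pos, b.start_pos
        """
    ).fetchall()
    conn.close()
    return rows

binding_sites = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 60)]
histone_marks = [("chr1", 150, 350), ("chr1", 400, 500), ("chr2", 10, 50)]
print(overlapping_pairs(binding_sites, histone_marks))
```

A plain join like this compares every pair of regions on the same chromosome. At scale, an index on (chrom, start_pos) or a range type would matter, which is the kind of difference a database benchmark would expose.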
Accessing the SEED genome databases via Web services API: tools for programmers.

PubMed

Disz, Terry; Akhter, Sajia; Cuevas, Daniel; Olson, Robert; Overbeek, Ross; Vonstein, Veronika; Stevens, Rick; Edwards, Robert A

2010-06-14

The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large volume downloads all at once. The API ensures timely access to the most current datasets available, including the new genomes as soon as they come online.
Analysis of the Genome and Chromium Metabolism-Related Genes of Serratia sp. S2.

PubMed

Dong, Lanlan; Zhou, Simin; He, Yuan; Jia, Yan; Bai, Qunhua; Deng, Peng; Gao, Jieying; Li, Yingli; Xiao, Hong

2018-05-01

This study investigated the genome sequence of Serratia sp. S2. The genomic DNA of Serratia sp. S2 was extracted and the sequencing library was constructed. The sequencing was carried out by Illumina 2000 and complete genomic sequences were obtained. Gene function annotation and bioinformatics analysis were performed by comparing with the known databases. The genome size of Serratia sp. S2 was 5,604,115 bp and the G+C content was 57.61%. There were 5373 protein coding genes, and 3732, 3614, and 3942 genes were respectively annotated into the GO, KEGG, and COG databases. There were 12 genes related to chromium metabolism in the Serratia sp. S2 genome. The whole genome sequence of Serratia sp. S2 has been submitted to the GenBank database with the accession number LNRP00000000. Our findings may provide a theoretical basis for the subsequent development of new biotechnology to repair environmental chromium pollution.
High-Resolution SNP/CGH Microarrays Reveal the Accumulation of Loss of Heterozygosity in Commonly Used Candida albicans Strains

PubMed Central

Abbey, Darren; Hickman, Meleah; Gresham, David; Berman, Judith

2011-01-01

Phenotypic diversity can arise rapidly through loss of heterozygosity (LOH) or by the acquisition of copy number variations (CNV) spanning whole chromosomes or shorter contiguous chromosome segments. In Candida albicans, a heterozygous diploid yeast pathogen with no known meiotic cycle, homozygosis and aneuploidy alter clinical characteristics, including drug resistance. Here, we developed a high-resolution microarray that simultaneously detects ∼39,000 single nucleotide polymorphism (SNP) alleles and ∼20,000 copy number variation loci across the C. albicans genome. An important feature of the array analysis is a computational pipeline that determines SNP allele ratios based upon chromosome copy number. Using the array and analysis tools, we constructed a haplotype map (hapmap) of strain SC5314 to assign SNP alleles to specific homologs, and we used it to follow the acquisition of loss of heterozygosity (LOH) and copy number changes in a series of derived laboratory strains. This high-resolution SNP/CGH microarray and the associated hapmap facilitated the phasing of alleles in lab strains and revealed detrimental genome changes that arose frequently during molecular manipulations of laboratory strains. Furthermore, it provided a useful tool for rapid, high-resolution, and cost-effective characterization of changes in allele diversity as well as changes in chromosome copy number in new C. albicans isolates. PMID:22384363
Using population mixtures to optimize the utility of genomic databases: linkage disequilibrium and association study design in India.

PubMed

Pemberton, T J; Jakobsson, M; Conrad, D F; Coop, G; Wall, J D; Pritchard, J K; Patel, P I; Rosenberg, N A

2008-07-01

When performing association studies in populations that have not been the focus of large-scale investigations of haplotype variation, it is often helpful to rely on genomic databases in other populations for study design and analysis - such as in the selection of tag SNPs and in the imputation of missing genotypes. One way of improving the use of these databases is to rely on a mixture of database samples that is similar to the population of interest, rather than using the single most similar database sample. We demonstrate the effectiveness of the mixture approach in the application of African, European, and East Asian HapMap samples for tag SNP selection in populations from India, a genetically intermediate region underrepresented in genomic studies of haplotype variation.
PvTFDB: a Phaseolus vulgaris transcription factors database for expediting functional genomics in legumes

PubMed Central

Bhawna; Bonthala, V.S.; Gajula, MNV Prasad

2016-01-01

The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) are a key component of the genome and are the most significant targets for producing stress-tolerant crops; hence functional genomic studies of these TFs are important. Therefore, here we have constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. This database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The objectives of this database are to support functional genomics studies of common bean TFs and to help recognize the regulatory mechanisms underlying various stress responses, easing breeding strategies for variety production. It offers search interfaces by gene ID and functional annotation, and browsing interfaces by family and by chromosome. This database will also serve as a promising central repository for researchers as well as breeders who are working towards improvement of legume crops. In addition, this database provides unrestricted public access, and users can freely download all data present in the database. Database URL: http://www.multiomics.in/PvTFDB/ PMID:27465131
The MaizeGDB Genome Browser Tutorial: One example of database outreach to biologists via video

USDA-ARS's Scientific Manuscript database

Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At the Maize Genetics and Genomics Database (MaizeGDB), we have developed a number of video tutorials that aim to demonstrate how to use various tools as well as to explici...

Toward the automated generation of genome-scale metabolic networks in the SEED.

PubMed

DeJongh, Matthew; Formsma, Kevin; Boillot, Paul; Gould, John; Rycenga, Matthew; Best, Aaron

2007-04-26

Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. 
Our method sets the stage for the automated generation of substantially complete metabolic networks for over 400 complete genome sequences currently in the SEED. With each genome that is processed using our tools, the database of common components grows to cover more of the diversity of metabolic pathways. This increases the likelihood that components of reaction networks for subsequently processed genomes can be retrieved from the database, rather than assembled and verified manually.
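The idea of identifying gaps in a reaction network can be illustrated with a minimal sketch. It uses one simple notion of a gap (a metabolite that some reaction consumes but no reaction produces and that is not supplied externally); this is an illustrative assumption, not the authors' method or the SEED tooling, and the function name, reaction identifiers, and metabolite names are hypothetical.

```python
def find_gap_metabolites(reactions, external):
    """Return metabolites that are consumed but never produced or supplied.

    reactions: dict mapping a reaction id to (substrates, products).
    external:  metabolites assumed to be available from the medium.
    """
    produced = set()
    consumed = set()
    for substrates, products in reactions.values():
        consumed.update(substrates)
        produced.update(products)
    return sorted(consumed - produced - set(external))


# Toy network (hypothetical): the step that would produce "fbp" is missing.
reactions = {
    "R1": (["glucose"], ["g6p"]),
    "R2": (["g6p"], ["f6p"]),
    "R3": (["fbp"], ["g3p"]),
}
gaps = find_gap_metabolites(reactions, external=["glucose"])
print(gaps)
```

A component-based approach like the one described would presumably run a check of this kind on each component in isolation and again after the components are connected.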
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency

PubMed Central

Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio

2015-01-01

Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. PMID:26558254
SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss

PubMed Central

Di Génova, Alex; Aravena, Andrés; Zapata, Luis; González, Mauricio; Maass, Alejandro; Iturra, Patricia

2011-01-01

SalmonDB is a new multiorganism database containing EST sequences from Salmo salar and Oncorhynchus mykiss and the whole genome sequences of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from the GMOD project, the GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, ortholog prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences to any of the databases and an advanced query tool (BioMart) that allows easy browsing of EST databases with user-defined criteria. These tools make the SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near future, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/ PMID:22120661
SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss.

PubMed

Di Génova, Alex; Aravena, Andrés; Zapata, Luis; González, Mauricio; Maass, Alejandro; Iturra, Patricia

2011-01-01

SalmonDB is a new multiorganism database containing EST sequences from Salmo salar and Oncorhynchus mykiss and the whole genome sequences of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from the GMOD project, the GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, ortholog prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences to any of the databases and an advanced query tool (BioMart) that allows easy browsing of EST databases with user-defined criteria. These tools make the SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near future, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/
WheatGenome.info: A Resource for Wheat Genomics Resource.

PubMed

Lai, Kaitao

2016-01-01

An integrated database with a variety of Web-based systems named WheatGenome.info hosting wheat genome and genomic data has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications, which are implemented as a variety of Web-based systems. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/ .
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation

PubMed Central

Kristensen, David M.; Wolf, Yuri I.; Koonin, Eugene V.

2017-01-01

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of ‘index’ orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. PMID:28053163
Investigation of mutations in the HBB gene using the 1,000 genomes database.

PubMed

Carlice-Dos-Reis, Tânia; Viana, Jaime; Moreira, Fabiano Cordeiro; Cardoso, Greice de Lemos; Guerreiro, João; Santos, Sidney; Ribeiro-Dos-Santos, Ândrea

2017-01-01

Mutations in the HBB gene are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Sickle cell anemia is one of the most common monogenic diseases worldwide. Due to its prevalence, diverse strategies have been developed for a better understanding of its molecular mechanisms. In silico analysis has been increasingly used to investigate the genotype-phenotype relationship of many diseases, and the sequences of healthy individuals deposited in the 1,000 Genomes database appear to be an excellent tool for such analysis. The objective of this study is to analyze the variations in the HBB gene in the 1,000 Genomes database, to describe the mutation frequencies in the different population groups, and to investigate the pattern of pathogenicity. The computational tool SNPEFF was used to align the data from 2,504 samples of the 1,000 Genomes database with the HG19 genome reference. The pathogenicity of each amino acid change was investigated using the databases CLINVAR, dbSNP and HbVar and five different predictors. Twenty different mutations were found in 209 healthy individuals. The African group had the highest number of individuals with mutations, and the European group had the lowest number. Thus, it is concluded that approximately 8.3% of phenotypically healthy individuals from the 1,000 Genomes database have some mutation in the HBB gene. The frequency of mutated genes was estimated at 0.042, so that the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002. In total, 193 subjects had a non-synonymous mutation, of whom 186 (7.4%) had a deleterious mutation. Considering that the 1,000 Genomes database is representative of the world's population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation.
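The frequency arithmetic in this abstract can be reproduced with a short sketch. It assumes, which the abstract does not state, that each of the 209 carriers holds a single variant allele and that genotype frequencies follow Hardy-Weinberg proportions; the variable names are illustrative only.

```python
# Sketch of the frequency arithmetic reported in the abstract.
# Assumption (not stated in the source): every carrier is heterozygous,
# so each contributes one variant allele out of two.

samples = 2504   # individuals analysed (from the abstract)
carriers = 209   # individuals with an HBB variant (from the abstract)

# Share of individuals carrying a variant.
carrier_fraction = carriers / samples

# Variant alleles divided by total alleles (two per individual).
allele_frequency = carriers / (2 * samples)

# Under Hardy-Weinberg proportions, the chance of inheriting two variant
# alleles (homozygous or compound heterozygous) is the frequency squared.
two_variant_fraction = allele_frequency ** 2

print(f"carrier fraction: {carrier_fraction:.3f}")
print(f"allele frequency: {allele_frequency:.3f}")
print(f"two-variant genotype frequency: {two_variant_fraction:.4f}")
```

Under these assumptions the values round to the 8.3%, 0.042, and roughly 0.002 quoted in the abstract.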
Yeast species diversity in apple juice for cider production evidenced by culture-based method.

PubMed

Lorenzini, Marilinda; Simonato, Barbara; Zapparoli, Giacomo

2018-05-07

Identification of yeasts isolated from apple juices of two cider houses (one located in a plain area and one in an alpine area) was carried out by culture-based method. Wallerstein Laboratory Nutrient Agar was used as medium for isolation and preliminary yeasts identification. A total of 20 species of yeasts belonging to ten different genera were identified using both BLAST algorithm for pairwise sequence comparison and phylogenetic approaches. A wide variety of non-Saccharomyces species was found. Interestingly, Candida railenensis, Candida cylindracea, Hanseniaspora meyeri, Hanseniaspora pseudoguilliermondii, and Metschnikowia sinensis were recovered for the first time in the yeast community of an apple environment. Phylogenetic analysis revealed a better resolution in identifying Metschnikowia and Moesziomyces isolates than comparative analysis using the GenBank or YeastIP gene databases. This study provides important data on yeast microbiota of apple juice and evidenced differences between two geographical cider production areas in terms of species composition.
Multicenter Study Evaluating the Vitek MS System for Identification of Medically Important Yeasts

PubMed Central

Westblade, Lars F.; Jennemann, Rebecca; Branda, John A.; Bythrow, Maureen; Ferraro, Mary Jane; Garner, Omai B.; Ginocchio, Christine C.; Lewinski, Michael A.; Manji, Ryhana; Mochon, A. Brian; Procop, Gary W.; Richter, Sandra S.; Rychert, Jenna A.; Sercia, Linda

2013-01-01

The optimal management of fungal infections is correlated with timely organism identification. Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) mass spectrometry (MS) is revolutionizing the identification of yeasts isolated from clinical specimens. We present a multicenter study assessing the performance of the Vitek MS system (bioMérieux) in identifying medically important yeasts. A collection of 852 isolates was tested, including 20 Candida species (626 isolates, including 58 C. albicans, 62 C. glabrata, and 53 C. krusei isolates), 35 Cryptococcus neoformans isolates, and 191 other clinically relevant yeast isolates; in total, 31 different species were evaluated. Isolates were directly applied to a target plate, followed by a formic acid overlay. Mass spectra were acquired using the Vitek MS system and were analyzed using the Vitek MS v2.0 database. The gold standard for identification was sequence analysis of the D2 region of the 26S rRNA gene. In total, 823 isolates (96.6%) were identified to the genus level and 819 isolates (96.1%) were identified to the species level. Twenty-four isolates (2.8%) were not identified, and five isolates (0.6%) were misidentified. Misidentified isolates included one isolate of C. albicans (n = 58) identified as Candida dubliniensis, one isolate of Candida parapsilosis (n = 73) identified as Candida pelliculosa, and three isolates of Geotrichum klebahnii (n = 6) identified as Geotrichum candidum. The identification of clinically relevant yeasts using MS is superior to the phenotypic identification systems currently employed in clinical microbiology laboratories. PMID:23658267
MALDI-TOF mass spectrometry proteomic phenotyping of clinically relevant fungi.

PubMed

Putignani, Lorenza; Del Chierico, Federica; Onori, Manuela; Mancinelli, Livia; Argentieri, Marta; Bernaschi, Paola; Coltella, Luana; Lucignano, Barbara; Pansani, Laura; Ranno, Stefania; Russo, Cristina; Urbani, Andrea; Federici, Giorgio; Menichella, Donato

2011-03-01

Proteomics is particularly suitable for characterising human pathogens with high life cycle complexity, such as fungi. Protein content and expression levels may be affected by growth states and life cycle morphs and correlate to species and strain variation. Identification and typing of fungi by conventional methods are often difficult, time-consuming and frequently, for unusual species, inconclusive. Proteomic phenotypes from MALDI-TOF MS were employed as analytical and typing expression profiling of yeast, yeast-like species and strain variants in order to achieve a microbial proteomics population study. Spectra from 303 clinical isolates were generated and processed by standard pattern matching with a MALDI-TOF Biotyper (MT). Identifications (IDs) were compared to a reference biochemical-based system (Vitek-2) and, when discordant, MT IDs were verified with genotyping IDs, obtained by sequencing the 25-28S rRNA hypervariable D2 region. Spectra were converted into virtual gel-like formats, and hierarchical clustering analysis was performed for 274 Candida profiles to investigate species and strain typing correlation. MT provided 257/303 IDs consistent with Vitek-2 ones. However, amongst 26/303 discordant MT IDs, only 5 appeared "true". No MT identification was achieved for 20/303 isolates for incompleteness of database species variants. Candida spectra clustering agreed with identified species and topology of Candida albicans and Candida parapsilosis specific dendrograms. MT IDs show a high analytical performance and profiling heterogeneity which seems to complement or even outclass existing typing tools. This variability reflects the high biological complexity of yeasts and may be properly exploited to provide epidemiological tracing and infection dispersion patterns.
Apollo2Go: a web service adapter for the Apollo genome viewer to enable distributed genome annotation.

PubMed

Klee, Kathrin; Ernst, Rebecca; Spannagl, Manuel; Mayer, Klaus F X

2007-08-30

Apollo, a genome annotation viewer and editor, has become a widely used genome annotation and visualization tool for distributed genome annotation projects. When using Apollo for annotation, database updates are carried out by uploading intermediate annotation files into the respective database. This non-direct database upload is laborious and evokes problems of data synchronicity. To overcome these limitations we extended the Apollo data adapter with a generic, configurable web service client that is able to retrieve annotation data in a GAME-XML-formatted string and pass it on to Apollo's internal input routine. This Apollo web service adapter, Apollo2Go, simplifies the data exchange in distributed projects and aims to render the annotation process more comfortable. The Apollo2Go software is freely available from ftp://ftpmips.gsf.de/plants/apollo_webservice.
Apollo2Go: a web service adapter for the Apollo genome viewer to enable distributed genome annotation

PubMed Central

Klee, Kathrin; Ernst, Rebecca; Spannagl, Manuel; Mayer, Klaus FX

2007-01-01

Background: Apollo, a genome annotation viewer and editor, has become a widely used genome annotation and visualization tool for distributed genome annotation projects. When using Apollo for annotation, database updates are carried out by uploading intermediate annotation files into the respective database. This non-direct database upload is laborious and evokes problems of data synchronicity. Results: To overcome these limitations we extended the Apollo data adapter with a generic, configurable web service client that is able to retrieve annotation data in a GAME-XML-formatted string and pass it on to Apollo's internal input routine. Conclusion: This Apollo web service adapter, Apollo2Go, simplifies the data exchange in distributed projects and aims to render the annotation process more comfortable. The Apollo2Go software is freely available from ftp://ftpmips.gsf.de/plants/apollo_webservice. PMID:17760972
The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata.

PubMed

Liolios, Konstantinos; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Kyrpides, Nikos C

2008-01-01

The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the 'Minimum Information about a Genome Sequence' (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/
The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata

PubMed Central

Liolios, Konstantinos; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Kyrpides, Nikos C.

2008-01-01

The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the ‘Minimum Information about a Genome Sequence’ (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/ PMID:17981842
Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

PubMed

Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

2011-01-01

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
DEPPDB - DNA electrostatic potential properties database. Electrostatic properties of genome DNA elements.

PubMed

Osypov, Alexander A; Krutinin, Gleb G; Krutinina, Eugenia A; Kamzolova, Svetlana G

2012-04-01

Electrostatic properties of genome DNA are important to its interactions with different proteins, in particular those related to transcription. DEPPDB - DNA Electrostatic Potential (and other Physical) Properties Database - provides information on the electrostatic and other physical properties of genome DNA combined with its sequence and annotation of biological and structural properties of genomes and their elements. Genomes are organized on a taxonomical basis, supporting comparative and evolutionary studies. Currently, DEPPDB contains all completely sequenced bacterial, viral, mitochondrial, and plastid genomes according to the NCBI RefSeq, and some model eukaryotic genomes. Data for promoters, regulation sites, binding proteins, etc., are incorporated from established DBs and literature. The database is complemented by analytical tools. Calculations on user-supplied sequences are available. Case studies discovered electrostatics complementing DNA bending in E. coli plasmid BNT2 promoter functioning, possibly affecting host-environment metabolic switch. Transcription factor binding sites gravitate to high potential regions, confirming the universal importance of electrostatics in protein-DNA interactions beyond the classical promoter-RNA polymerase recognition and regulation. Other genome elements, such as terminators, also show electrostatic peculiarities. Most intriguing are gene starts, exhibiting taxonomic correlations. The necessity of studying genome electrostatic properties is discussed.
Candida guilliermondii and Other Species of Candida Misidentified as Candida famata: Assessment by Vitek 2, DNA Sequencing Analysis, and Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry in Two Global Antifungal Surveillance Programs

PubMed Central

Woosley, Leah N.; Diekema, Daniel J.; Jones, Ronald N.; Pfaller, Michael A.

2013-01-01

Candida famata (teleomorph Debaryomyces hansenii) has been described as a medically relevant yeast, and this species has been included in many commercial identification systems that are currently used in clinical laboratories. Among 53 strains collected during the SENTRY and ARTEMIS surveillance programs and previously identified as C. famata (includes all submitted strains with this identification) by a variety of commercial methods (Vitek, MicroScan, API, and AuxaColor), DNA sequencing methods demonstrated that 19 strains were C. guilliermondii, 14 were C. parapsilosis, 5 were C. lusitaniae, 4 were C. albicans, and 3 were C. tropicalis, and five isolates belonged to other Candida species (two C. fermentati and one each C. intermedia, C. pelliculosa, and Pichia fabianii). Additionally, three misidentified C. famata strains were correctly identified as Kodamaea ohmeri, Debaryomyces nepalensis, and Debaryomyces fabryi using internal transcribed spacer (ITS) and/or intergenic spacer (IGS) sequencing. The Vitek 2 system identified three isolates with high confidence to be C. famata and another 15 with low confidence between C. famata and C. guilliermondii or C. parapsilosis, displaying only 56.6% agreement with DNA sequencing results. Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) results displayed 81.1% agreement with DNA sequencing. One strain each of C. metapsilosis, C. fermentati, and C. intermedia demonstrated a low score for identification (<2.0) in the MALDI Biotyper. K. ohmeri, D. nepalensis, and D. fabryi identified by DNA sequencing in this study were not in the current database for the MALDI Biotyper. These results suggest that the occurrence of C. famata in fungal infections is much lower than previously appreciated and that commercial systems do not produce accurate identifications except for the newly introduced MALDI-TOF instruments. PMID:23100350
Entamoeba histolytica: construction and applications of subgenomic databases.

PubMed

Hofer, Margit; Duchêne, Michael

2005-07-01

Knowledge about the influence of environmental stress such as the action of chemotherapeutic agents on gene expression in Entamoeba histolytica is limited. We plan to use oligonucleotide microarray hybridization to approach these questions. As the basis for our array, sequence data from the genome project carried out by the Institute for Genomic Research (TIGR) and the Sanger Institute were used to annotate parts of the parasite genome. Three subgenomic databases containing enzymes, cytoskeleton genes, and stress genes were compiled with the help of the ExPASy proteomics website and the BLAST servers at the two genome project sites. The known sequences from reference species, mostly human and Escherichia coli, were searched against TIGR and Sanger E. histolytica sequence contigs and the homologs were copied into a Microsoft Access database. In a similar way, two additional databases of cytoskeletal genes and stress genes were generated. Metabolic pathways could be assembled from our enzyme database, but sometimes they were incomplete as is the case for the sterol biosynthesis pathway. The raw databases contained a significant number of duplicate entries which were merged to obtain curated non-redundant databases. This procedure revealed that some E. histolytica genes may have several putative functions. Representative examples such as the case of the delta-aminolevulinate synthase/serine palmitoyltransferase are discussed.
RICD: a rice indica cDNA database resource for rice functional genomics.

PubMed

Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin

2008-11-26

The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties, Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. For each cDNA it also provides sequences, protein domain annotations, similarity search results, SNP and InDel information, and hyperlinks to the gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, the expression atlas in RiceGE and the variation report in Gramene. The online rice indica cDNA database provides a cDNA resource with comprehensive information to researchers for functional analysis of the indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
GenomeVista

DOE Office of Scientific and Technical Information (OSTI.GOV)

Poliakov, Alexander; Couronne, Olivier

2002-11-04

Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates, which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
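The scheduler pattern described here (read control data from a database, build a queue of jobs, dispatch them) can be sketched as follows. This is an illustration under stated assumptions, not GenomeVISTA code: the original is written in Perl on MySQL, whereas the sketch uses Python with an in-memory SQLite database, a hypothetical `jobs` table, and a placeholder `run_job` callable in place of dispatch to a cluster node.

```python
import sqlite3
from collections import deque


def build_queue(conn):
    """Read pending jobs from the (hypothetical) jobs table into a FIFO queue."""
    rows = conn.execute(
        "SELECT id, sequence_name FROM jobs WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    return deque(rows)


def dispatch(conn, queue, run_job):
    """Run each queued job and record its outcome back in the database."""
    finished = []
    while queue:
        job_id, name = queue.popleft()
        try:
            run_job(name)          # placeholder for sending work to a node
            status = "done"
        except Exception:
            status = "failed"      # keep going; the failure is recorded
        conn.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
        finished.append((name, status))
    conn.commit()
    return finished


# Demo with an in-memory database and two made-up sequence names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, sequence_name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (sequence_name, status) VALUES (?, 'pending')",
    [("contig_1",), ("contig_2",)],
)
results = dispatch(conn, build_queue(conn), run_job=lambda name: None)
print(results)
```

Keeping all database access inside two small functions mirrors the design choice noted in the abstract, where a separate library isolates the programs from the database schema.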

PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
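The definition of an SSR given above (a tandem repeat of a 1-6 bp motif) and the idea of length variation between strains can be illustrated with a small sketch. The function name, the `min_repeats` threshold, and the two toy sequences are assumptions made for illustration; the abstract does not describe the detection criteria PSSRdb actually uses.

```python
import re


def find_ssrs(sequence, min_repeats=3):
    """Return (start, motif, copies) for tandem repeats of 1-6 bp motifs.

    min_repeats is a hypothetical threshold: the minimum number of
    consecutive copies required to report a repeat.
    """
    # A lazily matched motif of 1-6 bases followed by further copies of itself.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Two made-up strain sequences that differ only in the number of AT copies.
strain_a = "GGATATATATCC"
strain_b = "GGATATATCC"
ssrs_a = find_ssrs(strain_a)
ssrs_b = find_ssrs(strain_b)
print(ssrs_a, ssrs_b)

# A repeat at the same position with the same motif but a different copy
# number is the kind of length variation the abstract calls polymorphic.
polymorphic = ssrs_a[0][:2] == ssrs_b[0][:2] and ssrs_a[0][2] != ssrs_b[0][2]
```

Real genome comparisons would need to align flanking regions before matching repeats between strains; comparing raw start positions only works for this toy case.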
GANESH: software for customized annotation of genome regions.

PubMed

Huntley, Derek; Hummerich, Holger; Smedley, Damian; Kittivoravitkul, Sasivimol; McCarthy, Mark; Little, Peter; Sergot, Marek

2003-09-01

GANESH is a software package designed to support the genetic analysis of regions of human and other genomes. It provides a set of components that may be assembled to construct a self-updating database of DNA sequence, mapping data, and annotations of possible genome features. Once one or more remote sources of data for the target region have been identified, all sequences for that region are downloaded, assimilated, and subjected to a (configurable) set of standard database-searching and genome-analysis packages. The results are stored in compressed form in a relational database, and are updated automatically on a regular schedule so that they are always immediately available in their most up-to-date versions. A Java front-end, executed as a stand-alone application or web applet, provides a graphical interface for navigating the database and for viewing the annotations. There are facilities for importing and exporting data in the format of the Distributed Annotation System (DAS), enabling a GANESH database to be used as a component of a DAS configuration. The system has been used to construct databases for about a dozen regions of human chromosomes and for three regions of mouse chromosomes.
IMGMD: A platform for the integration and standardisation of In silico Microbial Genome-scale Metabolic Models.

PubMed

Ye, Chao; Xu, Nan; Dong, Chuan; Ye, Yuannong; Zou, Xuan; Chen, Xiulai; Guo, Fengbiao; Liu, Liming

2017-04-07

Genome-scale metabolic models (GSMMs) constitute a platform that combines genome sequences and detailed biochemical information to quantify microbial physiology at the system level. To improve the unity, integrity, correctness, and format of data in published GSMMs, a consensus IMGMD database was built in the LAMP (Linux + Apache + MySQL + PHP) system by integrating and standardizing 328 GSMMs constructed for 139 microorganisms. The IMGMD database can help microbial researchers download manually curated GSMMs, rapidly reconstruct standard GSMMs, design pathways, and identify metabolic targets for strategies on strain improvement. Moreover, the IMGMD database facilitates the integration of wet-lab and in silico data to gain an additional insight into microbial physiology. The IMGMD database is freely available, without any registration requirements, at http://imgmd.jiangnan.edu.cn/database.
Using SQL Databases for Sequence Similarity Searching and Analysis.

PubMed

Pearson, William R; Mackey, Aaron J

2017-09-13

Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. Copyright © 2017 John Wiley & Sons, Inc.
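The idea of loading similarity search results into a relational database and summarizing homologs by kingdom can be sketched with Python's built-in `sqlite3` module. The schema below is an assumption for illustration only and is not the actual layout of seqdb_demo or search_demo; the table names `protein` and `hit`, the accessions and the E-values are all invented.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein (acc TEXT PRIMARY KEY, organism TEXT, kingdom TEXT);
CREATE TABLE hit (query_acc TEXT, subject_acc TEXT, evalue REAL);
""")

# Invented proteins and search hits, for illustration only.
conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", [
    ("ecoli_1", "Escherichia coli", "Bacteria"),
    ("ecoli_2", "Escherichia coli", "Bacteria"),
    ("bsub_1", "Bacillus subtilis", "Bacteria"),
    ("yeast_1", "Saccharomyces cerevisiae", "Eukaryota"),
    ("arch_1", "Methanocaldococcus jannaschii", "Archaea"),
])
conn.executemany("INSERT INTO hit VALUES (?, ?, ?)", [
    ("ecoli_1", "yeast_1", 1e-30),
    ("ecoli_1", "arch_1", 1e-10),
    ("ecoli_1", "bsub_1", 1e-50),
    ("ecoli_2", "yeast_1", 1e-8),
    ("ecoli_2", "bsub_1", 0.5),
])

def homolog_counts_by_kingdom(conn, query_organism, max_evalue=1e-6):
    """Count query proteins with a significant hit in each kingdom."""
    rows = conn.execute("""
        SELECT s.kingdom, COUNT(DISTINCT h.query_acc)
        FROM hit AS h
        JOIN protein AS q ON q.acc = h.query_acc
        JOIN protein AS s ON s.acc = h.subject_acc
        WHERE q.organism = ? AND s.organism <> ? AND h.evalue <= ?
        GROUP BY s.kingdom
        ORDER BY s.kingdom
    """, (query_organism, query_organism, max_evalue)).fetchall()
    return dict(rows)

print(homolog_counts_by_kingdom(conn, "Escherichia coli"))
```

Keeping the hits in a table means the E-value cutoff and the taxonomic grouping become query parameters, so the same search results can be re-summarized without re-running the search.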
In silico and in vitro screening to identify structurally diverse non-azole CYP51 inhibitors as potent antifungal agent.

PubMed

Singh, Aarti; Paliwal, Sarvesh Kumar; Sharma, Mukta; Mittal, Anupama; Sharma, Swapnil; Sharma, Jai Prakash

2016-01-01

The problem of resistance to the azole class of antifungals is a serious cause of concern to the medical fraternity, and thus there is an urgent need to identify non-azole scaffolds with high affinity for lanosterol 14α-demethylase (CYP51). In view of this we have attempted to identify novel non-azole CYP51 inhibitors through the application of pharmacophore-based virtual screening and in vitro evaluation. A rigorously validated pharmacophore model comprising two hydrogen bond acceptors and two hydrophobic features has been developed and used to mine the NCI database. Out of 265 retrieved hits, NSC 1215 and 1520 have been chosen on the basis of Lipinski's rule of five, fit and estimated values. Both hits were docked into the active site of CYP51. In view of their high fit value and CDocker score, NSC 1215 and 1520 have been subjected to in vitro microbiological assay. The results reveal that NSC 1215 and 1520 are active against Candida albicans, Candida parapsilosis, Candida tropicalis, and Aspergillus niger. In addition, the absorption characteristics of both hits have been determined using the rat sac technique, and permeation in the order NSC 1520 > NSC 1215 has been observed. Copyright © 2015 Elsevier Inc. All rights reserved.
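Lipinski's rule of five, used above as one of the hit-selection filters, can be written as a small check on four molecular descriptors. The sketch below is a generic illustration and not the authors' screening code. The descriptor values are invented and are not those of NSC 1215 or NSC 1520, and accepting at most one violation is a common convention assumed here, not something stated in the abstract.

```python
def rule_of_five_violations(mol_weight, logp, h_donors, h_acceptors):
    """List which of Lipinski's four criteria a compound violates."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if logp > 5:
        violations.append("logP > 5")
    if h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations

def passes_rule_of_five(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Assumed convention: tolerate at most max_violations violations."""
    found = rule_of_five_violations(mol_weight, logp, h_donors, h_acceptors)
    return len(found) <= max_violations

# Hypothetical descriptor values, for illustration only.
print(rule_of_five_violations(350.4, 3.2, 2, 5))
print(rule_of_five_violations(720.0, 6.1, 2, 5))
```

In practice the descriptors would come from a cheminformatics toolkit. They are passed in as plain numbers here so the sketch stays self-contained.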
Genomics Portals: integrative web-platform for mining genomics data.

PubMed

Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

2010-01-13

A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Genomics Portals: integrative web-platform for mining genomics data

PubMed Central

2010-01-01

Background: A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results: Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion: The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
Genomic Approach to Understand the Association of DNA Repair with Longevity and Healthy Aging Using Genomic Databases of Oldest-Old Population

PubMed Central

Kim, Hyun Soo

2018-01-01

Aged population is increasing worldwide due to the aging process that is inevitable. Accordingly, longevity and healthy aging have been spotlighted to promote social contribution of aged population. Many studies in the past few decades have reported the process of aging and longevity, emphasizing the importance of maintaining genomic stability in exceptionally long-lived population. Underlying reason of longevity remains unclear due to its complexity involving multiple factors. With advances in sequencing technology and human genome-associated approaches, studies based on population-based genomic studies are increasing. In this review, we summarize recent longevity and healthy aging studies of human population focusing on DNA repair as a major factor in maintaining genome integrity. To keep pace with recent growth in genomic research, aging- and longevity-associated genomic databases are also briefly introduced. To suggest novel approaches to investigate longevity-associated genetic variants related to DNA repair using genomic databases, gene set analysis was conducted, focusing on DNA repair- and longevity-associated genes. Their biological networks were additionally analyzed to grasp major factors containing genetic variants of human longevity and healthy aging in DNA repair mechanisms. In summary, this review emphasizes DNA repair activity in human longevity and suggests approach to conduct DNA repair-associated genomic study on human healthy aging.
PGSB PlantsDB: updates to the database framework for comparative plant genome research.

PubMed

Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X

2016-01-04

PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
PvTFDB: a Phaseolus vulgaris transcription factors database for expediting functional genomics in legumes.

PubMed

Bhawna; Bonthala, V S; Gajula, Mnv Prasad

2016-01-01

The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) represent a key component of the genome and are the most significant targets for producing stress-tolerant crops, and hence functional genomic studies of these TFs are important. Therefore, here we have constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. This database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The objectives of this database are to support functional genomics studies of common bean TFs and to help recognize the regulatory mechanisms underlying various stress responses, easing breeding strategies for variety production, through search interfaces (by gene ID and functional annotation) and browsing interfaces (by family and by chromosome). This database will also serve as a promising central repository for researchers as well as breeders who are working towards improvement of legume crops. In addition, this database provides the user unrestricted public access, and the user can freely download all the data present in the database. Database URL: http://www.multiomics.in/PvTFDB/. © The Author(s) 2016. Published by Oxford University Press.
Alternatives to relational databases in precision medicine: Comparison of NoSQL approaches for big data storage using supercomputers

NASA Astrophysics Data System (ADS)

Velazquez, Enrique Israel

Improvements in medical and genomic technologies have dramatically increased the production of electronic data over the last decade. As a result, data management is rapidly becoming a major determinant, and urgent challenge, for the development of Precision Medicine. Although successful data management is achievable using Relational Database Management Systems (RDBMS), exponential data growth is a significant contributor to failure scenarios. Growing amounts of data can also be observed in other sectors, such as economics and business, which, together with the previous facts, suggests that alternate database approaches (NoSQL) may soon be required for efficient storage and management of big databases. However, this hypothesis has been difficult to test in the Precision Medicine field since alternate database architectures are complex to assess and means to integrate heterogeneous electronic health records (EHR) with dynamic genomic data are not easily available. In this dissertation, we present a novel set of experiments for identifying NoSQL database approaches that enable effective data storage and management in Precision Medicine using patients' clinical and genomic information from The Cancer Genome Atlas (TCGA). The first experiment draws on performance and scalability from biologically meaningful queries with differing complexity and database sizes. The second experiment measures performance and scalability in database updates without schema changes. The third experiment assesses performance and scalability in database updates with schema modifications due to dynamic data. We have identified two NoSQL approaches, based on Cassandra and Redis, which seem to be the ideal database management systems for our precision medicine queries in terms of performance and scalability. We present NoSQL approaches and show how they can be used to manage clinical and genomic big data. Our research is relevant to public health since we are focusing on one of the main challenges to the development of Precision Medicine and, consequently, investigating a potential solution to the progressively increasing demands on health care.
Evidence for a Pneumocystis carinii Flo8-like transcription factor: insights into organism adhesion.

PubMed

Kottom, Theodore J; Limper, Andrew H

2016-02-01

Pneumocystis carinii (Pc) adhesion to alveolar epithelial cells is well established and is thought to be a prerequisite for the initiation of Pneumocystis pneumonia. Pc binding events occur in part through the major Pc surface glycoprotein Msg, as well as an integrin-like molecule termed PcInt1. Recent data from the Pc sequencing project also demonstrate DNA sequences homologous to other genes important in Candida spp. binding to mammalian host cells, as well as organism binding to polystyrene surfaces and in biofilm formation. One of these genes, flo8, a transcription factor needed for downstream cAMP/PKA-pathway-mediated activation of the major adhesin/flocculin Flo11 in yeast, was cloned from a Pc cDNA library utilizing a partial sequence available in the Pc genome database. A CHEF blot of Pc genomic DNA yielded a single band providing evidence this gene is present in the organism. BLASTP analysis of the predicted protein demonstrated 41% homology to the Saccharomyces cerevisiae Flo8. Northern blotting demonstrated greatest expression at pH 6.0-8.0, pH comparable to reported fungal biofilm milieu. Western blot and immunoprecipitation assays of PcFlo8 protein in isolated cyst and tropic life forms confirmed the presence of the cognate protein in these Pc life forms. Heterologous expression of Pcflo8 cDNA in flo8Δ-deficient yeast strains demonstrated that the Pcflo8 was able to restore yeast binding to polystyrene and invasive growth of yeast flo8Δ cells. Furthermore, Pcflo8 promoted yeast binding to HEK293 human epithelial cells, strengthening its functional classification as a Flo8 transcription factor. Taken together, these data suggest that PcFlo8 is expressed by Pc and may exert activity in organism adhesion and biofilm formation.
Evidence for a Pneumocystis carinii Flo8-like Transcription Factor: Insights into Organism Adhesion

PubMed Central

Kottom, Theodore J.; Limper, Andrew H.

2015-01-01

Pneumocystis carinii (Pc) adhesion to alveolar epithelial cells is well established and is thought to be a prerequisite for initiation of Pneumocystis pneumonia. Pc binding events occur in part through the major Pc surface glycoprotein Msg, as well as an integrin-like molecule termed PcInt1. Recent data from the Pc sequencing project also demonstrate DNA sequences homologous to other genes important in Candida spp. binding to mammalian host cells, as well as organism binding to polystyrene surfaces and in biofilm formation. One of these genes, flo8, a transcription factor needed for downstream cAMP/PKA-pathway-mediated activation of the major adhesin/flocculin Flo11 in yeast, was cloned from a Pc cDNA library utilizing a partial sequence available in the Pc genome database. A CHEF blot of Pc genomic DNA yielded a single band providing evidence this gene is present in the organism. BLASTP analysis of the predicted protein demonstrated 41% homology to the Saccharomyces cerevisiae Flo8. Northern blotting demonstrated greatest expression at pH 6.0–8.0, pH comparable to reported fungal biofilm milieu. Western blot and immunoprecipitation assays of PcFlo8 protein in isolated cyst and tropic life forms confirmed the presence of the cognate protein in these Pc life forms. Heterologous expression of Pcflo8 cDNA in flo8Δ (deficient) yeast strains demonstrated that the Pcflo8 was able to restore yeast binding to polystyrene and invasive growth of yeast flo8Δ cells. Furthermore, Pcflo8 promoted yeast binding to HEK293 human epithelial cells, strengthening its functional classification as a Flo8 transcription factor. Taken together, these data suggest that PcFlo8 is expressed by Pc and may exert activity in organism adhesion and biofilm formation. PMID:26215665
Techno-politics of genomic nationalism: tracing genomics and its use in drug regulation in Japan and Taiwan.

PubMed

Kuo, Wen-Hua

2011-10-01

This paper compares the development of genomics as a form of state project in Japan and Taiwan. Broadening the concepts of genomic sovereignty and bionationalism, I argue that the establishment and use of genomic databases vary according to techno-political context. While both Japan and Taiwan hold population-based databases to be necessary for scientific advance and competitiveness, they differ in how they have attempted to transform the information produced by databases into regulatory schemes for drug approval. The effectiveness of Taiwan's biobank is severely limited by the IRB reviewing process. By contrast, while updating its regulations for drug approval, Japan is using pharmacogenomics to deal with matters relating to ethnic identity. By analysing genomic initiatives in the political context that nurtures them, this paper seeks to capture how global science and local societies interact and offers insight into the assessment of state-sponsored science in East Asia as it becomes transnational. Copyright © 2011 Elsevier Ltd. All rights reserved.
Significance of genome-wide association studies in molecular anthropology.

PubMed

Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal

2009-12-01

The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving a genetic maze of complex traits especially the disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (like Human Genome Project; and even more importantly the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we have proposed a genome-wide association approach in molecular anthropological studies by providing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We have also highlighted the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.
The vaginal mycobiome: A contemporary perspective on fungi in women's health and diseases.

PubMed

Bradford, L Latéy; Ravel, Jacques

2017-04-03

Most of what is known about fungi in the human vagina has come from culture-based studies and phenotypic characterization of single organisms. Though valuable, these approaches have masked the complexity of fungal communities within the vagina. The vaginal mycobiome has become an emerging field of study as genomics tools are increasingly employed and we begin to appreciate the role these fungal communities play in human health and disease. Though vastly outnumbered by their bacterial counterparts, fungi are important constituents of the vaginal ecosystem in many healthy women. Candida albicans, an opportunistic fungal pathogen, colonizes 20% of women without causing any overt symptoms, yet it is one of the leading causes of infectious vaginitis. Understanding its mechanisms of both commensalism and pathogenesis is essential to developing more effective therapies. Describing the interactions between Candida, bacteria (such as Lactobacillus spp.) and other fungi in the vagina is fundamental to our characterization of the vaginal mycobiome.
High-frequency transformation of a methylotrophic yeast, Candida boidinii, with autonomously replicating plasmids which are also functional in Saccharomyces cerevisiae.

PubMed

Sakai, Y; Goh, T K; Tani, Y

1993-06-01

We have developed a transformation system which uses autonomous replicating plasmids for a methylotrophic yeast, Candida boidinii. Two autonomous replication sequences, CARS1 and CARS2, were newly cloned from the genome of C. boidinii. Plasmids having both a CARS fragment and the C. boidinii URA3 gene transformed C. boidinii ura3 cells to Ura+ phenotype at frequencies of up to 10^4 CFU/µg of DNA. From Southern blot analysis, CARS plasmids seemed to exist in polymeric forms as well as in monomeric forms in C. boidinii cells. The C. boidinii URA3 gene was overexpressed in C. boidinii on these CARS vectors. CARS1 and CARS2 were found to function as an autonomous replicating element in Saccharomyces cerevisiae as well. Different portions of the CARS1 sequence were needed for autonomous replicating activity in C. boidinii and S. cerevisiae. C. boidinii could also be transformed with vectors harboring a CARS fragment and the S. cerevisiae URA3 gene.
Yeast infection in a beached southern right whale (Eubalaena australis) neonate.

PubMed

Mouton, Marnel; Reeb, Desray; Botha, Alfred; Best, Peter

2009-07-01

A female southern right whale (Eubalaena australis) neonate was found stranded on the Western Cape coast of southern Africa. Skin samples were taken the same day from three different locations on the animal's body and stored at -20 °C. Isolation through repetitive culture of these skin sections yielded a single yeast species, Candida zeylanoides. Total genomic DNA also was isolated directly from skin samples. Polymerase chain reaction analysis of the internal transcribed spacer region of the fungal ribosomal gene cluster revealed the presence of Filobasidiella neoformans var. neoformans, the teleomorphic state of Cryptococcus neoformans. Fungal infections in cetaceans seem to be limited when compared to infections caused by bacteria, viruses and parasites. However, Candida species appear to be the most common type of fungal infection associated with cetaceans. To our knowledge this is the first report of a C. zeylanoides infection in a mysticete, as well as the first report of a dual infection involving two opportunistic pathogenic yeast species in a cetacean.
Application of different markers and data-analysis tools to the examination of biodiversity can lead to different results: a case study with Starmerella bacillaris (synonym Candida zemplinina) strains.

PubMed

Csoma, Hajnalka; Ács-Szabó, Lajos; Papp, László Attila; Sipiczki, Matthias

2018-08-01

Starmerella bacillaris (Candida zemplinina) is a genetically heterogeneous species. In this work, the diversity of 41 strains of various origins is examined and compared by the analysis of the length polymorphism of nuclear microsatellites and the RFLP of mitochondrial genomes. The band patterns are analysed with UPGMA, neighbor joining, neighbor net, minimum spanning tree and non-metric MDS algorithms. The results and their comparison to previous analyses demonstrate that different markers and different clustering methods can result in very different groupings of the same strains. The observed differences between the topologies of the dendrograms also indicate that the positions of the strains do not necessarily reflect their real genetic relationships and origins. The possibilities that the differences might be partially due to different sensitivity of the markers to environmental factors (selection pressure) and partially to the different grouping criteria of the algorithms are also discussed.
The vaginal mycobiome: A contemporary perspective on fungi in women's health and diseases

PubMed Central

2017-01-01

Most of what is known about fungi in the human vagina has come from culture-based studies and phenotypic characterization of single organisms. Though valuable, these approaches have masked the complexity of fungal communities within the vagina. The vaginal mycobiome has become an emerging field of study as genomics tools are increasingly employed and we begin to appreciate the role these fungal communities play in human health and disease. Though vastly outnumbered by their bacterial counterparts, fungi are important constituents of the vaginal ecosystem in many healthy women. Candida albicans, an opportunistic fungal pathogen, colonizes 20% of women without causing any overt symptoms, yet it is one of the leading causes of infectious vaginitis. Understanding its mechanisms of both commensalism and pathogenesis is essential to developing more effective therapies. Describing the interactions between Candida, bacteria (such as Lactobacillus spp.) and other fungi in the vagina is fundamental to our characterization of the vaginal mycobiome. PMID:27657355
The Biofuel Feedstock Genomics Resource: a web-based portal and database to enable functional genomics of plant biofuel feedstock species.

PubMed

Childs, Kevin L; Konganti, Kranti; Buell, C Robin

2012-01-01

Major feedstock sources for future biofuel production are likely to be high biomass producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genomic-based investigations in these species, we developed the Biofuel Feedstock Genomic Resource (BFGR), a database and web-portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome level. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. The integrated nature of the BFGR in terms of annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences allows comparative analyses for biofuel feedstock species with limited sequence resources. Database URL: http://bfgr.plantbiology.msu.edu.
Human Ageing Genomic Resources: new and updated databases

PubMed Central

Tacutu, Robi; Thornton, Daniel; Johnson, Emily; Budovsky, Arie; Barardo, Diogo; Craig, Thomas; Diana, Eugene; Lehmann, Gilad; Toren, Dmitri; Wang, Jingwei; Fraifeld, Vadim E

2018-01-01

In spite of a growing body of research and data, human ageing remains a poorly understood process. Over 10 years ago we developed the Human Ageing Genomic Resources (HAGR), a collection of databases and tools for studying the biology and genetics of ageing. Here, we present HAGR’s main functionalities, highlighting new additions and improvements. HAGR consists of six core databases: (i) the GenAge database of ageing-related genes, in turn composed of a dataset of >300 human ageing-related genes and a dataset with >2000 genes associated with ageing or longevity in model organisms; (ii) the AnAge database of animal ageing and longevity, featuring >4000 species; (iii) the GenDR database with >200 genes associated with the life-extending effects of dietary restriction; (iv) the LongevityMap database of human genetic association studies of longevity with >500 entries; (v) the DrugAge database with >400 ageing or longevity-associated drugs or compounds; (vi) the CellAge database with >200 genes associated with cell senescence. All our databases are manually curated by experts and regularly updated to ensure high-quality data. Cross-links across our databases and to external resources help researchers locate and integrate relevant information. HAGR is freely available online (http://genomics.senescence.info/). PMID:29121237
MIPSPlantsDB—plant database resource for integrative and comparative plant genome research

PubMed Central

Spannagl, Manuel; Noubibou, Octave; Haase, Dirk; Yang, Li; Gundlach, Heidrun; Hindemitt, Tobias; Klee, Kathrin; Haberer, Georg; Schoof, Heiko; Mayer, Klaus F. X.

2007-01-01

Genome-oriented plant research delivers a rapidly increasing amount of plant genome data. Comprehensive and structured information resources are required to structure and communicate genome and associated analytical data for model organisms as well as for crops. The increase in available plant genomic data enables powerful comparative analysis and integrative approaches. PlantsDB aims to provide data and information resources for individual plant species and in addition to build a platform for integrative and comparative plant genome research. PlantsDB is constituted from genome databases for Arabidopsis, Medicago, Lotus, rice, maize and tomato. Complementary data resources for cis elements, repetitive elements and extensive cross-species comparisons are implemented. The PlantsDB portal can be reached at . PMID:17202173
The COG database: new developments in phylogenetic classification of proteins from complete genomes

PubMed Central

Tatusov, Roman L.; Natale, Darren A.; Garkavtsev, Igor V.; Tatusova, Tatiana A.; Shankavaram, Uma T.; Rao, Bachoti S.; Kiryutin, Boris; Galperin, Michael Y.; Fedorova, Natalie D.; Koonin, Eugene V.

2001-01-01

The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt at a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and shared with bacteria and/or archaea were included. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis. PMID:11125040
Genomic Target Database (GTD): A database of potential targets in human pathogenic bacteria

PubMed Central

Barh, Debmalya; Kumar, Anil; Misra, Amarendra Narayana

2009-01-01

A Genomic Target Database (GTD) has been developed containing putative genomic drug targets for human bacterial pathogens. The selected pathogens are either drug resistant or lack a vaccine. The drug targets have been identified using subtractive genomics approaches and are subsequently classified into drug targets in pathogen-specific unique metabolic pathways, drug targets in host-pathogen common metabolic pathways, and membrane-localized drug targets. HTML code is used to link each target to its various properties and other available public resources. Essential resources and tools for subtractive genomic analysis, sub-cellular localization, and vaccine and drug design are also mentioned. To the best of the authors' knowledge, no such database (DB) is presently available that lists metabolic pathway and membrane-specific genomic drug targets based on subtractive genomics. Targets listed in GTD are a readily available resource for developing drugs and vaccines against the respective pathogen, its subtypes, and other family members. Currently GTD contains 58 drug targets for four pathogens. Shortly, drug targets for six more pathogens will be listed. Availability: GTD is available at the IIOAB website http://www.iioab.webs.com/GTD.htm. It can also be accessed at http://www.iioabdgd.webs.com. GTD is free for academic research and non-commercial use only. Commercial use is strictly prohibited without prior permission from IIOAB. PMID:20011153
The YeastGenome app: the Saccharomyces Genome Database at your fingertips.

PubMed

Wong, Edith D; Karra, Kalpana; Hitz, Benjamin C; Hong, Eurie L; Cherry, J Michael

2013-01-01

The Saccharomyces Genome Database (SGD) is a scientific database that provides researchers with high-quality curated data about the genes and gene products of Saccharomyces cerevisiae. To provide instant and easy access to this information on mobile devices, we have developed YeastGenome, a native application for the Apple iPhone and iPad. YeastGenome can be used to quickly find basic information about S. cerevisiae genes and chromosomal features regardless of internet connectivity. With or without network access, you can view basic information and Gene Ontology annotations about a gene of interest by searching gene names and gene descriptions or by browsing the database within the app to find the gene of interest. With internet access, the app provides more detailed information about the gene, including mutant phenotypes, references and protein and genetic interactions, as well as provides hyperlinks to retrieve detailed information by showing SGD pages and views of the genome browser. SGD provides online help describing basic ways to navigate the mobile version of SGD, highlights key features and answers frequently asked questions related to the app. The app is available from iTunes (http://itunes.com/apps/yeastgenome). The YeastGenome app is provided freely as a service to our community, as part of SGD's mission to provide free and open access to all its data and annotations.
Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA.

PubMed

Osypov, Alexander A; Krutinin, Gleb G; Kamzolova, Svetlana G

2010-06-01

The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.
MaizeGDB: The Maize Genetics and Genomics Database.

USDA-ARS's Scientific Manuscript database

MaizeGDB is the community database for biological information about the crop plant Zea mays. Genomic, genetic, sequence, gene product, functional characterization, literature reference, and person/organization contact information are among the datatypes stored at MaizeGDB. At the project’s website...
Virus Database and Online Inquiry System Based on Natural Vectors.

PubMed

Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St

2017-01-01

We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from the National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and back-end processes for automatic and manual updating of database content to synchronize with GenBank. Submitted genome data in FASTA format will be processed, and the prediction results, with the 5 closest neighbors and their classifications, will be returned by email. Considering the one-to-one correspondence between sequence and natural vector, time efficiency, and high accuracy, the natural vector is a significant advance compared with alignment methods, which makes VirusDB a useful database for further research.
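The natural-vector lookup described in this abstract can be illustrated with a small Python sketch. It assumes the commonly used 12-dimensional form of the natural vector (per-nucleotide count, mean position, and normalized second central moment) and Euclidean distance; the abstract does not state which variant or distance VirusDB uses, and the function names `natural_vector` and `closest_neighbors` are hypothetical.

```python
import math


def natural_vector(seq):
    """Return a 12-dimensional natural vector for a DNA sequence.

    Assumed layout: counts for A, C, G, T, then mean positions,
    then normalized second central moments (positions are 1-based).
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        mu_k = sum(positions) / n_k if n_k else 0.0
        d2_k = (
            sum((p - mu_k) ** 2 for p in positions) / (n_k * n) if n_k else 0.0
        )
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def closest_neighbors(query, references, k=5):
    """Rank reference sequences by Euclidean distance between natural vectors.

    `references` maps a (hypothetical) label to a sequence string.
    """
    query_vec = natural_vector(query)
    scored = sorted(
        (math.dist(query_vec, natural_vector(seq)), name)
        for name, seq in references.items()
    )
    return [name for _, name in scored[:k]]
```

Because each vector is computed in a single pass over the sequence, no alignment step is needed, which is the time-efficiency argument the abstract makes.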
Strategies to improve reference databases for soil microbiomes

DOE PAGES

Choi, Jinlyung; Yang, Fan; Stepanauskas, Ramunas; ...

2016-12-09

A database of curated genomes is needed to better assess soil microbial communities and their processes associated with differing land management and environmental impacts. Interpreting soil metagenomic datasets with existing sequence databases is challenging because these datasets are biased towards medical and biotechnology research and can result in misleading annotations. We have curated a database of 928 genomes of soil-associated organisms (888 bacteria, 34 archaea, and 6 fungi). Using this database as a representation of the current state of knowledge of soil microbes that are well-characterized, we evaluated its composition and compared it to broader microbial databases, specifically NCBI’s RefSeq, as well as 3,035 publicly available soil amplicon datasets. These comparisons identified phyla and functions that are enriched in soils as well as those that may be underrepresented in RefSoil. For example, RefSoil was observed to have increased representation of Firmicutes despite its low abundance in soil environments and also lacked representation of Acidobacteria and Verrucomicrobia, which are abundant in soils. Our comparison of RefSoil to soil amplicon datasets allowed us to identify targets that if cultured or sequenced would significantly increase the biodiversity represented within RefSoil. To demonstrate the opportunities to access these underrepresented targets, we employed single cell genomics in a pilot experiment to recover 14 genomes from the "most wanted" list, which improved RefSoil's representation of EMP sequences by 7% by abundance. This effort demonstrates the value of RefSoil in the guidance of future research efforts and the capability of single cell genomics as a practical means to fill the existing genomic data gaps.
Phylogenomics databases for facilitating functional genomics in rice.

PubMed

Jung, Ki-Hong; Cao, Peijian; Sharma, Rita; Jain, Rashmi; Ronald, Pamela C

2015-12-01

The completion of the whole genome sequence of rice (Oryza sativa) has significantly accelerated functional genomics studies. Prior to the release of the sequence, only a few genes were assigned a function each year. Since sequencing was completed in 2005, the rate has exponentially increased. As of 2014, 1,021 genes have been described and added to the collection at The Overview of functionally characterized Genes in Rice online database (OGRO). Despite this progress, that number is still very low compared with the total number of genes estimated in the rice genome. One limitation to progress is the presence of functional redundancy among members of the same rice gene family, which covers 51.6% of all non-transposable element-encoding genes. There remains a significant portion of rice genes that are not functionally redundant, as reflected in the recovery of loss-of-function mutants. To more accurately analyze functional redundancy in the rice genome, we have developed phylogenomics databases for six large gene families in rice, including those for glycosyltransferases, glycoside hydrolases, kinases, transcription factors, transporters, and cytochrome P450 monooxygenases. In this review, we introduce key features and applications of these databases. We expect that they will serve as a very useful guide in the post-genomics era of research.
The SUPERFAMILY database in 2004: additions and improvements.

PubMed

Madera, Martin; Vogel, Christine; Kummerfeld, Sarah K; Chothia, Cyrus; Gough, Julian

2004-01-01

The SUPERFAMILY database provides structural assignments to protein sequences and a framework for analysis of the results. At the core of the database is a library of profile Hidden Markov Models that represent all proteins of known structure. The library is based on the SCOP classification of proteins: each model corresponds to a SCOP domain and aims to represent an entire superfamily. We have applied the library to predicted proteins from all completely sequenced genomes (currently 154), the Swiss-Prot and TrEMBL databases and other sequence collections. Close to 60% of all proteins have at least one match, and one half of all residues are covered by assignments. All models and full results are available for download and online browsing at http://supfam.org. Users can study the distribution of their superfamily of interest across all completely sequenced genomes, investigate with which other superfamilies it combines and retrieve proteins in which it occurs. Alternatively, concentrating on a particular genome as a whole, it is possible first, to find out its superfamily composition, and secondly, to compare it with that of other genomes to detect superfamilies that are over- or under-represented. In addition, the webserver provides the following standard services: sequence search; keyword search for genomes, superfamilies and sequence identifiers; and multiple alignment of genomic, PDB and custom sequences.
Description of Kuraishia piskuri f.a., sp. nov., a new methanol assimilating yeast and transfer of phylogenetically related Candida species to the genera Kuraishia and Nakazawaea as new combinations.

PubMed

Kurtzman, Cletus P; Robnett, Christie J

2014-11-01

The new anamorphic yeast Kuraishia piskuri, f.a., sp. nov. is described for three strains that were isolated from insect frass from trees growing in Florida, USA (type strain, NRRL YB-2544, CBS 13714). Species placement was based on phylogenetic analysis of nuclear gene sequences for the D1/D2 domains of large subunit rRNA, small subunit rRNA, translation elongation factor-1α, and subunits B1 and B2 of RNA polymerase II B. From this analysis, the anamorphic species Candida borneana, Candida cidri, Candida floccosa, Candida hungarica, and Candida ogatae were transferred to the genus Kuraishia as new combinations and Candida anatomiae, Candida ernobii, Candida ishiwadae, Candida laoshanensis, Candida molendini-olei, Candida peltata, Candida pomicola, Candida populi, Candida wickerhamii, and Candida wyomingensis were transferred to the genus Nakazawaea. Published 2014. This article is a U.S. Government work and is in the public domain in the USA.
The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

PubMed Central

Overbeek, Ross; Olson, Robert; Pusch, Gordon D.; Olsen, Gary J.; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Parrello, Bruce; Shukla, Maulik; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang; Stevens, Rick

2014-01-01

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources. PMID:24293654
Absence of Photoreactivating Enzyme in Candida albicans, Candida stellatoidea, and Candida tropicalis

PubMed Central

Miller, Glendon R.; Sarachek, Alvin

1974-01-01

In vitro assays demonstrate photoreactivating enzyme activity in extracts of Candida pseudotropicalis but not in extracts of Candida albicans, Candida stellatoidea, or Candida tropicalis. PMID:4604052
A knowledge base for tracking the impact of genomics on population health.

PubMed

Yu, Wei; Gwinn, Marta; Dotson, W David; Green, Ridgely Fisk; Clyne, Mindy; Wulf, Anja; Bowen, Scott; Kolor, Katherine; Khoury, Muin J

2016-12-01

We created an online knowledge base (the Public Health Genomics Knowledge Base (PHGKB)) to provide systematically curated and updated information that bridges population-based research on genomics with clinical and public health applications. Weekly horizon scanning of a wide variety of online resources is used to retrieve relevant scientific publications, guidelines, and commentaries. After curation by domain experts, links are deposited into Web-based databases. PHGKB currently consists of nine component databases. Users can search the entire knowledge base or search one or more component databases directly and choose options for customizing the display of their search results. PHGKB offers researchers, policy makers, practitioners, and the general public a way to find information they need to understand the complicated landscape of genomics and population health. Genet Med 18 12, 1312-1314.
Outreach and online training services at the Saccharomyces Genome Database.

PubMed

MacPherson, Kevin A; Starr, Barry; Wong, Edith D; Dalusag, Kyla S; Hellerstedt, Sage T; Lang, Olivia W; Nash, Robert S; Skrzypek, Marek S; Engel, Stacia R; Cherry, J Michael

2017-01-01

The Saccharomyces Genome Database (SGD; www.yeastgenome.org), the primary genetics and genomics resource for the budding yeast S. cerevisiae, provides free public access to expertly curated information about the yeast genome and its gene products. As the central hub for the yeast research community, SGD engages in a variety of social outreach efforts to inform our users about new developments, promote collaboration, increase public awareness of the importance of yeast to biomedical research, and facilitate scientific discovery. Here we describe these various outreach methods, from networking at scientific conferences to the use of online media such as blog posts and webinars, and include our perspectives on the benefits provided by outreach activities for model organism databases. http://www.yeastgenome.org. © The Author(s) 2017. Published by Oxford University Press.
The MAR databases: development and implementation of databases specific for marine metagenomics

PubMed Central

Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen

2018-01-01

We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. PMID:29106641

Occurrence and diversity of Candida genus in marine environments

NASA Astrophysics Data System (ADS)

Wang, Lin; Chi, Zhenming; Yue, Lixi; Chi, Zhe; Zhang, Dechao

2008-11-01

A total of 317 yeast isolates were obtained from seawater, sediments, mud of salterns, guts of marine fishes and marine algae. The results of routine identification and molecular characterization showed that six isolates among these marine yeasts belonged to the genus Candida: YA01a was identified as Candida intermedia, 3eA2 as Candida parapsilosis, JHSb as Candida quercitrusa, wl8 as Candida rugosa, TJY13a as Candida zeylanoides, and W14-3 as Candida membranifaciens. Isolates YA01a (Candida intermedia), wl8 (Candida rugosa), 3eA2 (Candida parapsilosis), and JHSb (Candida quercitrusa) were found to produce cell-bound lipase, while isolate W14-3 (Candida membranifaciens) produced riboflavin. These marine Candida spp. seem to have wide potential applications in biotechnology.
Reinventing MaizeGDB

USDA-ARS's Scientific Manuscript database

The Maize Database (MaizeDB) to the Maize Genetics and Genomics Database (MaizeGDB) turns 20 this year, and such a significant milestone must be celebrated! With the release of the B73 reference sequence and more sequenced genomes on the way, the maize community needs to address various opportunitie...
The antiSMASH database, a comprehensive database of microbial secondary metabolite biosynthetic gene clusters.

PubMed

Blin, Kai; Medema, Marnix H; Kottmann, Renzo; Lee, Sang Yup; Weber, Tilmann

2017-01-04

Secondary metabolites produced by microorganisms are the main source of bioactive compounds that are in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far, antiSMASH is limited to de novo computing results for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource to browse antiSMASH-annotated BGCs in the currently 3907 bacterial genomes in the database and perform advanced search queries combining multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now

PubMed Central

Engel, Stacia R.; Dietrich, Fred S.; Fisk, Dianna G.; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C.; Dwight, Selina S.; Hitz, Benjamin C.; Karra, Kalpana; Nash, Robert S.; Weng, Shuai; Wong, Edith D.; Lloyd, Paul; Skrzypek, Marek S.; Miyasato, Stuart R.; Simison, Matt; Cherry, J. Michael

2014-01-01

The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639
Translational genomics for plant breeding with the genome sequence explosion.

PubMed

Kang, Yang Jae; Lee, Taeyoung; Lee, Jayern; Shim, Sangrea; Jeong, Haneul; Satyawan, Dani; Kim, Moon Young; Lee, Suk-Ha

2016-04-01

The use of next-generation sequencers and advanced genotyping technologies has propelled the field of plant genomics in model crops and plants and enhanced the discovery of hidden bridges between genotypes and phenotypes. The newly generated reference sequences of unstudied minor plants can be annotated by the knowledge of model plants via translational genomics approaches. Here, we reviewed the strategies of translational genomics and suggested perspectives on the current databases of genomic resources and the database structures of translated information on the new genome. As a draft picture of phenotypic annotation, translational genomics on newly sequenced plants will provide valuable assistance for breeders and researchers who are interested in genetic studies. © 2015 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Design and implementation of a database for Brucella melitensis genome annotation.

PubMed

De Hertogh, Benoît; Lahlimi, Leïla; Lambert, Christophe; Letesson, Jean-Jacques; Depiereux, Eric

2008-03-18

The genome sequences of three Brucella biovars and of some species close to Brucella sp. have become available, leading to new relationship analysis. Moreover, the automatic genome annotation of the pathogenic bacterium Brucella melitensis has been manually corrected by a consortium of experts, leading to 899 modifications of start site predictions among the 3198 open reading frames (ORFs) examined. This new annotation, coupled with the results of automatic annotation tools of the complete genome sequences of the B. melitensis genome (including BLASTs to 9 genomes close to Brucella), provides numerous data sets related to predicted functions, biochemical properties and phylogenetic comparisons. To make these results available, alphaPAGe, a functional auto-updatable database of the corrected sequence genome of B. melitensis, has been built, using the entity-relationship (ER) approach and a multi-purpose database structure. A friendly graphical user interface has been designed, and users can retrieve different kinds of information through three levels of queries: (1) the basic search uses the classical keywords or sequence identifiers; (2) the original advanced search engine allows combining (by using logical operators) numerous criteria: (a) keywords (textual comparison) related to the pCDS's function, family domains and cellular localization; (b) physico-chemical characteristics (numerical comparison) such as isoelectric point or molecular weight and structural criteria such as the nucleic length or the number of transmembrane helices (TMHs); (c) similarity scores with Escherichia coli and 10 species phylogenetically close to B. melitensis; (3) complex queries can be performed by using a SQL field, which allows all queries respecting the database's structure. The database is publicly available through a Web server at the following URL: http://www.fundp.ac.be/urbm/bioinfo/aPAGe.
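The advanced search described in this abstract, which combines textual and numerical criteria with logical operators, might look roughly like the following Python sketch using the standard `sqlite3` module. The table name `pcds`, its columns, and the sample rows are all invented for illustration; the abstract does not publish the actual alphaPAGe schema or database engine.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical pCDS table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pcds ("
        "id TEXT PRIMARY KEY, "
        "predicted_function TEXT, "
        "isoelectric_point REAL, "
        "mol_weight REAL, "
        "tmh_count INTEGER)"
    )
    # Made-up example rows, not real B. melitensis annotations.
    conn.executemany(
        "INSERT INTO pcds VALUES (?, ?, ?, ?, ?)",
        [
            ("demo_0001", "ABC transporter permease", 9.1, 31500.0, 6),
            ("demo_0002", "transcriptional regulator", 6.2, 24800.0, 0),
            ("demo_0003", "sugar transporter", 5.4, 45200.0, 12),
        ],
    )
    return conn


def advanced_search(conn, keyword, min_pi, min_tmh):
    """Keyword match AND (isoelectric point OR helix-count threshold)."""
    sql = (
        "SELECT id FROM pcds "
        "WHERE predicted_function LIKE ? "
        "AND (isoelectric_point >= ? OR tmh_count >= ?) "
        "ORDER BY id"
    )
    rows = conn.execute(sql, (f"%{keyword}%", min_pi, min_tmh))
    return [row[0] for row in rows]
```

Passing user input as bound parameters rather than interpolating it into the SQL string is a design choice of this sketch; a free-form SQL field such as the one the abstract mentions would need its own safeguards.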
A low-latency, big database system and browser for storage, querying and visualization of 3D genomic data.

PubMed

Butyaev, Alexander; Mavlyutov, Ruslan; Blanchette, Mathieu; Cudré-Mauroux, Philippe; Waldispühl, Jérôme

2015-09-18

Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, the storage technology and visualization tools need to evolve to offer to the scientific community fast and convenient access to these data. We introduce simultaneously a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions, and importantly gracefully scales with the size of data. We also illustrate the usefulness of our 3D genome Web browser to explore human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
A low-latency, big database system and browser for storage, querying and visualization of 3D genomic data

PubMed Central

Butyaev, Alexander; Mavlyutov, Ruslan; Blanchette, Mathieu; Cudré-Mauroux, Philippe; Waldispühl, Jérôme

2015-01-01

Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, the storage technology and visualization tools need to evolve to offer to the scientific community fast and convenient access to these data. We introduce simultaneously a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions, and importantly gracefully scales with the size of data. We also illustrate the usefulness of our 3D genome Web browser to explore human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/. PMID:25990738
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.

PubMed

Wang, Dapeng; Xu, Jiayue; Yu, Jun

2015-09-16

The K-mer approach, which treats genomic sequences as simple character strings and counts the relative abundance of each string for a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we developed a versatile database, KGCAK ( http://kgcak.big.ac.cn/KGCAK/ ), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationships within and among groups of species in a tree of life based on genomic data.
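As a rough illustration of the K-mer approach described in this abstract, the following Python sketch counts the relative abundance of every K-length substring in a sequence and compares two sequences without aligning them. The function names, the toy inputs and the choice of Euclidean distance are assumptions made for illustration only; the abstract does not state which distance measure KGCAK uses.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq, k):
    """Return the relative abundance of each k-mer in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def kmer_distance(seq_a, seq_b, k):
    """Euclidean distance between two k-mer frequency profiles (an illustrative choice)."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    keys = set(fa) | set(fb)
    return sqrt(sum((fa.get(x, 0.0) - fb.get(x, 0.0)) ** 2 for x in keys))
```

A matrix of such pairwise distances could then be passed to any standard tree-building method; that step is omitted here.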
Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome.

PubMed

Wang, Yi; Liu, Xianju; Ren, Chong; Zhong, Gan-Yuan; Yang, Long; Li, Shaohua; Liang, Zhenchang

2016-04-21

CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of humans, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific target sites for CRISPR/Cas9 have been computationally identified for several annual model and crop species, but such sites have not been reported for perennial, woody fruit species. In this study, we identified and characterized five types of CRISPR/Cas9 target sites in the widely cultivated grape species Vitis vinifera and developed a user-friendly database for editing grape genomes in the future. A total of 35,767,960 potential CRISPR/Cas9 target sites were identified from grape genomes in this study. Among them, 22,597,817 target sites were mapped to specific genomic locations and 7,269,788 were found to be highly specific. Protospacers and PAMs were found to distribute uniformly and abundantly in the grape genomes. They were present in all the structural elements of genes with the coding region having the highest abundance. Five PAM types, TGG, AGG, GGG, CGG and NGG, were observed. With the exception of the NGG type, they were abundantly present in the grape genomes. Synteny analysis of similar genes revealed that the synteny of protospacers matched the synteny of homologous genes. A user-friendly database containing protospacers and detailed information of the sites was developed and is available for public use at the Grape-CRISPR website ( http://biodb.sdau.edu.cn/gc/index.html ). Grape genomes harbour millions of potential CRISPR/Cas9 target sites. These sites are widely distributed among and within chromosomes with predominant abundance in the coding regions of genes. We developed a publicly-accessible Grape-CRISPR database for facilitating the use of the CRISPR/Cas9 system as a genome editing tool for functional studies and molecular breeding of grapes. 
Among other functions, the database allows users to identify and select multi-protospacers for editing similar sequences in grape genomes simultaneously.
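The core idea of locating candidate target sites, a protospacer immediately followed by an NGG PAM, can be sketched in Python as below. This is a minimal sketch and not the pipeline used by the authors: the function names, the 20-nt protospacer length and the strand convention are assumptions, only unambiguous A/C/G/T bases are matched, and no specificity (off-target) filtering is attempted.

```python
import re


def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target_sites(seq, protospacer_len=20):
    """Return (strand, start, protospacer, pam) for every protospacer + NGG site.

    For the "-" strand, start is a 0-based offset in the reverse-complemented
    sequence, not in the original coordinates.
    """
    seq = seq.upper()
    # The lookahead makes the match zero-width so overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites
```

Grouping the returned sites by their `pam` value gives per-PAM counts for the TGG, AGG, GGG and CGG types mentioned in the abstract.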
The Saccharomyces Genome Database Variant Viewer

PubMed Central

Sheppard, Travis K.; Hitz, Benjamin C.; Engel, Stacia R.; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C.; Dalusag, Kyla S.; Demeter, Janos; Hellerstedt, Sage T.; Karra, Kalpana; Nash, Robert S.; Paskov, Kelley M.; Skrzypek, Marek S.; Weng, Shuai; Wong, Edith D.; Cherry, J. Michael

2016-01-01

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. PMID:26578556
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation.

PubMed

Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

2017-01-04

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
GnomeView: A tool for visual representation of human genome data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pelkey, J.E.; Thomas, G.S.; Thurman, D.A.

1993-02-01

GnomeView is a tool for exploring data generated by the Human Genome Project. GnomeView provides both graphical and textual styles of data presentation, employs an intuitive window-based graphical query interface, and integrates its underlying genome databases in such a way that the user can navigate smoothly across databases and between different levels of data. This paper describes GnomeView and discusses how it addresses various genome informatics issues.
The 2018 Nucleic Acids Research database issue and the online molecular biology database collection.

PubMed

Rigden, Daniel J; Fernández, Xosé M

2018-01-04

The 2018 Nucleic Acids Research Database Issue contains 181 papers spanning molecular biology. Among them, 82 are new and 84 are updates describing resources that appeared in the Issue previously. The remaining 15 cover databases most recently published elsewhere. Databases in the area of nucleic acids include 3DIV for visualisation of data on genome 3D structure and RNArchitecture, a hierarchical classification of RNA families. Protein databases include the established SMART, ELM and MEROPS while GPCRdb and the newcomer STCRDab cover families of biomedical interest. In the area of metabolism, HMDB and Reactome both report new features while PULDB appears in NAR for the first time. This issue also contains reports on genomics resources including Ensembl, the UCSC Genome Browser and ENCODE. Update papers from the IUPHAR/BPS Guide to Pharmacology and DrugBank are highlights of the drug and drug target section while a number of proteomics databases including proteomicsDB are also covered. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, reviewing 138 entries, adding 88 new resources and eliminating 47 discontinued URLs, bringing the current total to 1737 databases. It is available at http://www.oxfordjournals.org/nar/database/c/. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.

PubMed

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-03-01

Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
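For readers unfamiliar with the Smith-Waterman algorithm mentioned in this abstract, the Python sketch below computes the best local alignment score between two sequences by dynamic programming. It is a simplified illustration under stated assumptions: it uses a fixed match/mismatch score and a linear gap penalty (hypothetical default values), whereas protein comparisons normally use a substitution matrix and often affine gaps, and it returns only the score, not the alignment coordinates that ProteinWorldDB stores.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Because every cell of the matrix is filled, the cost grows with the product of the two sequence lengths, which is why an all-against-all comparison of millions of sequences needed grid-scale computing.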
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context

PubMed Central

Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi

2007-01-01

Background: Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. Results: lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. Conclusion: lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired. PMID:17877794
GenomeRNAi: a database for cell-based RNAi phenotypes.

PubMed

Horn, Thomas; Arziman, Zeynep; Berger, Juerg; Boutros, Michael

2007-01-01

RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasingly large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed both in Caenorhabditis elegans and Drosophila for a variety of phenotypes and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays from visible phenotypes to focused transcriptional readouts and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at http://rnai.dkfz.de.
GenomeRNAi: a database for cell-based RNAi phenotypes

PubMed Central

Horn, Thomas; Arziman, Zeynep; Berger, Juerg; Boutros, Michael

2007-01-01

RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasingly large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed both in Caenorhabditis elegans and Drosophila for a variety of phenotypes and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays from visible phenotypes to focused transcriptional readouts and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at http://rnai.dkfz.de. PMID:17135194
The BIG Data Center: from deposition to integration to translation

PubMed Central

2017-01-01

Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. PMID:27899658
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context.

PubMed

Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi

2007-09-18

Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aide to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.

Bolbase: a comprehensive genomics database for Brassica oleracea.

PubMed

Yu, Jingyin; Zhao, Meixia; Wang, Xiaowu; Tong, Chaobo; Huang, Shunmou; Tehrim, Sadia; Liu, Yumei; Hua, Wei; Liu, Shengyi

2013-09-30

Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, Brussels sprouts. This diversity along with its phylogenetic membership in a group of three diploid and three tetraploid species, and the recent availability of genome sequences within Brassica provide an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives. We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting. Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research. 
This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase.
Benchmarking distributed data warehouse solutions for storing genomic variant information

PubMed Central

Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

2017-01-01

Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not been sufficiently explored so far in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of the distributed back-ends offer good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu, on the other hand, is the only solution that guarantees sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries.
In summary, research and clinical applications that require the storage and analysis of variants from thousands of samples can benefit from the scalability and performance of distributed data warehouse solutions. Database URL: https://github.com/ZSI-Bio/variantsdwh PMID:29220442
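The two query styles discussed in this abstract, a simple genome range query and a pre-aggregated table that avoids repeated joins and grouping, can be illustrated with the Python sketch below. It uses an in-memory SQLite database purely as a stand-in: SQLite is not one of the distributed engines benchmarked in the paper, and the table names, columns and sample rows are hypothetical, not the schema of the authors' data warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal variant table; the real warehouse schema is not given in the abstract.
conn.execute(
    "CREATE TABLE variants (sample_id TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT)"
)
conn.executemany(
    "INSERT INTO variants VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "chr1", 1000, "A", "G"),
        ("s2", "chr1", 1000, "A", "G"),
        ("s1", "chr1", 5000, "C", "T"),
        ("s2", "chr2", 1500, "G", "A"),
    ],
)
# An index on (chrom, pos) lets range lookups avoid a full table scan.
conn.execute("CREATE INDEX idx_region ON variants (chrom, pos)")

# Pre-aggregation: store per-variant sample counts once instead of regrouping per query.
conn.execute(
    "CREATE TABLE variant_counts AS "
    "SELECT chrom, pos, ref, alt, COUNT(DISTINCT sample_id) AS n_samples "
    "FROM variants GROUP BY chrom, pos, ref, alt"
)


def variants_in_range(connection, chrom, start, end):
    """Genome range query: all variant calls on chrom with start <= pos <= end."""
    return connection.execute(
        "SELECT sample_id, pos, ref, alt FROM variants "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? ORDER BY pos, sample_id",
        (chrom, start, end),
    ).fetchall()
```

Calling `variants_in_range(conn, "chr1", 900, 2000)` returns the two calls at position 1000, while frequency-style questions can read `variant_counts` directly.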
Comparative evaluation of six chromogenic media for presumptive yeast identification.

PubMed

Vecchione, Alessandra; Florio, Walter; Celandroni, Francesco; Barnini, Simona; Lupetti, Antonella; Ghelardi, Emilia

2017-12-01

The present study was undertaken to evaluate the discrimination ability of six chromogenic media in presumptive yeast identification. We analysed 108 clinical isolates and reference strains belonging to eight different species: Candida albicans, Candida dubliniensis, Candida tropicalis, Candida krusei, Candida glabrata, Candida parapsilosis, Candida lusitaniae and Trichosporon mucoides. C. albicans, C. tropicalis and C. krusei could be distinguished from one another in all the tested chromogenic media, as predicted by the manufacturers. In addition, C. albicans could be distinguished from C. dubliniensis on BBL CHROMagar Candida, Kima CHROMagar Candida and Brilliance Candida, and C. parapsilosis could be identified on CHROMATIC Candida agar, CHROMOGENIC Candida agar, and Brilliance Candida agar. Brilliance Candida provided the widest discrimination ability, being able to discriminate five out of the seven Candida species tested. Interestingly, C. tropicalis and C. krusei could be already distinguished from each other after 24 hours of incubation. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
The Aspergillus Genome Database: multispecies curation and incorporation of RNA-Seq data to improve structural gene annotations.

PubMed

Cerqueira, Gustavo C; Arnaud, Martha B; Inglis, Diane O; Skrzypek, Marek S; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Binkley, Jonathan; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Sherlock, Gavin; Wortman, Jennifer R

2014-01-01

The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although those primarily comprise the addition and extension of untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.
Ontology-oriented retrieval of putative microRNAs in Vitis vinifera via GrapeMiRNA: a web database of de novo predicted grape microRNAs.

PubMed

Lazzari, Barbara; Caprera, Andrea; Cestaro, Alessandro; Merelli, Ivan; Del Corvo, Marcello; Fontana, Paolo; Milanesi, Luciano; Velasco, Riccardo; Stella, Alessandra

2009-06-29

Two complete genome sequences are available for Vitis vinifera Pinot noir. Based on the sequence and gene predictions produced by the IASMA, we performed an in silico detection of putative microRNA genes and of their targets, and collected the most reliable microRNA predictions in a web database. The application is available at http://www.itb.cnr.it/ptp/grapemirna/. The program FindMiRNA was used to detect putative microRNA genes in the grape genome. A very high number of predictions was retrieved, calling for validation. Nine parameters were calculated and, based on the grape microRNAs dataset available at miRBase, thresholds were defined and applied to FindMiRNA predictions having targets in gene exons. In the resulting subset, predictions were ranked according to precursor positions and sequence similarity, and to target identity. To further validate FindMiRNA predictions, comparisons to the Arabidopsis genome, to the grape Genoscope genome, and to the grape EST collection were performed. Results were stored in a MySQL database and a web interface was prepared to query the database and retrieve predictions of interest. The GrapeMiRNA database encompasses 5,778 microRNA predictions spanning the whole grape genome. Predictions are integrated with information that can be of use in selection procedures. Tools added in the web interface also allow users to inspect predictions according to gene ontology classes and metabolic pathways of targets. The GrapeMiRNA database can be of help in selecting candidate microRNA genes to be validated.
User Guidelines for the Brassica Database: BRAD.

PubMed

Wang, Xiaobo; Cheng, Feng; Wang, Xiaowu

2016-01-01

The genome sequence of Brassica rapa was first released in 2011. Since then, further Brassica genomes have been sequenced or are undergoing sequencing. It is therefore necessary to develop tools that help users to mine information from genomic data efficiently. This will greatly aid scientific exploration and breeding application, especially for those with low levels of bioinformatic training. Therefore, the Brassica database (BRAD) was built to collect, integrate, illustrate, and visualize Brassica genomic datasets. BRAD provides useful searching and data mining tools, and facilitates the search of gene annotation datasets, syntenic or non-syntenic orthologs, and flanking regions of functional genomic elements. It also includes genome-analysis tools such as BLAST and GBrowse. One of the important aims of BRAD is to build a bridge between Brassica crop genomes and the genome of the model species Arabidopsis thaliana, thus transferring the bulk of A. thaliana gene study information for use with newly sequenced Brassica crops.
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

PubMed

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework with an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three clear advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species.
We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
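The notion of a minimal cluster, two adjacent genes classified as co-directional, convergent or divergent and filtered by the distance between their transcription start sites, can be sketched in Python as follows. The `Gene` record, the function names and the example coordinates are hypothetical and are not taken from LCGbase; the sketch also assumes that all genes lie on a single chromosome.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate on the chromosome
    end: int     # rightmost coordinate on the chromosome
    strand: str  # "+" or "-"


def tss(gene):
    """Transcription start site: left end for "+" genes, right end for "-" genes."""
    return gene.start if gene.strand == "+" else gene.end


def classify_pair(left_strand, right_strand):
    """Classify two adjacent genes given their strands in left-to-right order."""
    if left_strand == right_strand and left_strand in ("+", "-"):
        return "co-directional"
    if (left_strand, right_strand) == ("+", "-"):
        return "convergent"  # 3' ends face each other
    if (left_strand, right_strand) == ("-", "+"):
        return "divergent"   # 5' ends face each other
    raise ValueError("strand must be '+' or '-'")


def adjacent_pairs(genes, max_tss_distance: Optional[int] = None):
    """Return (left, right, orientation, TSS distance) for neighbouring genes."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        distance = abs(tss(right) - tss(left))
        if max_tss_distance is not None and distance > max_tss_distance:
            continue
        pairs.append(
            (left.name, right.name, classify_pair(left.strand, right.strand), distance)
        )
    return pairs
```

Varying `max_tss_distance` mimics the flexible parameter definition mentioned in the abstract.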
Genome sequence analysis of a flocculant-producing bacterium, Paenibacillus shenyangensis.

PubMed

Fu, Lili; Jiang, Binhui; Liu, Jinliang; Zhao, Xin; Liu, Qian; Hu, Xiaomin

2016-03-01

This study explores the metabolic processes of Paenibacillus shenyangensis, an efficient bioflocculant-producing bacterium. The biosynthesis mechanism of bioflocculation was used to enrich the genome of Paenibacillus shenyangensis and provide a basis for molecular genetics and functional genomics analyses. According to the analysis of de novo assembly, a total of 5,501,467 bp of clean reads were generated and assembled into 92 contigs. 4800 unigenes were predicted, of which 4393 were annotated with a specific gene function in the NCBI-Nr database. 3423 genes were found in the Clusters of Orthologous Groups database. Among the 168 Kyoto Encyclopedia of Genes and Genomes database, cell growth and metabolism were the main biological processes, and a potential metabolic pathway was predicted from glucose to exopolysaccharide within the starch and sucrose metabolism pathway. By using high-throughput sequencing technology, we provide a genome analysis of Paenibacillus shenyangensis that predicts the main metabolic processes and a potential pathway of exopolysaccharide biosynthesis.
Adaptive Mistranslation Accelerates the Evolution of Fluconazole Resistance and Induces Major Genomic and Gene Expression Alterations in Candida albicans

PubMed Central

Santamaría, Rodrigo; Lee, Wanseon; Rung, Johan; Tocci, Noemi; Abbey, Darren; Bezerra, Ana R.; Carreto, Laura; Moura, Gabriela R.; Bayés, Mónica; Gut, Ivo G.; Csikasz-Nagy, Attila; Cavalieri, Duccio; Berman, Judith

2017-01-01

ABSTRACT Regulated erroneous protein translation (adaptive mistranslation) increases proteome diversity and produces advantageous phenotypic variability in the human pathogen Candida albicans. It also increases fitness in the presence of fluconazole, but the underlying molecular mechanism is not understood. To address this question, we evolved hypermistranslating and wild-type strains in the absence and presence of fluconazole and compared their fluconazole tolerance and resistance trajectories during evolution. The data show that mistranslation increases tolerance and accelerates the acquisition of resistance to fluconazole. Genome sequencing, array-based comparative genome analysis, and gene expression profiling revealed that during the course of evolution in fluconazole, the range of mutational and gene deregulation differences was distinctively different and broader in the hypermistranslating strain, including multiple chromosome duplications, partial chromosome deletions, and polyploidy. Especially, the increased accumulation of loss-of-heterozygosity events, aneuploidy, translational and cell surface modifications, and differences in drug efflux seem to mediate more rapid drug resistance acquisition under mistranslation. Our observations support a pivotal role for adaptive mistranslation in the evolution of drug resistance in C. albicans. IMPORTANCE Infectious diseases caused by drug-resistant fungi are an increasing threat to public health because of the high mortality rates and high costs associated with treatment. Thus, understanding of the molecular mechanisms of drug resistance is of crucial interest for the medical community. Here we investigated the role of regulated protein mistranslation, a characteristic mechanism used by C. albicans to diversify its proteome, in the evolution of fluconazole resistance. 
Such codon ambiguity is usually considered highly deleterious, yet recent studies found that mistranslation can boost adaptation in stressful environments. Our data reveal that CUG ambiguity diversifies the genome in multiple ways and that the full spectrum of drug resistance mechanisms in C. albicans goes beyond the traditional pathways that either regulate drug efflux or alter the interactions of drugs with their targets. The present work opens new avenues to understand the molecular and genetic basis of microbial drug resistance. PMID:28808688
GrTEdb: the first web-based database of transposable elements in cotton (Gossypium raimondii).

PubMed

Xu, Zhenzhen; Liu, Jing; Ni, Wanchao; Peng, Zhen; Guo, Yue; Ye, Wuwei; Huang, Fang; Zhang, Xianggui; Xu, Peng; Guo, Qi; Shen, Xinlian; Du, Jianchang

2017-01-01

Although several diploid and tetraploid Gossypium species genomes have been sequenced, a well-annotated web-based transposable elements (TEs) database is lacking. To better understand the roles of TEs in structural, functional and evolutionary dynamics of the cotton genome, a comprehensive, specific, and user-friendly web-based database, Gossypium raimondii transposable elements database (GrTEdb), was constructed. A total of 14 332 TEs were structurally annotated and clearly categorized in the G. raimondii genome, and these elements have been classified into seven distinct superfamilies based on the order of protein-coding domains, structures and/or sequence similarity, including 2929 Copia-like elements, 10 368 Gypsy-like elements, 299 L1, 12 Mutators, 435 PIF-Harbingers, 275 CACTAs and 14 Helitrons. Meanwhile, web-based sequence browsing, searching, downloading and BLAST tools were implemented to help users easily and effectively annotate the TEs or TE fragments in genomic sequences from G. raimondii and other closely related Gossypium species. GrTEdb provides resources and information related to TEs in G. raimondii, and will facilitate gene and genome analyses within or across Gossypium species, evaluating the impact of TEs on their host genomes, and investigating the potential interaction between TEs and protein-coding genes in Gossypium species. http://www.grtedb.org/. © The Author(s) 2017. Published by Oxford University Press.
The Ensembl genome database project.

PubMed

Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

2002-01-01

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
RefSeq microbial genomes database: new representation and annotation strategy.

PubMed

Tatusova, Tatiana; Ciufo, Stacy; Fedorov, Boris; O'Neill, Kathleen; Tolstoy, Igor

2014-01-01

The source of the microbial genomic sequences in the RefSeq collection is the set of primary sequence records submitted to the International Nucleotide Sequence Database public archives. These can be accessed through the Entrez search and retrieval system at http://www.ncbi.nlm.nih.gov/genome. Next-generation sequencing has enabled researchers to perform genomic sequencing at rates that were unimaginable in the past. Microbial genomes can now be sequenced in a matter of hours, which has led to a significant increase in the number of assembled genomes deposited in the public archives. This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools. New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks.
The MAR databases: development and implementation of databases specific for marine metagenomics.

PubMed

Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P

2018-01-04

We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.

PubMed

Zeng, Victor; Extavour, Cassandra G

2012-01-01

The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. 
We anticipate that this database will be useful for members of multiple research communities, including developmental biology, physiology, evolutionary biology, ecology, comparative genomics and phylogenomics. Database URL: asgard.rc.fas.harvard.edu.
VitisExpDB: a database resource for grape functional genomics.

PubMed

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-02-28

The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of approximately 20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
VitisExpDB: A database resource for grape functional genomics

PubMed Central

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-01-01

Background The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of ~20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. Conclusion The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website. PMID:18307813
EDGAR: A software framework for the comparative analysis of prokaryotic genomes

PubMed Central

Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander

2009-01-01

Background The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, where the precomputed data sets can be browsed. PMID:19457249
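The abstract names BLAST score ratios as the basis for orthology calls but does not spell out the computation. A minimal sketch follows, assuming the commonly used definition (the bit score of a hit divided by the bit score of the query against itself) combined with reciprocal best hits. The function names, the nested-dictionary input layout, and the 0.3 cutoff are illustrative assumptions, not EDGAR's actual interface or threshold.

```python
def blast_score_ratio(hit_score: float, self_score: float) -> float:
    """Normalize a hit's bit score by the query's self-hit bit score."""
    if self_score <= 0:
        raise ValueError("self-hit score must be positive")
    return hit_score / self_score


def reciprocal_best_hits(scores_ab, scores_ba, self_a, self_b, cutoff=0.3):
    """Return (gene_a, gene_b) pairs that are each other's best hit.

    scores_ab[a][b] is the bit score of gene a (genome A) against gene b
    (genome B); scores_ba is the reverse search. self_a and self_b hold the
    self-hit bit scores. The cutoff value is an assumption for illustration.
    """
    def best_hits(scores, self_scores):
        best = {}
        for query, hits in scores.items():
            if not hits:
                continue
            subject, score = max(hits.items(), key=lambda kv: kv[1])
            if blast_score_ratio(score, self_scores[query]) >= cutoff:
                best[query] = subject
        return best

    best_ab = best_hits(scores_ab, self_a)
    best_ba = best_hits(scores_ba, self_b)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    scores_ab = {"a1": {"b1": 180.0, "b2": 40.0}, "a2": {"b2": 20.0}}
    scores_ba = {"b1": {"a1": 175.0}, "b2": {"a1": 40.0, "a2": 20.0}}
    self_a = {"a1": 200.0, "a2": 210.0}
    self_b = {"b1": 190.0, "b2": 205.0}
    print(reciprocal_best_hits(scores_ab, scores_ba, self_a, self_b))
```

Under this scheme, genes with a reciprocal partner in every genome would count toward the core genome, and genes with no partner anywhere would be singletons.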
Expanded national database collection and data coverage in the FINDbase worldwide database for clinically relevant genomic variation allele frequencies

PubMed Central

Viennas, Emmanouil; Komianou, Angeliki; Mizzi, Clint; Stojiljkovic, Maja; Mitropoulou, Christina; Muilu, Juha; Vihinen, Mauno; Grypioti, Panagiota; Papadaki, Styliani; Pavlidis, Cristiana; Zukic, Branka; Katsila, Theodora; van der Spek, Peter J.; Pavlovic, Sonja; Tzimas, Giannis; Patrinos, George P.

2017-01-01

FINDbase (http://www.findbase.org) is a comprehensive data repository that records the prevalence of clinically relevant genomic variants in various populations worldwide, such as pathogenic variants leading mostly to monogenic disorders and pharmacogenomics biomarkers. The database also records the incidence of rare genetic diseases in various populations, all in well-distinct data modules. Here, we report extensive data content updates in all data modules, with direct implications to clinical pharmacogenomics. Also, we report significant new developments in FINDbase, namely (i) the release of a new version of the ETHNOS software that catalyzes development and curation of national/ethnic genetic databases, (ii) the migration of all FINDbase data content into 90 distinct national/ethnic mutation databases, all built around Microsoft's PivotViewer (http://www.getpivot.com) software, (iii) new data visualization tools and (iv) the interrelation of FINDbase with DruGeVar database with direct implications in clinical pharmacogenomics. The abovementioned updates further enhance the impact of FINDbase, as a key resource for Genomic Medicine applications. PMID:27924022
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Church, Deanna M.; Lash, Alex E.; Leipe, Detlef D.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Tatusova, Tatiana A.; Wagner, Lukas; Rapp, Barbara A.

2001-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, GeneMap’99, Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP), SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. PMID:11125038
Performance of Candida ID, a New Chromogenic Medium for Presumptive Identification of Candida Species, in Comparison to CHROMagar Candida

PubMed Central

Willinger, Birgit; Hillowoth, Cornelia; Selitsch, Brigitte; Manafi, Mammad

2001-01-01

Candida ID agar allows identification of Candida albicans and differentiation of other Candida species. In comparison with CHROMagar Candida, we evaluated the performance of this medium directly from 596 clinical specimens. In particular, detection of C. albicans after 24 h of incubation was easier on Candida ID (sensitivity, 96.8%) than on CHROMagar (sensitivity, 49.6%). PMID:11574621

Inhibitory effect of alpha-mangostin on Candida biofilms.

PubMed

Kaomongkolgit, Ruchadaporn; Jamdee, Kusuma

2017-04-01

The objective of this study was to determine the inhibitory effect of alpha-mangostin on Candida biofilms. Candida species including Candida albicans, Candida krusei, Candida tropicalis, and Candida glabrata were tested. Candida biofilms were formed in flat-bottomed 96-well microtiter plates. The metabolic activity of cells within biofilms was quantified using the XTT assay. The results demonstrated that alpha-mangostin showed a significant anti-biofilm effect on both developing biofilms and preformed biofilms of Candida species. It may be concluded that alpha-mangostin could be an anti-biofilm agent against Candida species. Further in vivo investigations are needed to uncover the therapeutic values of this medicinal plant.
CAZymes Analysis Toolkit (CAT): Webservice for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karpinets, Tatiana V; Park, Byung; Syed, Mustafa H

2010-01-01

The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire non-redundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains (DUF) and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit (CAT), and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

PubMed

Chen, Qingyu; Zobel, Justin; Zhang, Xiuzhen; Verspoor, Karin

2016-01-01

First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All Data are available as described in the supplementary material.
CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database.

PubMed

Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C

2010-12-01

The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

PubMed

Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

2016-01-01

PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
GEMINI: a computationally-efficient search engine for large gene expression datasets.

PubMed

DeFreitas, Timothy; Saddiki, Hachem; Flaherty, Patrick

2016-02-24

Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number or meta-data tag. However, in this search paradigm the form of the query - a text-based string - is mismatched with the form of the target - a genomic profile. To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an O(log n) expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10^5 samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec. GEMINI is a fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. It enables users to identify similar profiles independent of sample label, data origin or other meta-data information.
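The vantage-point tree search named in this record can be sketched in a few lines. This is a generic textbook-style illustration, not GEMINI's implementation: the `VPTree` class, the Euclidean metric, and the random choice of vantage point are all assumptions. Each node stores a vantage point and the median distance `mu` to the remaining points; the triangle inequality then lets the search skip whichever subtree cannot contain a closer profile.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


class VPTree:
    """Hypothetical minimal vantage-point tree for nearest-neighbor search."""

    def __init__(self, points, dist=euclidean, rng=None):
        self.dist = dist
        self.root = self._build(list(points), rng or random.Random(0))

    def _build(self, points, rng):
        if not points:
            return None
        vp = points.pop(rng.randrange(len(points)))  # pick a vantage point
        if not points:
            return (vp, 0.0, None, None)
        dists = [self.dist(vp, p) for p in points]
        mu = sorted(dists)[len(dists) // 2]  # median distance splits the set
        inner = [p for p, d in zip(points, dists) if d < mu]
        outer = [p for p, d in zip(points, dists) if d >= mu]
        return (vp, mu, self._build(inner, rng), self._build(outer, rng))

    def nearest(self, query):
        """Return (closest_point, distance) for the query profile."""
        best = [None, math.inf]

        def search(node):
            if node is None:
                return
            vp, mu, inner, outer = node
            d = self.dist(query, vp)
            if d < best[1]:
                best[0], best[1] = vp, d
            if d < mu:
                search(inner)
                if d + best[1] >= mu:   # outer shell may still hold a closer point
                    search(outer)
            else:
                search(outer)
                if d - best[1] < mu:    # inner ball may still hold a closer point
                    search(inner)

        search(self.root)
        return best[0], best[1]


if __name__ == "__main__":
    rng = random.Random(1)
    profiles = [tuple(rng.random() for _ in range(5)) for _ in range(1000)]
    tree = VPTree(profiles)
    print(tree.nearest((0.5, 0.5, 0.5, 0.5, 0.5)))
```

When the data allow most subtrees to be pruned, a query touches roughly one root-to-leaf path, which is where logarithmic scaling comes from; in high dimensions the pruning tests succeed less often and the search degrades toward a full scan.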
GenColors: annotation and comparative genomics of prokaryotes made easy.

PubMed

Romualdi, Alessandro; Felder, Marius; Rose, Dominic; Gausmann, Ulrike; Schilhabel, Markus; Glöckner, Gernot; Platzer, Matthias; Sühnel, Jürgen

2007-01-01

GenColors (gencolors.fli-leibniz.de) is a new web-based software/database system aimed at an improved and accelerated annotation of prokaryotic genomes considering information on related genomes and making extensive use of genome comparison. It offers a seamless integration of data from ongoing sequencing projects and annotated genomic sequences obtained from GenBank. A variety of export/import filters manages an effective data flow from sequence assembly and manipulation programs (e.g., GAP4) to GenColors and back as well as to standard GenBank file(s). The genome comparison tools include best bidirectional hits, gene conservation, syntenies, and gene core sets. Precomputed UniProt matches allow annotation and analysis in an effective manner. In addition to these analysis options, base-specific quality data (coverage and confidence) can also be handled if available. The GenColors system can be used both for annotation purposes in ongoing genome projects and as an analysis tool for finished genomes. GenColors comes in two types, as dedicated genome browsers and as the Jena Prokaryotic Genome Viewer (JPGV). Dedicated genome browsers contain genomic information on a set of related genomes and offer a large number of options for genome comparison. The system has been efficiently used in the genomic sequencing of Borrelia garinii and is currently applied to various ongoing genome projects on Borrelia, Legionella, Escherichia, and Pseudomonas genomes. One of these dedicated browsers, the Spirochetes Genome Browser (sgb.fli-leibniz.de) with Borrelia, Leptospira, and Treponema genomes, is freely accessible. The others will be released after finalization of the corresponding genome projects. JPGV (jpgv.fli-leibniz.de) offers information on almost all finished bacterial genomes, although with reduced genome comparison functionality compared to the dedicated browsers. 
As of January 2006, this viewer includes 632 genomic elements (e.g., chromosomes and plasmids) of 293 species. The system provides versatile quick and advanced search options for all currently known prokaryotic genomes and generates circular and linear genome plots. Gene information sheets contain basic gene information, database search options, and links to external databases. GenColors is also available on request for local installation.
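The "best bidirectional hits" comparison named in the GenColors abstract can be illustrated with a minimal sketch. It assumes pairwise similarity scores have already been computed in both directions; the function names, gene identifiers and scores below are hypothetical and are not taken from GenColors itself.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]  # (query gene, subject gene) -> similarity score


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query gene to its single highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)


# Toy scores for two hypothetical genomes A and B.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 120.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a2"): 175.0,
          ("b2", "a3"): 110.0, ("b3", "a3"): 60.0}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here a3 is dropped because its best hit, b2, prefers a2; only reciprocal pairs survive.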
A DATABASE FOR TRACKING TOXICOGENOMIC SAMPLES AND PROCEDURES WITH GENOMIC, PROTEOMIC AND METABONOMIC COMPONENTS

EPA Science Inventory

A Database for Tracking Toxicogenomic Samples and Procedures with Genomic, Proteomic and Metabonomic Components
Wenjun Bao1, Jennifer Fostel2, Michael D. Waters2, B. Alex Merrick2, Drew Ekman3, Mitchell Kostich4, Judith Schmid1, David Dix1
Office of Research and Developmen...
Fast neutron mutants database and web displays at SoyBase

USDA-ARS's Scientific Manuscript database

SoyBase, the USDA-ARS soybean genetics and genomics database, has been expanded to include data for the fast neutron mutants produced by Bolon, Vance, et al. In addition to the expected text and sequence homology searches and visualization of the indels in the context of the genome sequence viewer, ...
PineElm_SSRdb: a microsatellite marker database identified from genomic, chloroplast, mitochondrial and EST sequences of pineapple (Ananas comosus (L.) Merrill).

PubMed

Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan

2016-01-01

Simple Sequence Repeats or microsatellites are resourceful molecular genetic markers. There are only few reports of SSR identification and development in pineapple. Complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple which will help in deciphering genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from genome sequence, 45 from chloroplast sequence, 249 in mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purpose at http://app.bioelm.com/ with a mapping tool which can develop circular maps of selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
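SSR identification of the kind described in the PineElm_SSRdb abstract can be sketched with a regular expression. This is only an illustration of finding perfect tandem repeats; the abstract does not state which tool or thresholds the authors used, so the function name, the motif length limit and the minimum repeat count below are assumptions.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, min_repeats: int = 4, max_motif: int = 6) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The motif is matched lazily so the shortest repeating unit is reported
    (e.g. 'GA' rather than 'GAGA'). Thresholds are illustrative assumptions.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence with a (GA)5 and an (AT)4 repeat.
ssrs = find_ssrs("TTGAGAGAGAGACCATATATATGG")
print(ssrs)
```

The per-source counts quoted in the abstract are internally consistent: 356385 + 45 + 249 + 2832 = 359511.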
Biological Databases for Human Research

PubMed Central

Zou, Dong; Ma, Lina; Yu, Jun; Zhang, Zhang

2015-01-01

The completion of the Human Genome Project lays a foundation for systematically studying the human genome from evolutionary history to precision medicine against diseases. With the explosive growth of biological data, there is an increasing number of biological databases that have been developed in aid of human-related research. Here we present a collection of human-related biological databases and provide a mini-review by classifying them into different categories according to their data types. As human-related databases continue to grow not only in count but also in volume, challenges are ahead in big data storage, processing, exchange and curation. PMID:25712261
Diversities of interaction of murine macrophages with three strains of Candida albicans represented by MyD88, CARD9 gene expressions and ROS, IL-10 and TNF-α secretion

PubMed Central

Zhang, Xiaohuan; Ge, Yanping; Li, Wenqing; Hu, Yan

2014-01-01

Aim: To explore the mechanisms underlying the different responses of macrophages to distinct Candida albicans strains. Methods: Bone marrow was collected from mice. Macrophages were independently incubated with 3 Candida albicans strains. Results: MyD88 expression in Candida albicans 3683 group was significantly higher than that in Candida albicans 3630 group and Candida albicans SC5314 group, and a marked difference was also observed between the latter two groups (P<0.05). CARD9 expression in Candida albicans 3630 group was higher than that in Candida albicans 3683 group and Candida albicans SC5314 group. Fluorescence intensity was 46.78±0.79 in Candida albicans 3630 group, 32.60±1.31 in Candida albicans 3683 group and 19.40±0.58 in Candida albicans SC5314 group, and a significant difference was observed between any two groups (P<0.05). TNF-α and IL-10 were 18.9843±0.7081 pg/ml and 11.6690±0.3167 pg/ml, respectively, in Candida albicans 3683 group, which were markedly higher than those in Candida albicans 3630 group and Candida albicans SC5314 group (P<0.05 and 0.01). Conclusion: Different Candida albicans strains may induce CARD9 expression and alter the production of ROS, TNF-α and IL-10 in macrophages, which may be one of the mechanisms underlying the different killing effects of macrophages on distinct Candida albicans strains. PMID:25664026
Ortholog Identification and Comparative Analysis of Microbial Genomes Using MBGD and RECOG.

PubMed

Uchiyama, Ikuo

2017-01-01

Comparative genomics is becoming an essential approach for identification of genes associated with a specific function or phenotype. Here, we introduce the microbial genome database for comparative analysis (MBGD), which is a comprehensive ortholog database among the microbial genomes available so far. MBGD contains several precomputed ortholog tables including the standard ortholog table covering the entire taxonomic range and taxon-specific ortholog tables for various major taxa. In addition, MBGD allows the users to create an ortholog table within any specified set of genomes through dynamic calculations. In particular, MBGD has a "My MBGD" mode where users can upload their original genome sequences and incorporate them into orthology analysis. The created ortholog table can serve as the basis for various comparative analyses. Here, we describe the use of MBGD and briefly explain how to utilize the orthology information during comparative genome analysis in combination with the stand-alone comparative genomics software RECOG, focusing on the application to comparison of closely related microbial genomes.
Rapid identification of drug resistant Candida species causing recurrent vulvovaginal candidiasis.

PubMed

Diba, Kambiz; Namaki, Atefeh; Ayatolahi, Haleh; Hanifian, Haleh

2012-01-01

Some yeast agents including Candida albicans, Candida tropicalis and Candida glabrata have a role in recurrent vulvovaginal candidiasis. We studied the frequency of both common and recurrent vulvovaginal candidiasis in symptomatic cases which were referred to Urmia Medical Sciences University related gynecology clinics using morphologic and molecular methods. The aim of this study was the identification of Candida species isolated from recurrent vulvovaginal candidiasis cases using a rapid and reliable molecular method. Vaginal swabs obtained from each case were cultured on differential media including cornmeal agar and CHROMagar Candida. After 48 hours at 37℃, the cultures were studied for growth characteristics and color production respectively. All isolates were identified using the molecular method of PCR-restriction fragment length polymorphism. Among all clinical specimens, we detected 19 (16%) non-fungal agents, 87 (82.1%) yeasts and 2 (1.9%) multiple infections. The yeast isolates identified morphologically included Candida albicans (n = 62), Candida glabrata (n = 9), Candida tropicalis (n = 8), Candida parapsilosis (n = 8) and Candida guilliermondii and Candida krusei (n = 1 each). We also obtained very similar results for Candida albicans, Candida glabrata and Candida tropicalis as the most common clinical isolates, by using PCR-restriction fragment length polymorphism. Use of two differential methods, morphologic and molecular, enabled us to identify most medically important Candida species which particularly cause recurrent vulvovaginal candidiasis.
RatMap--rat genome tools and data.

PubMed

Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M; Ståhl, Fredrik

2005-01-01

The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB-Genetics at Goteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided.
RatMap—rat genome tools and data

PubMed Central

Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M.; Ståhl, Fredrik

2005-01-01

The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB–Genetics at Göteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided. PMID:15608244
The Pathway Tools software.

PubMed

Karp, Peter D; Paley, Suzanne; Romero, Pedro

2002-01-01

Bioinformatics requires reusable software tools for creating model-organism databases (MODs). The Pathway Tools is a reusable, production-quality software environment for creating a type of MOD called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc (see http://ecocyc.org) integrates our evolving understanding of the genes, proteins, metabolic network, and genetic network of an organism. This paper provides an overview of the four main components of the Pathway Tools: The PathoLogic component supports creation of new PGDBs from the annotated genome of an organism. The Pathway/Genome Navigator provides query, visualization, and Web-publishing services for PGDBs. The Pathway/Genome Editors support interactive updating of PGDBs. The Pathway Tools ontology defines the schema of PGDBs. The Pathway Tools makes use of the Ocelot object database system for data management services for PGDBs. The Pathway Tools has been used to build PGDBs for 13 organisms within SRI and by external users.
UCbase 2.0: ultraconserved sequences database (2014 update)

PubMed Central

Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

2014-01-01

UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it PMID:24951797
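The UCR definition quoted in the UCbase 2.0 abstract (segments longer than 200 bases with 100% identity across human, mouse and rat) can be sketched as a scan over pre-aligned sequences. This is a minimal illustration, not the procedure used by UCbase: it assumes the three sequences are already aligned to equal length, and the toy example uses a threshold of 6 instead of 200 so it stays readable.

```python
from typing import List, Tuple


def conserved_runs(seqs: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals, longer than min_len columns,
    in which every aligned sequence carries the same non-gap base."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    runs: List[Tuple[int, int]] = []
    start = None
    for i in range(length + 1):  # one extra step flushes a run that reaches the end
        same = (i < length
                and seqs[0][i] != "-"
                and len({s[i] for s in seqs}) == 1)
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start > min_len:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment (hypothetical): the mouse sequence differs at column 7.
human = "ACGTACGTTTGACC"
mouse = "ACGTACGATTGACC"
rat = "ACGTACGTTTGACC"

runs = conserved_runs([human, mouse, rat], min_len=6)
print(runs)
```

With `min_len=200` the same function would report only runs that meet the length criterion stated in the abstract.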
PGMapper: a web-based tool linking phenotype to genes.

PubMed

Xiong, Qing; Qiu, Yuhui; Gu, Weikuan

2008-04-01

With the availability of whole genome sequence in many species, linkage analysis, positional cloning and microarray are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time consuming and needs to retrieve and integrate the information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. Available online at http://www.genediscovery.org/pgmapper/index.jsp.
GWFASTA: server for FASTA search in eukaryotic and microbial genomes.

PubMed

Issac, Biju; Raghava, G P S

2002-09-01

Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server that was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and proteomes of annotated genomes. In fact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequences alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful tool for biologists.

Description of Diutina gen. nov., Diutina siamensis, f.a. sp. nov., and reassignment of Candida catenulata, Candida mesorugosa, Candida neorugosa, Candida pseudorugosa, Candida ranongensis, Candida rugosa and Candida scorzettiae to the genus Diutina.

PubMed

Khunnamwong, Pannida; Lertwattanasakul, Noppon; Jindamorakot, Sasitorn; Limtong, Savitree; Lachance, Marc-André

2015-12-01

Three strains (DMKU-RE28, DMKU-RE43T and DMKU-RE123) of a novel anamorphic yeast species were isolated from rice leaf tissue collected in Thailand. DNA sequence analysis demonstrated that the species forms a sister pair with Candida ranongensis CBS 10861T but differs by 24-30 substitutions in the LSU rRNA gene D1/D2 domains and 30-35 substitutions in the ITS region. A phylogenetic analysis based on both the small and the large rRNA gene subunits confirmed this connection and demonstrated the presence of a clade that also includes Candida catenulata, Candida mesorugosa, Candida neorugosa, Candida pseudorugosa, Candida rugosa and Candida scorzettiae. The clade is not closely affiliated to any known teleomorphic genus, and forms a well-separated lineage from currently recognized genera of the Saccharomycetales. Hence, the genus Diutina gen. nov. is proposed to accommodate members of the clade, including Diutina siamensis f.a. sp. nov. and the preceding seven Candida species. The type strain is DMKU-RE43T ( = CBS 13388T = BCC 61183T = NBRC 109695T).
CROPPER: a metagene creator resource for cross-platform and cross-species compendium studies.

PubMed

Paananen, Jussi; Storvik, Markus; Wong, Garry

2006-09-22

Current genomic research methods provide researchers with enormous amounts of data. Combining data from different high-throughput research technologies commonly available in biological databases can lead to novel findings and increase research efficiency. However, combining data from different heterogeneous sources is often a very arduous task. These sources can be different microarray technology platforms, genomic databases, or experiments performed on various species. Our aim was to develop a software program that could facilitate the combining of data from heterogeneous sources, and thus allow researchers to perform genomic cross-platform/cross-species studies and to use existing experimental data for compendium studies. We have developed a web-based software resource, called CROPPER that uses the latest genomic information concerning different data identifiers and orthologous genes from the Ensembl database. CROPPER can be used to combine genomic data from different heterogeneous sources, allowing researchers to perform cross-platform/cross-species compendium studies without the need for complex computational tools or the requirement of setting up one's own in-house database. We also present an example of a simple cross-platform/cross-species compendium study based on publicly available Parkinson's disease data derived from different sources. CROPPER is a user-friendly and freely available web-based software resource that can be successfully used for cross-species/cross-platform compendium studies.
EuPathDB: the eukaryotic pathogen genomics database resource

PubMed Central

Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

2017-01-01

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
GenoQuery: a new querying module for functional annotation in a genomic warehouse

PubMed Central

Lemoine, Frédéric; Labedan, Bernard; Froidevaux, Christine

2008-01-01

Motivation: We have to cope with both a deluge of new genome sequences and a huge amount of data produced by high-throughput approaches used to exploit these genomic features. Crossing and comparing such heterogeneous and disparate data will help improving functional annotation of genomes. This requires designing elaborate integration systems such as warehouses for storing and querying these data. Results: We have designed a relational genomic warehouse with an original multi-layer architecture made of a databases layer and an entities layer. We describe a new querying module, GenoQuery, which is based on this architecture. We use the entities layer to define mixed queries. These mixed queries allow searching for instances of biological entities and their properties in the different databases, without specifying in which database they should be found. Accordingly, we further introduce the central notion of alternative queries. Such queries have the same meaning as the original mixed queries, while exploiting complementarities yielded by the various integrated databases of the warehouse. We explain how GenoQuery computes all the alternative queries of a given mixed query. We illustrate how useful this querying module is by means of a thorough example. Availability: http://www.lri.fr/~lemoine/GenoQuery/ Contact: chris@lri.fr, lemoine@lri.fr PMID:18586731
MBGD update 2013: the microbial genome database for exploring the diversity of microbial world.

PubMed

Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

2013-01-01

The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic study. Reflecting the huge diversity of microbial world, the number of microbial genome projects now becomes several thousands. To efficiently explore the diversity of the entire microbial genomic data, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For some closely related taxa, MBGD also provides the conserved synteny information (core genome alignment) pre-calculated using the CoreAligner program. In addition, efficient incremental updating procedure can create extended ortholog table by adding additional genomes to the default ortholog table generated from the representative set of genomes. Combining with the functionalities of the dynamic orthology calculation of any specified set of organisms, MBGD is an efficient and flexible tool for exploring the microbial genome diversity.
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

PubMed Central

Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.

2014-01-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

PubMed

Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

2014-07-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.
Genomics and Public Health Research: Can the State Allow Access to Genomic Databases?

PubMed Central

Cousineau, J; Girard, N; Monardes, C; Leroux, T; Jean, M Stanton

2012-01-01

Because many diseases are multifactorial disorders, the scientific progress in genomics and genetics should be taken into consideration in public health research. In this context, genomic databases will constitute an important source of information. Consequently, it is important to identify and characterize the State’s role and authority on matters related to public health, in order to verify whether it has access to such databases while engaging in public health genomic research. We first consider the evolution of the concept of public health, as well as its core functions, using a comparative approach (e.g. WHO, PAHO, CDC and the Canadian province of Quebec). Following an analysis of relevant Quebec legislation, the precautionary principle is examined as a possible avenue to justify State access to and use of genomic databases for research purposes. Finally, we consider the Influenza pandemic plans developed by WHO, Canada, and Quebec, as examples of key tools framing public health decision-making process. We observed that State powers in public health, are not, in Quebec, well adapted to the expansion of genomics research. We propose that the scope of the concept of research in public health should be clear and include the following characteristics: a commitment to the health and well-being of the population and to their determinants; the inclusion of both applied research and basic research; and, an appropriate model of governance (authorization, follow-up, consent, etc.). We also suggest that the strategic approach version of the precautionary principle could guide collective choices in these matters. PMID:23113174
LDSplitDB: a database for studies of meiotic recombination hotspots in MHC using human genomic data.

PubMed

Guo, Jing; Chen, Hao; Yang, Peng; Lee, Yew Ti; Wu, Min; Przytycka, Teresa M; Kwoh, Chee Keong; Zheng, Jie

2018-04-20

Meiotic recombination happens during the process of meiosis when chromosomes inherited from two parents exchange genetic materials to generate chromosomes in the gamete cells. The recombination events tend to occur in narrow genomic regions called recombination hotspots. Its dysregulation could lead to serious human diseases such as birth defects. Although the regulatory mechanism of recombination events is still unclear, DNA sequence polymorphisms have been found to play crucial roles in the regulation of recombination hotspots. To facilitate the studies of the underlying mechanism, we developed a database named LDSplitDB which provides an integrative and interactive data mining and visualization platform for the genome-wide association studies of recombination hotspots. It contains the pre-computed association maps of the major histocompatibility complex (MHC) region in the 1000 Genomes Project and the HapMap Phase III datasets, and a genome-scale study of the European population from the HapMap Phase II dataset. Besides the recombination profiles, related data of genes, SNPs and different types of epigenetic modifications, which could be associated with meiotic recombination, are provided for comprehensive analysis. To meet the computational requirement of the rapidly increasing population genomics data, we prepared a lookup table of 400 haplotypes for recombination rate estimation using the well-known LDhat algorithm which includes all possible two-locus haplotype configurations. To the best of our knowledge, LDSplitDB is the first large-scale database for the association analysis of human recombination hotspots with DNA sequence polymorphisms. It provides valuable resources for the discovery of the mechanism of meiotic recombination hotspots. The information about MHC in this database could help understand the roles of recombination in human immune system. DATABASE URL: http://histone.scse.ntu.edu.sg/LDSplitDB.
GTRAC: fast retrieval from compressed collections of genomic variants

PubMed Central

Tatwawadi, Kedar; Hernaez, Mikel; Ochoa, Idoia; Weissman, Tsachy

2016-01-01

Motivation: The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes ranging in the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most of the medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compression of this type of databases lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether. Results: We present a new algorithm for this task, called GTRAC, that achieves significant compression ratios while allowing fast random access over the compressed database. For example, GTRAC is able to compress a Homo sapiens dataset containing 1092 samples in 1.1 GB (compression ratio of 160), while allowing for decompression of specific samples in less than a second and decompression of specific variants in 17 ms. GTRAC uses and adapts techniques from information theory, such as a specialized Lempel-Ziv compressor, and tailored succinct data structures. Availability and Implementation: The GTRAC algorithm is available for download at: https://github.com/kedartatwawadi/GTRAC Contact: kedart@stanford.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27587665
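The idea of storing genomes as variants relative to a reference, with retrieval by sample or by variant, can be illustrated with a small sketch. This is not the GTRAC algorithm (which the abstract describes as using a specialized Lempel-Ziv compressor and succinct data structures); it is a plain sparse representation, and the class name, method names and toy genotypes are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List


class VariantStore:
    """Store each sample as the sorted indices of the variants it carries."""

    def __init__(self, n_variants: int) -> None:
        self.n_variants = n_variants
        self.rows: Dict[str, List[int]] = {}

    def add_sample(self, name: str, genotype: List[int]) -> None:
        if len(genotype) != self.n_variants:
            raise ValueError("genotype length must equal n_variants")
        # Keep only positions where the sample differs from the reference.
        self.rows[name] = [i for i, g in enumerate(genotype) if g]

    def sample(self, name: str) -> List[int]:
        """Rebuild the full 0/1 genotype vector of one sample."""
        present = set(self.rows[name])
        return [1 if i in present else 0 for i in range(self.n_variants)]

    def variant(self, index: int) -> Dict[str, int]:
        """Report, per sample, whether one variant is present (binary search)."""
        out: Dict[str, int] = {}
        for name, idx in self.rows.items():
            pos = bisect_left(idx, index)
            out[name] = int(pos < len(idx) and idx[pos] == index)
        return out


store = VariantStore(n_variants=6)
store.add_sample("s1", [0, 1, 0, 0, 1, 0])
store.add_sample("s2", [0, 1, 1, 0, 0, 0])
print(store.sample("s1"), store.variant(4))

# Size implied by the figures in the abstract (an inference, not a reported value):
implied_uncompressed_gb = 1.1 * 160
print(round(implied_uncompressed_gb, 1))
```

If the quoted compression ratio is read as uncompressed size divided by compressed size, 1.1 GB at a ratio of 160 would correspond to roughly 176 GB of uncompressed data.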
GTRAC: fast retrieval from compressed collections of genomic variants.

PubMed

Tatwawadi, Kedar; Hernaez, Mikel; Ochoa, Idoia; Weissman, Tsachy

2016-09-01

The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes ranging in the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most of the medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compression of this type of databases lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether. We present a new algorithm for this task, called GTRAC, that achieves significant compression ratios while allowing fast random access over the compressed database. For example, GTRAC is able to compress a Homo sapiens dataset containing 1092 samples in 1.1 GB (compression ratio of 160), while allowing for decompression of specific samples in less than a second and decompression of specific variants in 17 ms. GTRAC uses and adapts techniques from information theory, such as a specialized Lempel-Ziv compressor, and tailored succinct data structures. The GTRAC algorithm is available for download at: https://github.com/kedartatwawadi/GTRAC Contact: kedart@stanford.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
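The idea of storing genomes as variants and still answering per-sample and per-variant queries without full decompression can be illustrated with a toy sketch. This is not the GTRAC algorithm: it is a much simpler stand-in that stores each sample as a set of differences against an earlier, similar sample (loosely in the spirit of Lempel-Ziv back-references). All function names are hypothetical.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

# An encoded row is (parent row index or None, variant indices that differ
# from the parent). With no parent, the diff is taken against the reference,
# i.e. a row with no variants at all.
EncodedRow = Tuple[Optional[int], FrozenSet[int]]


def encode(rows: List[Set[int]]) -> List[EncodedRow]:
    """Encode each sample as the smallest diff against the reference or an earlier sample."""
    encoded: List[EncodedRow] = []
    for i, row in enumerate(rows):
        best_parent: Optional[int] = None
        best_diff = frozenset(row)          # literal encoding against the reference
        for j in range(i):
            diff = frozenset(row ^ rows[j])  # symmetric difference with an earlier sample
            if len(diff) < len(best_diff):
                best_parent, best_diff = j, diff
        encoded.append((best_parent, best_diff))
    return encoded


def decode_row(encoded: List[EncodedRow], i: int) -> Set[int]:
    """Recover one sample by following its chain of parents only."""
    parent, diff = encoded[i]
    base = decode_row(encoded, parent) if parent is not None else set()
    return base ^ diff


def has_variant(encoded: List[EncodedRow], i: Optional[int], variant: int) -> bool:
    """Answer a single (sample, variant) query without materialising the sample."""
    present = False
    while i is not None:
        parent, diff = encoded[i]
        if variant in diff:
            present = not present           # each diff toggles membership
        i = parent
    return present


samples = [{1, 5, 9}, {1, 5, 9, 12}, {2}, {1, 5, 12}]
enc = encode(samples)
print(enc[3])                   # (1, frozenset({9}))
print(decode_row(enc, 3))       # {1, 5, 12}
print(has_variant(enc, 3, 12))  # True
```

In this toy, a query touches only the rows on one parent chain, which is the property that makes random access cheap; a real system would also bound chain length and pack the diffs into succinct structures.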
TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

PubMed

Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

2018-04-11

Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.
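The kind of tunable search the abstract describes (regions of bounded length that contain enough sufficiently common SNPs) can be sketched as a simple windowed scan. This is an illustration only, not the published TIA implementation. The function name, parameters, and thresholds are hypothetical, and the flanking-region constraints mentioned above are not modelled.

```python
from typing import List, Tuple


def find_snp_islands(
    snps: List[Tuple[int, float]],
    max_len: int = 150,
    min_snps: int = 3,
    min_maf: float = 0.2,
) -> List[Tuple[int, int, int]]:
    """Return non-overlapping (start, end, snp_count) islands.

    snps    -- (position, minor allele frequency) pairs on one chromosome
    max_len -- maximum island length in bases (assumed tunable)
    min_snps, min_maf -- assumed user-tunable filters
    """
    # Keep only SNPs that are common enough to be informative.
    kept = sorted(pos for pos, maf in snps if maf >= min_maf)
    islands = []
    i = 0
    while i < len(kept):
        j = i
        # Extend the window while the next SNP still fits within max_len.
        while j + 1 < len(kept) and kept[j + 1] - kept[i] < max_len:
            j += 1
        count = j - i + 1
        if count >= min_snps:
            islands.append((kept[i], kept[j], count))
            i = j + 1   # greedy: continue after this island
        else:
            i += 1
    return islands


example = [(100, 0.4), (130, 0.3), (180, 0.45), (190, 0.05),
           (1000, 0.5), (1040, 0.35), (5000, 0.4)]
print(find_snp_islands(example, max_len=100))  # [(100, 180, 3)]
```

Loosening `min_snps` or `min_maf` yields more candidate regions, and tightening `max_len` keeps amplicons short enough for a given sequencing platform.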
The Saccharomyces Genome Database Variant Viewer.

PubMed

Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael

2016-01-04

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Building a genome database using an object-oriented approach.

PubMed

Barbasiewicz, Anna; Liu, Lin; Lang, B Franz; Burger, Gertraud

2002-01-01

GOBASE is a relational database that integrates data associated with mitochondria and chloroplasts. The most important data in GOBASE, i.e., molecular sequences and taxonomic information, are obtained from the public sequence data repository at the National Center for Biotechnology Information (NCBI), and are validated by our experts. Maintaining a curated genomic database comes with a towering labor cost, due to the sheer volume of available genomic sequences and the plethora of annotation errors and omissions in records retrieved from public repositories. Here we describe our approach to increase automation of the database population process, thereby reducing manual intervention. As a first step, we used Unified Modeling Language (UML) to construct a list of potential errors. Each case was evaluated independently, and an expert solution was devised, and represented as a diagram. Subsequently, the UML diagrams were used as templates for writing object-oriented automation programs in the Java programming language.
NCBI-compliant genome submissions: tips and tricks to save time and money.

PubMed

Pirovano, Walter; Boetzer, Marten; Derks, Martijn F L; Smit, Sandra

2017-03-01

Genome sequences nowadays play a central role in molecular biology and bioinformatics. These sequences are shared with the scientific community through sequence databases. The sequence repositories of the International Nucleotide Sequence Database Collaboration (INSDC, comprising GenBank, ENA and DDBJ) are the largest in the world. Preparing an annotated sequence in such a way that it will be accepted by the database is challenging because many validation criteria apply. In our opinion, it is an undesirable situation that researchers who want to submit their sequence need either a lot of experience or help from partners to get the job done. To save valuable time and money, we list a number of recommendations for people who want to submit an annotated genome to a sequence database, as well as for tool developers, who could help to ease the process. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes

PubMed Central

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-01-01

Motivation: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith–Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid™, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. Availability: The database can be accessed through http://proteinworlddb.org Contact: otto@fiocruz.br PMID:20089515
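For readers unfamiliar with the Smith–Waterman algorithm named in the abstract, the core dynamic-programming recurrence can be sketched in a few lines. This minimal version assumes simple match/mismatch scores and a linear gap penalty. Production protein alignment, presumably including the project's own implementation, would normally use a substitution matrix and affine gap penalties, which are not shown here.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)   # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("ACGT", "ACGT"))          # 8
print(smith_waterman_score("XXACGTXX", "YYACGTYY"))  # 8
print(smith_waterman_score("AAAA", "TTTT"))          # 0
```

The cost is proportional to the product of the two sequence lengths for every pair, which is why an exhaustive all-against-all comparison of millions of sequences calls for grid-scale computing rather than the usual heuristics.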
NGSmethDB 2017: enhanced methylomes and differential methylation

PubMed Central

Lebrón, Ricardo; Gómez-Martín, Cristina; Carpena, Pedro; Bernaola-Galván, Pedro; Barturen, Guillermo; Hackenberg, Michael; Oliver, José L.

2017-01-01

The 2017 update of NGSmethDB stores whole genome methylomes generated from short-read data sets obtained by bisulfite sequencing (WGBS) technology. To generate high-quality methylomes, stringent quality controls were integrated with third-party software, adding also a two-step mapping process to exploit the advantages of the new genome assembly models. The samples were all profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data-types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single-cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating sample comparative analyses. Besides conventional database dumps, track hubs were implemented, which improved database access, visualization in genome browsers and comparative analyses to third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user's desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB. PMID:27794041
novPTMenzy: a database for enzymes involved in novel post-translational modifications

PubMed Central

Khater, Shradha; Mohanty, Debasisa

2015-01-01

With the recent discoveries of novel post-translational modifications (PTMs) which play important roles in signaling and biosynthetic pathways, identification of such PTM catalyzing enzymes by genome mining has been an area of major interest. Unlike well-known PTMs like phosphorylation, glycosylation, SUMOylation, no bioinformatics resources are available for enzymes associated with novel and unusual PTMs. Therefore, we have developed the novPTMenzy database which catalogs information on the sequence, structure, active site and genomic neighborhood of experimentally characterized enzymes involved in five novel PTMs, namely AMPylation, Eliminylation, Sulfation, Hydroxylation and Deamidation. Based on a comprehensive analysis of the sequence and structural features of these known PTM catalyzing enzymes, we have created Hidden Markov Model profiles for the identification of similar PTM catalyzing enzymatic domains in genomic sequences. We have also created predictive rules for grouping them into functional subfamilies and deciphering their mechanistic details by structure-based analysis of their active site pockets. These analytical modules have been made available as user friendly search interfaces of novPTMenzy database. It also has a specialized analysis interface for some PTMs like AMPylation and Eliminylation. The novPTMenzy database is a unique resource that can aid in discovery of unusual PTM catalyzing enzymes in newly sequenced genomes. Database URL: http://www.nii.ac.in/novptmenzy.html PMID:25931459
SolEST database: a "one-stop shop" approach to the study of Solanaceae transcriptomes.

PubMed

D'Agostino, Nunzio; Traini, Alessandra; Frusciante, Luigi; Chiusano, Maria Luisa

2009-11-30

Since no genome sequences of solanaceous plants have yet been completed, expressed sequence tag (EST) collections represent a reliable tool for broad sampling of Solanaceae transcriptomes, an attractive route for understanding Solanaceae genome functionality and a powerful reference for the structural annotation of emerging Solanaceae genome sequences. We describe the SolEST database http://biosrv.cab.unina.it/solestdb which integrates different EST datasets from both cultivated and wild Solanaceae species and from two species of the genus Coffea. Background as well as processed data contained in the database, extensively linked to external related resources, represent an invaluable source of information for these plant families. Two novel features differentiate SolEST from other resources: i) the option of accessing and then visualizing Solanaceae EST/TC alignments along the emerging tomato and potato genome sequences; ii) the opportunity to compare different Solanaceae assemblies generated by diverse research groups in the attempt to address a common complaint in the SOL community. Different databases have been established worldwide for collecting Solanaceae ESTs and are related in concept, content and utility to the one presented herein. However, the SolEST database has several distinguishing features that make it appealing for the research community and facilitates a "one-stop shop" for the study of Solanaceae transcriptomes.
MIPS: curated databases and comprehensive secondary data resources in 2010.

PubMed

Mewes, H Werner; Ruepp, Andreas; Theis, Fabian; Rattei, Thomas; Walter, Mathias; Frishman, Dmitrij; Suhre, Karsten; Spannagl, Manuel; Mayer, Klaus F X; Stümpflen, Volker; Antonov, Alexey

2011-01-01

The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38,000,000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).

MIPS: curated databases and comprehensive secondary data resources in 2010

PubMed Central

Mewes, H. Werner; Ruepp, Andreas; Theis, Fabian; Rattei, Thomas; Walter, Mathias; Frishman, Dmitrij; Suhre, Karsten; Spannagl, Manuel; Mayer, Klaus F.X.; Stümpflen, Volker; Antonov, Alexey

2011-01-01

The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38 000 000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de). PMID:21109531
GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes.

PubMed

Catanho, Marcos; Mascarenhas, Daniel; Degrave, Wim; Miranda, Antonio Basílio de

2006-03-31

Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
The BIG Data Center: from deposition to integration to translation.

PubMed

2017-01-04

Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Using the Saccharomyces Genome Database (SGD) for analysis of genomic information

PubMed Central

Skrzypek, Marek S.; Hirschman, Jodi

2011-01-01

Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739
DEFINING THE CHEMICAL SPACE OF PUBLIC GENOMIC DATA (S)

EPA Science Inventory

The current project aims to chemically index the genomics content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information. By defining the chemical space of public genomic data, it is possibl...
Lindnera (Pichia) fabianii blood infection after mesenteric ischemia.

PubMed

Gabriel, Frederic; Noel, Thierry; Accoceberry, Isabelle

2012-04-01

Lindnera (Pichia) fabianii (teleomorph of Candida fabianii) is a yeast species rarely involved in human infections. This report describes the first known human case of a Lindnera fabianii blood infection after mesenteric ischemia. The 53-year-old patient was hospitalized in the intensive care unit after a suicide attempt and was suffering from a mesenteric ischemia and acute renal failure. Lindnera fabianii was recovered from an oropharyngeal swab, then isolated from stool and urine samples before the diagnosis of the blood infection. Caspofungin intravenous treatment was associated with a successful outcome. Final unequivocal identification of the strain was done by sequencing the internal transcribed spacer (ITS) region, and regions of 18S rDNA gene and of the translation elongation factor-1α gene. Until our work, the genomic databases did not contain the complete ITS region of L. fabianii as a single nucleotide sequence (encompassing ITS1, the 5.8S rDNA and ITS2), and misidentification with other yeast species, e.g., Lindnera (Pichia) mississippiensis, could have occurred. Our work demonstrates that the usual DNA barcoding method based on sequencing of the ITS region may fail to provide the correct identification of some taxa, and that partial sequencing of the EF1α gene may be much more effective for the accurate delineation and molecular identification of new emerging opportunistic yeast pathogens.
The Global Genome Biodiversity Network (GGBN) Data Standard specification

PubMed Central

Droege, G.; Barker, K.; Seberg, O.; Coddington, J.; Benson, E.; Berendsohn, W. G.; Bunk, B.; Butler, C.; Cawsey, E. M.; Deck, J.; Döring, M.; Flemons, P.; Gemeinholzer, B.; Güntsch, A.; Hollowell, T.; Kelbert, P.; Kostadinov, I.; Kottmann, R.; Lawlor, R. T.; Lyal, C.; Mackenzie-Dodds, J.; Meyer, C.; Mulcahy, D.; Nussbeck, S. Y.; O'Tuama, É.; Orrell, T.; Petersen, G.; Robertson, T.; Söhngen, C.; Whitacre, J.; Wieczorek, J.; Yilmaz, P.; Zetzsche, H.; Zhang, Y.; Zhou, X.

2016-01-01

Genomic samples of non-model organisms are becoming increasingly important in a broad range of studies from developmental biology, biodiversity analyses, to conservation. Genomic sample definition, description, quality, voucher information and metadata all need to be digitized and disseminated across scientific communities. This information needs to be concise and consistent in today’s ever-increasing bioinformatic era, for complementary data aggregators to easily map databases to one another. In order to facilitate exchange of information on genomic samples and their derived data, the Global Genome Biodiversity Network (GGBN) Data Standard is intended to provide a platform based on a documented agreement to promote the efficient sharing and usage of genomic sample material and associated specimen information in a consistent way. The new data standard presented here builds upon existing standards commonly used within the community, extending them with the capability to exchange data on tissue, environmental and DNA samples as well as sequences. The GGBN Data Standard will reveal and democratize the hidden contents of biodiversity biobanks, for the convenience of everyone in the wider biobanking community. Technical tools exist for data providers to easily map their databases to the standard. Database URL: http://terms.tdwg.org/wiki/GGBN_Data_Standard PMID:27694206
Malassezia furfur in infantile seborrheic dermatitis.

PubMed

Wananukul, Siriwan; Chindamporn, Ariya; Yumyourn, Poomjit; Payungporn, Sunchai; Samathi, Chanchuree; Poovorawan, Yong

2005-01-01

Our objective was to study both incidence and various strains of Malassezia in infantile seborrheic dermatitis (ISD). Sixty infants between 2 weeks and 2 years old with clinical diagnosis of ISD at the Department of Pediatrics, King Chulalongkorn Memorial Hospital from May 2002 to April 2003 were recruited. Malassezia spp. were isolated from cultured skin samples of the patients, genomic DNA was extracted and the ITS1 rDNA region was amplified. The PCR product was examined by agarose gel electrophoresis and DNA sequences were determined. The ITS1 sequences were also subjected to phylogenetic analysis and species identification. ISD is most commonly found in infants below the age of 2 months (64%), followed by those between 2 and 4 months (28%) old. Cultures yielded yeast-like colonies in 15 specimens. PCR yielded 200-bp products (Candida) in 3 patients and 300-bp products (Malassezia furfur) in 12 patients (18%). Sugar fermentation using API 20C aux performed on the three 200-bp PCR products yielded Candida species. M. furfur was the only Malassezia recovered from skin scrapings of children with ISD.
Gene editing in clinical isolates of Candida parapsilosis using CRISPR/Cas9.

PubMed

Lombardi, Lisa; Turner, Siobhán A; Zhao, Fang; Butler, Geraldine

2017-08-14

Candida parapsilosis is one of the most common causes of candidiasis, particularly in the very young and the very old. Studies of gene function are limited by the lack of a sexual cycle, the diploid genome, and a paucity of molecular tools. We describe here the development of a plasmid-based CRISPR-Cas9 system for gene editing in C. parapsilosis. A major advantage of the system is that it can be used in any genetic background, which we showed by editing genes in 20 different isolates. Gene editing is carried out in a single transformation step. The CAS9 gene is expressed only when the plasmid is present, and it can be removed easily from transformed strains. There is theoretically no limit to the number of genes that can be edited in any strain. Gene editing is increased by homology-directed repair in the presence of a repair template. Editing by non-homologous end joining (NHEJ) also occurs in some genetic backgrounds. Finally, we used the system to introduce unique tags at edited sites.
Candida famata (Debaryomyces hansenii)

NASA Astrophysics Data System (ADS)

Sibirny, Andriy A.; Voronovsky, Andriy Y.

Debaryomyces hansenii (the teleomorph of asporogenous strains known as Candida famata) belongs to the group of so-called ‘flavinogenic yeasts’, which are capable of riboflavin oversynthesis during iron starvation. Some strains of C. famata are among the most flavinogenic organisms known (they accumulate 20 mg of riboflavin in 1 ml of medium) and were used for industrial production of riboflavin in the USA for a long time. Many strains of D. hansenii are characterized by high salt tolerance and are used for ageing of cheeses, whereas others are able to convert xylose to xylitol, an anti-caries sweetener. A transformation system has been developed for D. hansenii. It includes a collection of host recipient strains, vectors with complementation and dominant markers, and several transformation protocols based on protoplasting and electroporation. In addition, methods of multicopy gene insertion and insertional mutagenesis have been developed, and several strong constitutive and regulatable promoters have been cloned. All structural genes of riboflavin synthesis and some regulatory genes involved in this process have been identified. The genome of D. hansenii has been sequenced within the framework of the French national program ‘Genolevure’ and is open for public access.
The environmental and intrinsic yeast diversity of Cuban cocoa bean heap fermentations.

PubMed

Fernández Maura, Yurelkys; Balzarini, Tom; Clapé Borges, Pablo; Evrard, Pierre; De Vuyst, Luc; Daniel, H-M

2016-09-16

The environmental yeast diversity of spontaneous cocoa bean fermentations in east Cuba was investigated. Seven fermentations, 25 equipment- and handling-related samples, and 115 environmental samples, such as flowers, leaf and cocoa pod surfaces, as well as drosophilid insects, were analysed. The basic fermentation parameters temperature and pH were recorded during five fermentations for at least six days. A total of 435 yeast isolates were identified by a combination of PCR-fingerprinting of genomic DNA with the M13 primer and sequence analysis of DNA from representative isolates, using the internal transcribed spacer region, the D1/D2 region of the large subunit rRNA gene, and an actin gene-encoding fragment, as required. Among 65 yeast species detected, Pichia manshurica and Hanseniaspora opuntiae were the most frequently isolated species, obtained from five and four fermentations, respectively, followed in frequency by Pichia kudriavzevii from two fermentations. Saccharomyces cerevisiae was isolated only occasionally. Cocoa fermentation yeast species were also present on processing equipment. The repeated isolation of a species preliminarily classified as Yamadazyma sp., of a group of strains similar to Saccharomycopsis crataegensis from fermentations and equipment, and the isolation of fifteen other potentially novel yeast species in low numbers provide material for further studies. Environmental samples showed higher yeast diversity than the fermentations and included the most frequent fermentation species, whereas the most frequently isolated environmental species were Candida carpophila, Candida conglobata, and Candida quercitrusa. Potential selective advantages of the most frequently isolated species were only partly explained by the physiological traits tested. For instance, tolerance to higher ethanol concentrations was more frequent in strains of Pichia spp. and S. cerevisiae compared to Hanseniaspora spp.; the ability to also assimilate ethanol might have conferred a selective advantage to Pichia spp. In contrast, high glucose tolerance was common among strains of Hanseniaspora spp., Torulaspora delbrueckii, and Candida tropicalis, among which only Hanseniaspora spp. were frequently isolated. Copyright © 2016 Elsevier B.V. All rights reserved.
An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

PubMed Central

Galpert, Deborah; del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

2015-01-01

Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337
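As a rough illustration of two ideas named in the abstract above, the following Python sketch shows (a) a reciprocal best hit (RBH) baseline and (b) random oversampling of the minority class before training a pairwise classifier. It is a minimal single-machine sketch, not the authors' MapReduce/Spark pipeline: the function names, the score dictionaries keyed by (query, subject) gene pairs, and the 0/1 labels are all assumptions made for the example.

```python
import random
from typing import Dict, List, Set, Tuple

Scores = Dict[Tuple[str, str], float]  # (query gene, subject gene) -> alignment score


def best_hits(scores: Scores) -> Dict[str, str]:
    """For each query gene, keep the subject gene with the highest score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


def random_oversample(pairs: List[str], labels: List[int],
                      seed: int = 0) -> List[Tuple[str, int]]:
    """Duplicate randomly chosen minority-class items until both classes
    (1 = ortholog pair, 0 = non-ortholog pair) have the same size."""
    rng = random.Random(seed)
    pos = [p for p, y in zip(pairs, labels) if y == 1]
    neg = [p for p, y in zip(pairs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    if len(pos) < len(neg):
        pos = pos + [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    else:
        neg = neg + [rng.choice(neg) for _ in range(len(pos) - len(neg))]
    return [(p, 1) for p in pos] + [(p, 0) for p in neg]
```

RBH uses only alignment scores, whereas the supervised approach described above also feeds features such as sequence length and physicochemical profiles into the classifier; the oversampling step is one simple way to keep the rare ortholog class from being swamped during training.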
Genomic and probiotic characterization of SJP-SNU strain of Pichia kudriavzevii.

PubMed

Hong, Seung-Min; Kwon, Hyuk-Joon; Park, Se-Joon; Seong, Won-Jin; Kim, Ilhwan; Kim, Jae-Hong

2018-05-17

The yeast strain SJP-SNU was investigated as a probiotic and was characterized with respect to growth temperature, bile salt resistance, hydrogen sulfide reducing activity, intestinal survival ability and chicken embryo pathogenicity. In addition, we determined the complete genomic and mitochondrial sequences of SJP-SNU and conducted comparative genomics analyses. SJP-SNU grew rapidly at 37 °C and formed colonies on MacConkey agar containing bile salt. SJP-SNU reduced hydrogen sulfide produced by Salmonella serotype Enteritidis and, after being fed to 4-week-old chickens, could be isolated from cecal feces. SJP-SNU did not cause mortality in 10-day-old chicken embryos. From 13 initial contigs, 11 were finally assembled and represented 10 chromosomal sequences and 1 mitochondrial DNA sequence. Comparative genomic analyses revealed that SJP-SNU was a strain of Pichia kudriavzevii. Although SJP-SNU possesses pathogenicity-related genes, they showed very low amino acid sequence identities to those of Candida albicans. Furthermore, SJP-SNU possessed useful genes, such as phytases and cellulase. Thus, SJP-SNU is a useful yeast possessing the basic traits of a probiotic, and further studies to demonstrate its efficacy as a probiotic in the future may be warranted.
An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.

PubMed

Galpert, Deborah; Del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

2015-01-01

Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification.
Putative Microsatellite DNA Marker-Based Wheat Genomic Resource for Varietal Improvement and Management.

PubMed

Jaiswal, Sarika; Sheoran, Sonia; Arora, Vasu; Angadi, Ulavappa B; Iquebal, Mir A; Raghav, Nishu; Aneja, Bharti; Kumar, Deepender; Singh, Rajender; Sharma, Pradeep; Singh, G P; Rai, Anil; Tiwari, Ratan; Kumar, Dinesh

2017-01-01

Wheat fulfills 20% of the global caloric requirement. The world needs 60% more wheat for a population of 9 billion by 2050, but climate change with increasing temperature is projected to affect wheat productivity adversely. Trait improvement and management of wheat germplasm require genomic resources. Simple Sequence Repeats (SSRs), being highly polymorphic and ubiquitously distributed in the genome, can be a marker of choice, but there is no structured marker database with options to generate primer pairs for genotyping on a desired chromosome/physical location. Markers previously associated with different wheat traits are also not available in any database. Limitations of in vitro SSR discovery can be overcome by genome-wide in silico mining of SSRs. Triticum aestivum SSR database (TaSSRDb) is an integrated online database with three-tier architecture, developed using PHP and MySQL and accessible at http://webtom.cabgrid.res.in/wheatssr/. For genotyping, Primer3 standalone code computes primers on user request. Chromosome-wise SSR calling for all the three sub genomes along with choice of motif types is provided in addition to the primer generation for a desired marker. We report here a database of the highest number of SSRs (476,169) from the complex, hexaploid wheat genome (~17 GB) along with 268 previously reported SSR markers associated with 11 traits. The highest (116.93 SSRs/Mb) and lowest (74.57 SSRs/Mb) SSR densities were found on chromosomes 2D and 3A, respectively. To obtain homozygous loci, e-PCR was done. Thirty such loci were randomly selected for PCR validation in a panel of 18 wheat Advance Varietal Trial (AVT) lines. TaSSRDb can be a valuable genomic resource tool for linkage mapping, gene/QTL (Quantitative trait locus) discovery, diversity analysis, traceability and variety identification. 
Varietal specific profiling and differentiation can supplement DUS (Distinctiveness, Uniformity, and Stability) testing, EDV (Essentially Derived Variety)/IV (Initial Variety) disputes, seed purity and hybrid wheat testing. All these are required in germplasm management as well as in the endeavor of wheat productivity.
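The in silico SSR mining mentioned in the abstract above can be sketched with a regular expression scan. This is an illustrative Python example only, not the TaSSRDb pipeline: the function names and the minimum repeat thresholds per motif length (10 for mono-, 6 for di-, 5 for tri- to hexanucleotide motifs) are assumptions chosen for the example, and the scan is a simplification that can miss SSRs overlapping a run already consumed by the regex.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed thresholds: motif length -> minimum number of tandem repeats.
DEFAULT_MIN_REPEATS: Dict[int, int] = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit
    (e.g. 'AT' is primitive, 'ATAT' and 'AA' are not)."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq: str,
              min_repeats: Optional[Dict[int, int]] = None
              ) -> List[Tuple[int, int, str, int]]:
    """Return (start, end, motif, repeat_count) for perfect SSRs,
    with 0-based start and exclusive end coordinates."""
    thresholds = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits: List[Tuple[int, int, str, int]] = []
    for k, n in sorted(thresholds.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(rf"([ACGT]{{{k}}})\1{{{n - 1},}}")
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs counted as mono 'A'
                count = (match.end() - match.start()) // k
                hits.append((match.start(), match.end(), motif, count))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs: int, length_bp: int) -> float:
    """SSR density expressed as SSRs per megabase of sequence."""
    return n_ssrs / (length_bp / 1_000_000)
```

Running `find_ssrs` per chromosome and dividing the hit count by the chromosome length with `ssr_density_per_mb` gives a density in the same SSRs/Mb unit the abstract reports; primer design for a selected locus would be a separate step (the abstract names Primer3 for this).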
Putative Microsatellite DNA Marker-Based Wheat Genomic Resource for Varietal Improvement and Management

PubMed Central

Jaiswal, Sarika; Sheoran, Sonia; Arora, Vasu; Angadi, Ulavappa B.; Iquebal, Mir A.; Raghav, Nishu; Aneja, Bharti; Kumar, Deepender; Singh, Rajender; Sharma, Pradeep; Singh, G. P.; Rai, Anil; Tiwari, Ratan; Kumar, Dinesh

2017-01-01

Wheat fulfills 20% of the global caloric requirement. The world needs 60% more wheat for a population of 9 billion by 2050, but climate change with increasing temperature is projected to affect wheat productivity adversely. Trait improvement and management of wheat germplasm require genomic resources. Simple Sequence Repeats (SSRs), being highly polymorphic and ubiquitously distributed in the genome, can be a marker of choice, but there is no structured marker database with options to generate primer pairs for genotyping on a desired chromosome/physical location. Markers previously associated with different wheat traits are also not available in any database. Limitations of in vitro SSR discovery can be overcome by genome-wide in silico mining of SSRs. Triticum aestivum SSR database (TaSSRDb) is an integrated online database with three-tier architecture, developed using PHP and MySQL and accessible at http://webtom.cabgrid.res.in/wheatssr/. For genotyping, Primer3 standalone code computes primers on user request. Chromosome-wise SSR calling for all the three sub genomes along with choice of motif types is provided in addition to the primer generation for a desired marker. We report here a database of the highest number of SSRs (476,169) from the complex, hexaploid wheat genome (~17 GB) along with 268 previously reported SSR markers associated with 11 traits. The highest (116.93 SSRs/Mb) and lowest (74.57 SSRs/Mb) SSR densities were found on chromosomes 2D and 3A, respectively. To obtain homozygous loci, e-PCR was done. Thirty such loci were randomly selected for PCR validation in a panel of 18 wheat Advance Varietal Trial (AVT) lines. TaSSRDb can be a valuable genomic resource tool for linkage mapping, gene/QTL (Quantitative trait locus) discovery, diversity analysis, traceability and variety identification. 
Varietal specific profiling and differentiation can supplement DUS (Distinctiveness, Uniformity, and Stability) testing, EDV (Essentially Derived Variety)/IV (Initial Variety) disputes, seed purity and hybrid wheat testing. All these are required in germplasm management as well as in the endeavor of wheat productivity. PMID:29234333
MEPD: a Medaka gene expression pattern database

PubMed Central

Henrich, Thorsten; Ramialison, Mirana; Quiring, Rebecca; Wittbrodt, Beate; Furutani-Seiki, Makoto; Wittbrodt, Joachim; Kondoh, Hisato

2003-01-01

The Medaka Expression Pattern Database (MEPD) stores and integrates information of gene expression during embryonic development of the small freshwater fish Medaka (Oryzias latipes). Expression patterns of genes identified by ESTs are documented by images and by descriptions through parameters such as staining intensity, category and comments and through a comprehensive, hierarchically organized dictionary of anatomical terms. Sequences of the ESTs are available and searchable through BLAST. ESTs in the database are clustered upon entry and have been blasted against public databases. The BLAST results are updated regularly, stored within the database and searchable. The MEPD is a project within the Medaka Genome Initiative (MGI) and entries will be interconnected to integrated genomic map databases. MEPD is accessible through the WWW at http://medaka.dsp.jst.go.jp/MEPD. PMID:12519950
The path to enlightenment: making sense of genomic and proteomic information.

PubMed

Maurer, Martin H

2004-05-01

Whereas genomics describes the study of genome, mainly represented by its gene expression on the DNA or RNA level, the term proteomics denotes the study of the proteome, which is the protein complement encoded by the genome. In recent years, the number of proteomic experiments increased tremendously. While all fields of proteomics have made major technological advances, the biggest step was seen in bioinformatics. Biological information management relies on sequence and structure databases and powerful software tools to translate experimental results into meaningful biological hypotheses and answers. In this resource article, I provide a collection of databases and software available on the Internet that are useful to interpret genomic and proteomic data. The article is a toolbox for researchers who have genomic or proteomic datasets and need to put their findings into a biological context.
The emergence of commercial genomics: analysis of the rise of a biotechnology subsector during the Human Genome Project, 1990 to 2004.

PubMed

Wiechers, Ilse R; Perin, Noah C; Cook-Deegan, Robert

2013-01-01

Development of the commercial genomics sector within the biotechnology industry relied heavily on the scientific commons, public funding, and technology transfer between academic and industrial research. This study tracks financial and intellectual property data on genomics firms from 1990 through 2004, thus following these firms as they emerged in the era of the Human Genome Project and through the 2000 to 2001 market bubble. A database was created based on an early survey of genomics firms, which was expanded using three web-based biotechnology services, scientific journals, and biotechnology trade and technical publications. Financial data for publicly traded firms was collected through the use of four databases specializing in firm financials. Patent searches were conducted using firm names in the US Patent and Trademark Office website search engine and the DNA Patent Database. A biotechnology subsector of genomics firms emerged in parallel to the publicly funded Human Genome Project. Trends among top firms show that hiring, capital improvement, and research and development expenditures continued to grow after a 2000 to 2001 bubble. The majority of firms are small businesses with great diversity in type of research and development, products, and services provided. Over half the public firms holding patents have the majority of their intellectual property portfolio in DNA-based patents. These data allow estimates of investment, research and development expenditures, and jobs that paralleled the rise of genomics as a sector within biotechnology between 1990 and 2004.
A Utility Maximizing and Privacy Preserving Approach for Protecting Kinship in Genomic Databases.

PubMed

Kale, Gulce; Ayday, Erman; Tastan, Oznur

2017-09-12

Rapid and low cost sequencing of genomes enabled widespread use of genomic data in research studies and personalized customer applications, where genomic data is shared in public databases. Although the identities of the participants are anonymized in these databases, sensitive information about individuals can still be inferred. One such piece of information is kinship. We define two routes by which kinship privacy can leak and propose a technique to protect kinship privacy against these risks while maximizing the utility of shared data. The method involves systematic identification of minimal portions of genomic data to mask as new participants are added to the database. Choosing the proper positions to hide is cast as an optimization problem in which the number of positions to mask is minimized subject to privacy constraints that ensure the familial relationships are not revealed. We evaluate the proposed technique on real genomic data. Results indicate that concurrent sharing of data pertaining to a parent and an offspring results in high risks to kinship privacy, whereas sharing data from more distant relatives together is often safer. We also show that the arrival order of family members has a high impact on the level of privacy risks and on the utility of sharing data. Availability: https://github.com/tastanlab/Kinship-Privacy. Contact: erman@cs.bilkent.edu.tr or oznur.tastan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Tripal: a construction toolkit for online genome databases.

PubMed

Ficklin, Stephen P; Sanderson, Lacey-Anne; Cheng, Chun-Huai; Staton, Margaret E; Lee, Taein; Cho, Il-Hyung; Jung, Sook; Bett, Kirstin E; Main, Doreen

2011-01-01

As the availability, affordability and magnitude of genomics and genetics research increases so does the need to provide online access to resulting data and analyses. Availability of a tailored online database is the desire for many investigators or research communities; however, managing the Information Technology infrastructure needed to create such a database can be an undesired distraction from primary research or potentially cost prohibitive. Tripal provides simplified site development by merging the power of Drupal, a popular web Content Management System with that of Chado, a community-derived database schema for storage of genomic, genetic and other related biological data. Tripal provides an interface that extends the content management features of Drupal to the data housed in Chado. Furthermore, Tripal provides a web-based Chado installer, genomic data loaders, web-based editing of data for organisms, genomic features, biological libraries, controlled vocabularies and stock collections. Also available are Tripal extensions that support loading and visualizations of NCBI BLAST, InterPro, Kyoto Encyclopedia of Genes and Genomes and Gene Ontology analyses, as well as an extension that provides integration of Tripal with GBrowse, a popular GMOD tool. An Application Programming Interface is available to allow creation of custom extensions by site developers, and the look-and-feel of the site is completely customizable through Drupal-based PHP template files. Addition of non-biological content and user-management is afforded through Drupal. Tripal is an open source and freely available software package found at http://tripal.sourceforge.net.
Tripal: a construction toolkit for online genome databases

PubMed Central

Sanderson, Lacey-Anne; Cheng, Chun-Huai; Staton, Margaret E.; Lee, Taein; Cho, Il-Hyung; Jung, Sook; Bett, Kirstin E.; Main, Doreen

2011-01-01

As the availability, affordability and magnitude of genomics and genetics research increases so does the need to provide online access to resulting data and analyses. Availability of a tailored online database is the desire for many investigators or research communities; however, managing the Information Technology infrastructure needed to create such a database can be an undesired distraction from primary research or potentially cost prohibitive. Tripal provides simplified site development by merging the power of Drupal, a popular web Content Management System with that of Chado, a community-derived database schema for storage of genomic, genetic and other related biological data. Tripal provides an interface that extends the content management features of Drupal to the data housed in Chado. Furthermore, Tripal provides a web-based Chado installer, genomic data loaders, web-based editing of data for organisms, genomic features, biological libraries, controlled vocabularies and stock collections. Also available are Tripal extensions that support loading and visualizations of NCBI BLAST, InterPro, Kyoto Encyclopedia of Genes and Genomes and Gene Ontology analyses, as well as an extension that provides integration of Tripal with GBrowse, a popular GMOD tool. An Application Programming Interface is available to allow creation of custom extensions by site developers, and the look-and-feel of the site is completely customizable through Drupal-based PHP template files. Addition of non-biological content and user-management is afforded through Drupal. Tripal is an open source and freely available software package found at http://tripal.sourceforge.net PMID:21959868
Database Resources of the BIG Data Center in 2018

PubMed Central

Xu, Xingjian; Hao, Lili; Zhu, Junwei; Tang, Bixia; Zhou, Qing; Song, Fuhai; Chen, Tingting; Zhang, Sisi; Dong, Lili; Lan, Li; Wang, Yanqing; Sang, Jian; Hao, Lili; Liang, Fang; Cao, Jiabao; Liu, Fang; Liu, Lin; Wang, Fan; Ma, Yingke; Xu, Xingjian; Zhang, Lijuan; Chen, Meili; Tian, Dongmei; Li, Cuiping; Dong, Lili; Du, Zhenglin; Yuan, Na; Zeng, Jingyao; Zhang, Zhewen; Wang, Jinyue; Shi, Shuo; Zhang, Yadong; Pan, Mengyu; Tang, Bixia; Zou, Dong; Song, Shuhui; Sang, Jian; Xia, Lin; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Zhang, Yang; Sheng, Xin; Lu, Mingming; Wang, Qi; Xiao, Jingfa; Zou, Dong; Wang, Fan; Hao, Lili; Liang, Fang; Li, Mengwei; Sun, Shixiang; Zou, Dong; Li, Rujiao; Yu, Chunlei; Wang, Guangyu; Sang, Jian; Liu, Lin; Li, Mengwei; Li, Man; Niu, Guangyi; Cao, Jiabao; Sun, Shixiang; Xia, Lin; Yin, Hongyan; Zou, Dong; Xu, Xingjian; Ma, Lina; Chen, Huanxin; Sun, Yubin; Yu, Lei; Zhai, Shuang; Sun, Mingyuan; Zhang, Zhang; Zhao, Wenming; Xiao, Jingfa; Bao, Yiming; Song, Shuhui; Hao, Lili; Li, Rujiao; Ma, Lina; Sang, Jian; Wang, Yanqing; Tang, Bixia; Zou, Dong; Wang, Fan

2018-01-01

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. PMID:29036542
The Changing Face of Scientific Discourse: Analysis of Genomic and Proteomic Database Usage and Acceptance.

ERIC Educational Resources Information Center

Brown, Cecelia

2003-01-01

Discusses the growth in use and acceptance of Web-based genomic and proteomic databases (GPD) in scholarly communication. Confirms the role of GPD in the scientific literature cycle, suggests GPD are a storage and retrieval mechanism for molecular biology information, and recommends that existing models of scientific communication be updated to…
A searchable, whole genome resource designed for protein variant analysis in diverse lineages of U.S. beef cattle

USDA-ARS's Scientific Manuscript database

A key feature of a gene's function is the variety of protein isoforms it encodes in a population. However, the genetic diversity in bovine whole genome databases tends to be underrepresented because these databases contain an abundance of sequence from the most influential sires. Our first aim was ...
Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

NASA Astrophysics Data System (ADS)

Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

2016-08-01

Recently, bioinformatics has become a new field of science, indispensable for analysing the millions of nucleic acid sequences currently deposited in international databases (public or private); these databases contain information on genes, RNA, ORFs, proteins and intergenic regions, including entire genomes of some species. Analysing this information requires computer programs, which have been renewed through new mathematical methods and the introduction of artificial intelligence, together with the constant creation of supercomputing units able to withstand the heavy workload of sequence analysis. However, innovation is still needed in platforms that allow faster and more effective genomic analyses, with a technological understanding of all biological processes.
CicerTransDB 1.0: a resource for expression and functional study of chickpea transcription factors.

PubMed

Gayali, Saurabh; Acharya, Shankar; Lande, Nilesh Vikram; Pandey, Aarti; Chakraborty, Subhra; Chakraborty, Niranjan

2016-07-29

Transcription factor (TF) databases are a major resource for systematic studies of TFs in specific species as well as related family members. Even though there are several publicly available multi-species databases, the information on the amount and diversity of TFs within individual species is fragmented, especially for newly sequenced genomes of non-model species of agricultural significance. We constructed CicerTransDB (Cicer Transcription Factor Database), the first database of its kind, which would provide a centralized putatively complete list of TFs in a food legume, chickpea. CicerTransDB, available at www.cicertransdb.esy.es, is based on chickpea (Cicer arietinum L.) annotation v 1.0. The database is an outcome of a genome-wide domain study and manual classification of TF families. This database provides information not only on the gene, but also on gene ontology, domain and motif architecture. CicerTransDB v 1.0 comprises information on 1124 genes of chickpea and enables the user to not only search, browse and download sequences but also retrieve sequence features. CicerTransDB also provides several single click interfaces, transconnecting to various other databases to ease further analysis. Several webAPI(s) integrated in the database allow end-users direct access to data. A critical comparison of CicerTransDB with PlantTFDB (Plant Transcription Factor Database) revealed 68 novel TFs in the chickpea genome, hitherto unexplored. Database URL: http://www.cicertransdb.esy.es.
Portrait of Candida Species Biofilm Regulatory Network Genes.

PubMed

Araújo, Daniela; Henriques, Mariana; Silva, Sónia

2017-01-01

Most cases of candidiasis have been attributed to Candida albicans, but Candida glabrata, Candida parapsilosis and Candida tropicalis, designated as non-C. albicans Candida (NCAC), have been identified as frequent human pathogens. Moreover, Candida biofilms are an escalating clinical problem associated with significant rates of mortality. Biofilms have distinct developmental phases, including adhesion/colonisation, maturation and dispersal, controlled by complex regulatory networks. This review discusses recent advances regarding Candida species biofilm regulatory network genes, which are key components for candidiasis. Copyright © 2016 Elsevier Ltd. All rights reserved.
Description of Groenewaldozyma gen. nov. for placement of Candida auringiensis, Candida salmanticensis and Candida tartarivorans.

PubMed

Kurtzman, Cletus P

2016-07-01

DNA sequence analyses have demonstrated that species of the polyphyletic anamorphic ascomycete genus Candida may be members of described teleomorphic genera, members of the Candida tropicalis clade upon which the genus Candida is circumscribed, or members of isolated clades that represent undescribed genera. From phylogenetic analysis of gene sequences from nuclear large subunit rRNA, mitochondrial small subunit rRNA and cytochrome oxidase II, Candida auringiensis (NRRL Y-17674(T), CBS 6913(T)), Candida salmanticensis (NRRL Y-17090(T), CBS 5121(T)), and Candida tartarivorans (NRRL Y-27291(T), CBS 7955(T)) were shown to be members of an isolated clade and are proposed for reclassification in the genus Groenewaldozyma gen. nov. (MycoBank MB 815817). Neighbouring taxa include species of the Wickerhamiella clade and Candida blankii.
Usefulness of the Non-conventional Caenorhabditis elegans Model to Assess Candida Virulence.

PubMed

Ortega-Riveros, Marcelo; De-la-Pinta, Iker; Marcos-Arias, Cristina; Ezpeleta, Guillermo; Quindós, Guillermo; Eraso, Elena

2017-10-01

Invasive candidiasis is caused mainly by Candida albicans, but other Candida species are increasingly implicated as etiological agents. These species show different virulence and susceptibility levels to antifungal drugs. The aims of this study were to evaluate the usefulness of the non-conventional model Caenorhabditis elegans to assess the in vivo virulence of seven different Candida species and to compare the virulence in vivo with the in vitro production of proteinases and phospholipases, hemolytic activity and biofilm development capacity. One culture collection strain of each of seven Candida species (C. albicans, Candida dubliniensis, Candida glabrata, Candida krusei, Candida metapsilosis, Candida orthopsilosis and Candida parapsilosis) was studied. A double mutant C. elegans AU37 strain (glp-4;sek-1) was infected with Candida by ingestion, and the analysis of nematode survival was performed in liquid medium every 24 h until 120 h. Candida establishes a persistent lethal infection in the C. elegans intestinal tract. C. albicans and C. krusei were the most pathogenic species, whereas C. dubliniensis infection showed the lowest mortality. C. albicans was the only species with phospholipase activity, was the greatest producer of aspartyl proteinase and had a higher hemolytic activity. C. albicans and C. krusei caused higher mortality than the rest of the Candida species studied in the C. elegans model of candidiasis.
The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika

2010-01-27

Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences are important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.
Microbial Genome Analysis and Comparisons: Web-based Protocols and Resources

USDA-ARS's Scientific Manuscript database

Fully annotated genome sequences of many microorganisms are publicly available as a resource. However, in-depth analysis of these genomes using specialized tools is required to derive meaningful information. We describe here the utility of three powerful publicly available genome databases and ana...
Phospholipase and proteinase activities of Candida spp. isolates from vulvovaginitis in Iran.

PubMed

Shirkhani, S; Sepahvand, A; Mirzaee, M; Anbari, K

2016-09-01

This study aims to characterize phospholipase and proteinase activities of Candida isolates from 82 vulvovaginal candidiasis (VVC) cases and to study the relationship of these activities with vulvovaginitis. A total of 82 Candida isolates from vaginal samples of VVC patients were randomly collected between September and December 2014 from hospitalized patients at the general hospitals of Lorestan province, Iran. Isolates had previously been identified by conventional mycological methods. The phospholipase and proteinase activities were evaluated by egg yolk agar, Tween 80 opacity medium and agar plate methods. The most common Candida species identified was Candida albicans (n=34, 41.5%), followed by Candida famata (n=13, 15.8%), Candida tropicalis (n=11, 13.4%), and Candida parapsilosis (n=9, 11%). The highest phospholipase activity was observed in Candida colliculosa (40%), followed by C. famata (38.5%), and Candida krusei (33.3%). The correlation between phospholipase production by Candida spp. and the presence of VVC was not statistically significant (P=0.91). All Candida spp. exhibited considerable proteinase activity; 100% of C. colliculosa, C. parapsilosis, Candida kefyr, and Candida intermedia isolates produced high proteinase activity with Pz 4+ scores. There was a significant correlation between proteinase production by Candida spp. and the presence of VVC (P=0.009). The findings revealed that Candida spp. isolates may produce both virulence factors, phospholipase and proteinase. Although phospholipase production was observed in <40% of the isolates, there was a significant association between proteinase production by Candida spp. and VVC. Copyright © 2016. Published by Elsevier Masson SAS.
Epidemiologic and microbiologic evaluation of nosocomial infections associated with Candida spp in children: A multicenter study from Istanbul, Turkey.

PubMed

Sutcu, Murat; Salman, Nuran; Akturk, Hacer; Dalgıc, Nazan; Turel, Ozden; Kuzdan, Canan; Kadayifci, Eda Kepenekli; Sener, Dicle; Karbuz, Adem; Erturan, Zayre; Somer, Ayper

2016-10-01

The purpose of this study was to establish the species distribution of Candida isolates from pediatric patients in Istanbul, Turkey, and to determine risk factors associated with nosocomial Candida infections. This study was conducted between June 2013 and June 2014 with the participation of 7 medical centers in Istanbul. Candida spp. strains isolated from the clinical specimens of pediatric patients were included. Clinical features were recorded on a standardized data collection sheet. A total of 134 systemic Candida infections were identified in 134 patients. The patients were admitted to pediatric and neonatal intensive care units (41.8% and 9.7%, respectively) and to pediatric wards (48.5%). Candida albicans was the most prevalent species (47%), followed by Candida parapsilosis (13.4%), Candida tropicalis (8.2%), Candida glabrata (4.5%), Candida lusitaniae (3.7%), Candida kefyr (2.2%), Candida guilliermondii (1.5%), Candida dubliniensis (0.7%), and Candida krusei (0.7%). Types of Candida infections were candidemia (50.7%), urinary tract infection (33.6%), surgical site infection (4.5%), central nervous system infection (3.7%), catheter infection (3.7%), and intra-abdominal infection (3.7%). In multivariate analysis, younger age (1-24 months) and detection of non-albicans Candida spp. were found to be risk factors associated with candidemia (P = 0.040; odds ratio [OR], 4.1; 95% confidence interval [CI], 1.06-15.86; and P = 0.02; OR, 2.4; 95% CI, 1.10-5.53, respectively). This study provides an update on the epidemiology of nosocomial Candida infections in Istanbul, which is important for the management of patients and implementation of appropriate infection control measures. Copyright © 2016 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Elsevier Inc. All rights reserved.
In Vitro Antifungal Susceptibility of Oral Candida Isolates from Patients Suffering from Caries and Chronic Periodontitis.

PubMed

De-la-Torre, Janire; Ortiz-Samperio, María Esther; Marcos-Arias, Cristina; Marichalar-Mendia, Xabier; Eraso, Elena; Echebarria-Goicouria, María Ángeles; Aguirre-Urizar, José Manuel; Quindós, Guillermo

2017-06-01

Caries and chronic periodontitis are common oral diseases in which higher Candida colonization is reported. Antifungal agents could be adjuvant drugs for the therapy of both clinical conditions. The aim of the current study was to evaluate the in vitro activities of conventional and new antifungal drugs against oral Candida isolates from patients suffering from caries and/or chronic periodontitis. In vitro activities of amphotericin B, fluconazole, itraconazole, miconazole, nystatin, posaconazole and voriconazole against 126 oral Candida isolates (75 Candida albicans, 18 Candida parapsilosis, 11 Candida dubliniensis, six Candida guilliermondii, five Candida lipolytica, five Candida glabrata, four Candida tropicalis and two Candida krusei) from 61 patients were tested by the CLSI M27-A3 method. Most antifungal drugs were highly active, and resistance was observed in less than 5% of tested isolates. Miconazole was the most active antifungal drug, with more than 98% of isolates susceptible. Fluconazole, itraconazole, and the new triazoles, posaconazole and voriconazole, were also very active. Miconazole, fluconazole and voriconazole have excellent in vitro activities against all Candida isolates and could represent suitable candidates for adjunctive therapy of caries and chronic periodontitis.
NemaPath: online exploration of KEGG-based metabolic pathways for nematodes

PubMed Central

Wylie, Todd; Martin, John; Abubucker, Sahar; Yin, Yong; Messina, David; Wang, Zhengyuan; McCarter, James P; Mitreva, Makedonka

2008-01-01

Background Nematode.net is a web-accessible resource for investigating gene sequences from parasitic and free-living nematode genomes. Beyond the well-characterized model nematode C. elegans, over 500,000 expressed sequence tags (ESTs) and nearly 600,000 genome survey sequences (GSSs) have been generated from 36 nematode species as part of the Parasitic Nematode Genomics Program undertaken by the Genome Center at Washington University School of Medicine. However, these sequencing data are not present in most publicly available protein databases, which only include sequences in Swiss-Prot. Swiss-Prot, in turn, relies on GenBank/EMBL/DDBJ for predicted proteins from complete genomes or full-length proteins. Description Here we present the NemaPath pathway server, a web-based pathway-level visualization tool for navigating putative metabolic pathways for over 30 nematode species, including 27 parasites. The NemaPath approach consists of two parts: 1) a backend tool to align and evaluate nematode genomic sequences (curated EST contigs) against the annotated Kyoto Encyclopedia of Genes and Genomes (KEGG) protein database; 2) a web viewing application that displays annotated KEGG pathway maps based on desired confidence levels of primary sequence similarity as defined by a user. NemaPath also provides cross-referenced access to nematode genome information provided by other tools available on Nematode.net, including: detailed NemaGene EST cluster information; putative translations; GBrowse EST cluster views; links from nematode data to external databases for corresponding synonymous C. elegans counterparts, subject matches in KEGG's gene database, and also KEGG Ontology (KO) identification. Conclusion The NemaPath server hosts metabolic pathway mappings for 30 nematode species and is available on the World Wide Web at . 
The nematode source sequences used for the metabolic pathway mappings are available via FTP , as provided by the Genome Center at Washington University School of Medicine. PMID:18983679
New Chromogenic Agar Medium for the Identification of Candida spp.

PubMed Central

Cooke, Venitia M.; Miles, R. J.; Price, R. G.; Midgley, G.; Khamri, W.; Richardson, A. C.

2002-01-01

A new chromogenic agar medium (Candida diagnostic agar [CDA]) for differentiation of Candida spp. is described. This medium is based on Sabouraud dextrose agar (Oxoid CM41) and contains (per liter) 40.0 g of glucose, 10.0 g of mycological peptone, and 15.0 g of agar along with a novel chromogenic glucosaminidase substrate, ammonium 4-{2-[4-(2-acetamido-2-deoxy-β-d-glucopyranosyloxy)-3-methoxyphenyl]-vinyl}-1-(propan-3-yl-oate)-quinolium bromide (0.32 g liter−1). The glucosaminidase substrate in CDA was hydrolyzed by Candida albicans and Candida dubliniensis, yielding white colonies with deep-red spots on a yellow transparent background after 24 to 48 h of incubation at 37°C. Colonies of Candida tropicalis and Candida kefyr were uniformly pink, and colonies of other Candida spp., including Candida glabrata and Candida parapsilosis, were white. CDA was evaluated by using 115 test strains of Candida spp. and other clinically important yeasts and was compared with two commercially available chromogenic agars (Candida ID agar [bioMerieux] and CHROMagar Candida [CHROMagar Company Ltd.]). On all three agars, colonies of C. albicans were not distinguished from colonies of C. dubliniensis. However, for the group containing C. albicans plus C. dubliniensis, both the sensitivity and the specificity of detection when CDA was used were 100%, compared with values of 97.6 and 100%, respectively, with CHROMagar Candida and 100 and 96.8%, respectively, with Candida ID agar. In addition, for the group containing C. tropicalis plus C. kefyr, the sensitivity and specificity of detection when CDA was used were also 100%, compared with 72.7 and 98.1%, respectively, with CHROMagar Candida. Candida ID agar did not differentiate C. tropicalis and C. kefyr strains but did differentiate members of a broader group (C. tropicalis, C. kefyr, Candida lusitaniae plus Candida guilliermondii); the sensitivity and specificity of detection for members of this group were 94.7 and 93.8%, respectively. 
In addition to the increased sensitivity and/or specificity of Candida detection when CDA was used, differentiation of colony types on CDA (red spotted, pink, or no color) was unambiguous and did not require precise assessment of colony color. PMID:12089051
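The sensitivity and specificity figures quoted in the abstract above follow the usual definitions: sensitivity is the fraction of target-group strains that are detected, and specificity is the fraction of non-target strains that are correctly excluded. A minimal Python sketch of that arithmetic follows. The function name and the strain counts are hypothetical and chosen only for illustration; they are not the counts from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as percentages.

    tp: target strains showing the expected colony type
    fn: target strains that did not show it
    tn: non-target strains correctly lacking the colony type
    fp: non-target strains that showed it anyway
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for one agar and one species group (not study data).
sens, spec = sensitivity_specificity(tp=40, fn=2, tn=70, fp=3)
print(f"sensitivity = {sens:.1f}%, specificity = {spec:.1f}%")
```

A medium that detects every target strain and never misclassifies a non-target strain gives 100% for both values, which is the result the abstract reports for CDA.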
Direct Isolation of Candida spp. from Blood Cultures on the Chromogenic Medium CHROMagar Candida

PubMed Central

Horvath, Lynn L.; Hospenthal, Duane R.; Murray, Clinton K.; Dooley, David P.

2003-01-01

CHROMagar Candida is a selective and differential chromogenic medium that has been shown to be useful for identification of Candida albicans, Candida krusei, Candida tropicalis, and perhaps Candida glabrata. Colony morphology and color have been well defined when CHROMagar Candida has been used to isolate yeast directly from clinical specimens, including stool, urine, respiratory, vaginal, oropharyngeal, and esophageal sources. Direct isolation of yeast on CHROMagar Candida from blood cultures has not been evaluated. We evaluated whether the color and colony characteristics produced by Candida spp. on CHROMagar Candida were altered when yeasts were isolated directly from blood cultures. Fifty clinical isolates of Candida were inoculated into aerobic and anaerobic blood culture bottles and incubated at 35°C in an automated blood culture system. When growth was detected, an aliquot was removed and plated onto CHROMagar Candida. As a control, CHROMagar Candida plates were inoculated with the same isolate of yeast grown on Sabouraud dextrose agar simultaneously. No significant difference was detected in color or colony morphology between the blood and control isolates in any of the tested organisms. All C. albicans (n = 12), C. tropicalis (n = 12), C. glabrata (n = 9), and C. krusei (n = 5) isolates exhibited the expected species-specific colony characteristics and color, whether isolated directly from blood or from control cultures. CHROMagar Candida can be reliably used for direct isolation of yeast from blood cultures. Direct isolation could allow mycology laboratories to more rapidly identify Candida spp., enable clinicians to more quickly make antifungal agent selections, and potentially decrease patient morbidity and mortality. PMID:12791890
Bolbase: a comprehensive genomics database for Brassica oleracea

PubMed Central

2013-01-01

Background Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, Brussels sprouts. This diversity along with its phylogenetic membership in a group of three diploid and three tetraploid species, and the recent availability of genome sequences within Brassica provide an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives. Description We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting. Conclusions Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research. 
This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase. PMID:24079801
Improved orthologous databases to ease protozoan targets inference.

PubMed

Kotowski, Nelson; Jardim, Rodrigo; Dávila, Alberto M R

2015-09-29

Homology inference helps in identifying similarities, as well as differences, among organisms, which provides better insight into how closely related one might be to another. In addition, comparative genomics pipelines are widely adopted tools designed using different bioinformatics applications and algorithms. In this article, we propose a methodology to build improved orthologous databases with the potential to aid in protozoan target identification, one of the many tasks which benefit from comparative genomics tools. Our analyses are based on OrthoSearch, a comparative genomics pipeline originally designed to infer orthologs through protein-profile comparison, supported by an HMM-based, reciprocal best hits approach. Our methodology allows OrthoSearch to confront two orthologous databases and to generate an improved new one. The resulting database can later be used to infer potential protozoan targets through a similarity analysis against the human genome. The protein sequences of Cryptosporidium hominis, Entamoeba histolytica and Leishmania infantum genomes were comparatively analyzed against three orthologous databases: (i) EggNOG KOG, (ii) ProtozoaDB and (iii) KEGG Orthology (KO). That allowed us to create two new orthologous databases, "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB", with 16,938 and 27,701 orthologous groups, respectively. These new orthologous databases were used for a regular OrthoSearch run. By confronting the "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB" databases with the protozoan species, we were able to detect the following totals of orthologous groups and coverage (relation between the inferred orthologous groups and the species total number of proteins): Cryptosporidium hominis: 1,821 (11 %) and 3,254 (12 %); Entamoeba histolytica: 2,245 (13 %) and 5,305 (19 %); Leishmania infantum: 2,702 (16 %) and 4,760 (17 %). 
Using our HMM-based methodology and the largest created orthologous database, it was possible to infer 13 orthologous groups which represent potential protozoan targets; these were found because of our distant homology approach. We also provide the number of species-specific, pair-to-pair and core groups from such analyses, depicted in Venn diagrams. The orthologous databases generated by our HMM-based methodology provide a broader dataset, with larger amounts of orthologous groups when compared to the original databases used as input. Those may be used for several homology inference analyses, annotation tasks and protozoan targets identification.
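The reciprocal best hits criterion mentioned in the abstract above can be illustrated with a short Python sketch. It is not the OrthoSearch implementation, which compares HMM protein profiles; it assumes only that some similarity score is already available for each query/subject pair in both search directions. The function names, identifiers and scores are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores for two hypothetical proteomes A and B.
a_vs_b = {("a1", "b1"): 90, ("a1", "b2"): 40, ("a2", "b2"): 70, ("a3", "b2"): 80}
b_vs_a = {("b1", "a1"): 88, ("b2", "a3"): 79, ("b2", "a2"): 65}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, a2 has b2 as its best hit, but b2's best hit is a3, so a2 is not reported. That asymmetry is what the reciprocal requirement is meant to filter out.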

VaProS: a database-integration approach for protein/genome information retrieval.

PubMed

Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei

2016-12-01

Life science research now relies heavily on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve experts' knowledge stored in the databases and build a new hypothesis about the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effect of a gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in a quest for the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.
In vitro susceptibility of Candida spp. to fluconazole, itraconazole and voriconazole and the correlation between triazoles susceptibility: Results from a five-year study.

PubMed

Lei, J; Xu, J; Wang, T

2018-06-01

Candida spp. are a common cause of invasive fungal disease. The aim of this study was to examine the susceptibility of Candida spp. to fluconazole, itraconazole and voriconazole and to explore the correlation between triazole susceptibilities. Antifungal susceptibility was measured by the ATB Fungus 3 method, and the potential relationship was examined by correlating the measured minimal inhibitory concentrations (MICs) of Candida spp. isolates. A total of 2099 clinical isolates of Candida spp. from 1441 patients were analyzed. The organisms included 1435 isolates of Candida albicans, 207 isolates of Candida glabrata, 65 isolates of Candida parapsilosis, 31 isolates of Candida krusei, and 268 isolates of Candida tropicalis. Voriconazole and itraconazole were more active than fluconazole against Candida spp. in vitro. The fluconazole, itraconazole and voriconazole MIC90 values (MIC for 90% of the isolates) for all Candida spp. isolates were 4 mg/L, 1 mg/L and 0.25 mg/L, respectively. There was a moderate correlation between the fluconazole MICs for Candida spp. isolates and those for voriconazole (R² = 0.475; P<0.01) and itraconazole (R² = 0.431; P<0.01). Voriconazole MICs for the Candida spp. isolates also correlated with those for itraconazole (R² = 0.401; P<0.01). These observations suggest that the in vitro susceptibility of Candida spp. to fluconazole, itraconazole and voriconazole exhibits a moderate correlation. Published by Elsevier Masson SAS.
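The two statistics reported in the abstract above, MIC90 and the squared correlation between the MICs of two drugs, can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's analysis: MIC90 is taken as the lowest observed MIC at or below which at least 90% of isolates fall, the correlation is computed on log2-transformed MICs (a common choice for two-fold dilution series, assumed here), and all MIC values are hypothetical.

```python
import math


def mic90(mics):
    """Lowest observed MIC that covers at least 90% of the isolates."""
    if not mics:
        raise ValueError("at least one MIC value is required")
    ordered = sorted(mics)
    rank = (9 * len(ordered) + 9) // 10  # ceiling of 0.9 * n, integer math
    return ordered[rank - 1]


def r_squared(xs, ys):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Hypothetical MICs in mg/L for ten isolates (not study data).
fluconazole = [0.25, 0.25, 0.5, 0.5, 1, 1, 2, 2, 4, 64]
voriconazole = [0.03, 0.06, 0.06, 0.03, 0.125, 0.06, 0.125, 0.25, 0.25, 2]

flu_mic90 = mic90(fluconazole)
vor_mic90 = mic90(voriconazole)
r2 = r_squared([math.log2(m) for m in fluconazole],
               [math.log2(m) for m in voriconazole])
print(flu_mic90, vor_mic90, round(r2, 3))
```

With ten isolates the MIC90 is the ninth value in sorted order, so a single highly resistant isolate (64 mg/L in the toy fluconazole data) does not move it.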
Synthesis and anticandidal activity of some imidazopyridine derivatives.

PubMed

Kaplancikli, Zafer Asim; Turan-Zitouni, Gülhan; Ozdemir, Ahmet; Revial, Gilbert

2008-12-01

New hydrazide derivatives of imidazo[1,2-a]pyridine have been synthesized and evaluated for anticandidal activity. The reaction of imidazo[1,2-a]pyridine-2-carboxylic acid hydrazides with various benzaldehydes gave N-(benzylidene)imidazo[1,2-a]pyridine-2-carboxylic acid hydrazide derivatives. Their anticandidal activities against Candida albicans and Candida glabrata (isolates obtained from Osmangazi University, Faculty of Medicine, Eskisehir, Turkey), Candida albicans (ATCC 90028), Candida utilis (NRRL Y-900), Candida tropicalis (NRRL Y-12968), Candida krusei (NRRL Y-7179), Candida zeylanoides (NRRL Y-1774), and Candida parapsilosis (NRRL Y-12696) were investigated.
Thinking beyond the Common Candida Species: Need for Species-Level Identification of Candida Due to the Emergence of Multidrug-Resistant Candida auris.

PubMed

Lockhart, Shawn R; Jackson, Brendan R; Vallabhaneni, Snigdha; Ostrosky-Zeichner, Luis; Pappas, Peter G; Chiller, Tom

2017-12-01

Candida species are one of the leading causes of nosocomial infections. Because much of the treatment for Candida infections is empirical, some institutions do not identify Candida to species level. With the worldwide emergence of the multidrug-resistant species Candida auris, identification of Candida to species level has new clinical relevance. Species should be identified for invasive candidiasis isolates, and species-level identification can be considered for selected noninvasive isolates to improve detection of C. auris. Copyright © 2017 American Society for Microbiology.
Antibiofilm activity of carboxymethyl chitosan on the biofilms of non-Candida albicans Candida species.

PubMed

Tan, Yulong; Leonhard, Matthias; Moser, Doris; Schneider-Stickler, Berit

2016-09-20

Although most cases of candidiasis have been attributed to Candida albicans, non-C. albicans Candida species have been isolated in increasing numbers from patients. In this study, we determined the inhibitory effect of carboxymethyl chitosan (CM-chitosan) on single and mixed species biofilms of non-albicans Candida species, including Candida tropicalis, Candida parapsilosis, Candida krusei and Candida glabrata. Biofilms formed by all tested species in microtiter plates were inhibited by nearly 70%. CM-chitosan inhibited mixed species biofilm in microtiter plates and also on the surfaces of medical materials. To investigate the mechanism, the effect of CM-chitosan on cell viability and biofilm growth was examined. CM-chitosan inhibited Candida planktonic growth as well as adhesion. Further biofilm formation was inhibited when CM-chitosan was added at 90 min, 12 h or 24 h after biofilm initiation. CM-chitosan was not only able to inhibit the metabolic activity of Candida cells, but was also active against the establishment and the development of biofilms. Copyright © 2016 Elsevier Ltd. All rights reserved.
Candida and the paediatric lung.

PubMed

Pasqualotto, Alessandro C

2009-12-01

Although systemic candidosis is common in hospitalised children, Candida involvement of lung parenchyma is rare and usually perceived only at autopsy. The purpose of this article was to review the evidence regarding lung involvement in Candida infections, with special attention to paediatric patients. Primary Candida pneumonia is rare and usually associated with aspiration of oropharyngeal contents. The majority of cases of Candida pneumonia are secondary to haematological dissemination of Candida organisms from a distant site, usually the gastrointestinal tract or the skin. The diagnosis of pulmonary candidosis is difficult because there is no specific clinical or radiological presentation. In addition, the presence of Candida in sputum or other respiratory specimens mostly represents contamination. A definitive diagnosis of Candida pneumonia requires histopathologic proof of lung invasion in association with inflammation. Children can also be affected by pulmonary allergic reactions caused by Candida species. Treatment of Candida pneumonia is essentially the same as for candidaemia. Preliminary evidence suggests that patients with severe asthma sensitised to Candida species may also benefit from antifungal drugs.
Overcoming Species Boundaries in Peptide Identification with Bayesian Information Criterion-driven Error-tolerant Peptide Search (BICEPS)

PubMed Central

Renard, Bernhard Y.; Xu, Buote; Kirchner, Marc; Zickmann, Franziska; Winter, Dominic; Korten, Simone; Brattig, Norbert W.; Tzur, Amit; Hamprecht, Fred A.; Steen, Hanno

2012-01-01

Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis. PMID:22493179
[Distribution of Candida species in vaginal specimens and evaluation of CHROMagar Candida medium].

PubMed

Gültekin, Berna; Yazici, Vesile; Aydin, Neriman

2005-07-01

Identification of Candida species is important to guide treatment in vulvovaginal candidiasis, which is seen frequently and needs long-term therapy due to recurrence. The aim of this study was to determine the species distribution of Candida isolated from vaginal specimens and to evaluate CHROMagar Candida medium in laboratory diagnosis. Samples from 80 patients who were clinically diagnosed with vaginitis were analysed in our laboratory. Colonies that appeared on CHROMagar Candida media after 48 hours of incubation at 35°C were evaluated for their colors and characteristics. Candida strains were identified by germ tube test, growth on corn meal Tween 80 agar and, when necessary, also by the API 20 C AUX commercial kit. A total of 84 Candida strains were isolated from 80 patients. Two different Candida species were isolated from four (5%) of the samples. Among the Candida strains isolated, 45 (53.6%) were C. albicans, 29 (34.5%) C. glabrata, 7 (8.3%) C. krusei, and 3 (3.6%) C. kefyr. All of the C. albicans and six of the seven C. krusei isolates were identified correctly by CHROMagar Candida medium. These results showed that C. albicans is still the species most frequently isolated from vaginal samples. It was concluded that CHROMagar Candida medium is useful for identifying colonies of frequently seen Candida species and also for differentiating multiple Candida species grown on the same culture.
Determining Epigenetic Targets: A Beginner's Guide to Identifying Genome Functionality Through Database Analysis.

PubMed

Hay, Elizabeth A; Cowie, Philip; MacKenzie, Alasdair

2017-01-01

There can now be little doubt that the cis-regulatory genome represents the largest information source within the human genome essential for health. In addition to containing up to five times more information than the coding genome, the cis-regulatory genome also acts as a major reservoir of disease-associated polymorphic variation. The cis-regulatory genome, which is comprised of enhancers, silencers, promoters, and insulators, also acts as a major functional target for epigenetic modification including DNA methylation and chromatin modifications. These epigenetic modifications impact the ability of cis-regulatory sequences to maintain tissue-specific and inducible expression of genes that preserve health. There has been limited ability to identify and characterize the functional components of this huge and largely misunderstood part of the human genome that, for decades, was ignored as "Junk" DNA. In an attempt to address this deficit, the current chapter will first describe methods of identifying and characterizing functional elements of the cis-regulatory genome at a genome-wide level using databases such as ENCODE, the UCSC browser, and NCBI. We will then explore the databases on the UCSC genome browser, which provides access to DNA methylation and chromatin modification datasets. Finally, we will describe how we can superimpose the huge volume of study data contained in the NCBI archives onto that contained within the UCSC browser in order to glean relevant in vivo study data for any locus within the genome. An ability to access and utilize these information sources will become essential to informing the future design of experiments and subsequent determination of the role of epigenetics in health and disease and will form a critical step in our development of personalized medicine.
BRAD, the genetics and genomics database for Brassica plants.

PubMed

Cheng, Feng; Liu, Shengyi; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Li, Pingxia; Hua, Wei; Wang, Xiaowu

2011-10-13

Brassica species include both vegetable and oilseed crops, which are very important to daily human life. The Brassica species also represent an excellent system for studying numerous aspects of plant biology, specifically for the analysis of genome evolution following polyploidy, so they are also very important for scientific research. Now that the genome of Brassica rapa has been assembled, it is time for deep mining of the genome data. BRAD, the Brassica database, is a web-based resource focusing on genome scale genetic and genomic data for important Brassica crops. BRAD was built based on the first whole genome sequence and on further data analysis of the Brassica A genome species, Brassica rapa (Chiifu-401-42). It provides datasets such as the complete genome sequence of B. rapa, which was de novo assembled from Illumina GA II short reads and from BAC clone sequences; predicted genes and associated annotations; non-coding RNAs; transposable elements (TE); B. rapa genes orthologous to those in A. thaliana; and genetic markers and linkage maps. BRAD offers useful searching and data mining tools, including search across annotation datasets, search for syntenic or non-syntenic orthologs, and search of the flanking regions of a certain target, as well as BLAST and GBrowse tools. BRAD allows users to enter almost any kind of information, such as a B. rapa or A. thaliana gene ID, physical position or genetic marker. BRAD, a new database focusing on the genetics and genomics of the Brassica plants, has been developed; it aims to help scientists and breeders fully and efficiently use the genome data of Brassica plants. BRAD will be continuously updated and can be accessed through http://brassicadb.org.
DNApod: DNA polymorphism annotation database from next-generation sequence read archives

PubMed Central

Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

2017-01-01

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924
A RESTful application programming interface for the PubMLST molecular typing and genome databases

PubMed Central

Bray, James E.; Maiden, Martin C. J.

2017-01-01

Molecular typing is used to differentiate microorganisms at the subspecies or strain level for epidemiological investigations, infection control, public health and environmental sampling. DNA sequence-based typing methods require authoritative databases that link sequence variants to nomenclature in order to facilitate communication and comparison of identified types in national or global settings. The PubMLST website (https://pubmlst.org/) fulfils this role for over a hundred microorganisms for which it hosts curated molecular sequence typing data, providing sequence and allelic profile definitions for multi-locus sequence typing (MLST) and single-gene typing approaches. In recent years, these have expanded to cover the whole genome with schemes such as core genome MLST (cgMLST) and whole genome MLST (wgMLST), which catalogue the allelic diversity found in hundreds to thousands of genes. These approaches provide a common nomenclature for high-resolution strain characterization and comparison. Molecular typing information is linked to isolate provenance, phenotype, and increasingly genome assemblies, providing a resource for outbreak investigation and research into population structure, gene association, global epidemiology and vaccine coverage. A Representational State Transfer (REST) Application Programming Interface (API) has been developed for the PubMLST website to make these large quantities of structured molecular typing and whole genome sequence data available for programmatic access by any third party application. The API is an integral component of the Bacterial Isolate Genome Sequence Database (BIGSdb) platform that is used to host PubMLST resources, and exposes all public data within the site. In addition to data browsing, searching and download, the API supports authentication and submission of new data to curator queues. Database URL: http://rest.pubmlst.org/ PMID:29220452
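As a rough illustration of what programmatic access to such a REST API can look like, the Python sketch below walks a JSON listing of the kind an API root might return and collects the databases it advertises. The response layout assumed here (a list of resource groups, each holding a `databases` list whose entries carry `name` and `href`), the helper names `fetch_json` and `list_databases`, and the sample data are all assumptions made for illustration; none of them come from the abstract, so the API documentation should be checked before relying on them.

```python
import json
from urllib.request import urlopen

# Database URL as given in the abstract.
API_ROOT = "http://rest.pubmlst.org/"


def fetch_json(url, timeout=30):
    """Fetch a URL and decode its JSON body (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def list_databases(root_listing):
    """Flatten the ASSUMED root listing into (name, href) pairs."""
    pairs = []
    for group in root_listing:
        for db in group.get("databases", []):
            pairs.append((db.get("name"), db.get("href")))
    return pairs


# Hypothetical sample shaped like the assumed response; not real API output.
SAMPLE_ROOT = [
    {
        "name": "example_organism",
        "databases": [
            {"name": "example_isolates", "href": API_ROOT + "db/example_isolates"},
            {"name": "example_seqdef", "href": API_ROOT + "db/example_seqdef"},
        ],
    }
]

if __name__ == "__main__":
    # Swap SAMPLE_ROOT for fetch_json(API_ROOT) to query the live service.
    for name, href in list_databases(SAMPLE_ROOT):
        print(name, href)
```

Keeping the network call in `fetch_json` separate from the parsing in `list_databases` means the parsing logic can be exercised offline against a saved response.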
Detection of genomic rearrangements in cucumber using genomecmp software

NASA Astrophysics Data System (ADS)

Kulawik, Maciej; Pawełkowicz, Magdalena Ewa; Wojcieszek, Michał; Pląder, Wojciech; Nowak, Robert M.

2017-08-01

Comparative genomics is a rapidly evolving science, driven by the increasing amount of genome sequence information available in databases. A simple comparison of general genome features such as genome size, number of genes, and chromosome number presents an entry point into comparative genomic analysis. Here we present the utility of the new tool genomecmp for finding rearrangements across the compared sequences and its applications in plant comparative genomics.
Detecting Infections Rapidly and Easily for Candidemia Trial, Part 2 (DIRECT2): A Prospective, Multicenter Study of the T2Candida Panel.

PubMed

Clancy, Cornelius J; Pappas, Peter G; Vazquez, Jose; Judson, Marc A; Kontoyiannis, Dimitrios P; Thompson, George R; Garey, Kevin W; Reboli, Annette; Greenberg, Richard N; Apewokin, Senu; Lyon, G Marshall; Ostrosky-Zeichner, Luis; Wu, Alan H B; Tobin, Ellis; Nguyen, M Hong; Caliendo, Angela M

2018-02-09

Blood cultures are approximately 50% sensitive for diagnosing invasive candidiasis. The T2Candida nanodiagnostic panel uses T2 magnetic resonance and a dedicated instrument to detect Candida directly within whole blood samples. Patients with Candida albicans, Candida glabrata, Candida parapsilosis, Candida tropicalis, or Candida krusei candidemia were identified at 14 centers using diagnostic blood cultures (dBCs). Follow-up blood samples were collected concurrently for testing by T2Candida and companion cultures (cBCs). T2Candida results are reported qualitatively for C. albicans/C. tropicalis, C. glabrata/C. krusei, and C. parapsilosis. T2Candida and cBCs were positive if they detected a species present in the dBC. Median time between collection of dBC and T2Candida/cBC samples in 152 patients was 55.5 hours (range, 16.4-148.4). T2Candida and cBCs were positive in 45% (69/152) and 24% (36/152) of patients, respectively (P < .0001). T2Candida clinical sensitivity was 89%, as positive results were obtained in 32/36 patients with positive cBCs. Combined test results were both positive (T2+/cBC+), 21% (32/152); T2+/cBC-, 24% (37/152); T2-/cBC+, 3% (4/152); and T2-/cBC-, 52% (79/152). Prior antifungal therapy, neutropenia, and C. albicans candidemia were independently associated with T2Candida positivity and T2+/cBC- results (P values < .05). T2Candida was sensitive for diagnosing candidemia at the time of positive blood cultures. In patients receiving antifungal therapy, T2Candida identified bloodstream infections that were missed by cBCs. T2Candida may improve care by shortening times to Candida detection and species identification compared to blood cultures, retaining sensitivity during antifungal therapy and rendering active candidemia unlikely if results are negative. NCT01525095. © The Author(s) 2018. Published by Oxford University Press for the Infectious Diseases Society of America.
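The percentages quoted in this abstract all follow from the four paired-result counts. The short Python sketch below recomputes them from those counts; the variable and function names are illustrative choices, and rounding to whole percentages is assumed to match the way the abstract reports them.

```python
# Paired-result counts reported in the abstract (152 patients in total).
counts = {
    ("T2+", "cBC+"): 32,
    ("T2+", "cBC-"): 37,
    ("T2-", "cBC+"): 4,
    ("T2-", "cBC-"): 79,
}


def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


total = sum(counts.values())
t2_positive = counts[("T2+", "cBC+")] + counts[("T2+", "cBC-")]
cbc_positive = counts[("T2+", "cBC+")] + counts[("T2-", "cBC+")]

# Share of patients positive by each test.
t2_positive_pct = percent(t2_positive, total)
cbc_positive_pct = percent(cbc_positive, total)

# Sensitivity of T2Candida relative to companion cultures:
# T2-positive patients among those with a positive cBC.
sensitivity_pct = percent(counts[("T2+", "cBC+")], cbc_positive)

# Share of patients in each cell of the 2x2 table.
cell_pct = {cell: percent(n, total) for cell, n in counts.items()}

if __name__ == "__main__":
    print(total, t2_positive_pct, cbc_positive_pct, sensitivity_pct)
    print(cell_pct)
```

Note that the 89% figure uses companion-culture positives (36 patients) as the denominator, not all 152 patients.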
Candida in acute pancreatitis.

PubMed

Chakrabarti, Arunaloke; Rao, Pooja; Tarai, Bansidhar; Shivaprakash, Mandya Rudramurthy; Wig, Jaidev

2007-01-01

A Candida infection of the pancreas, which previously was considered extremely unusual, has been increasingly reported in recent years. The present study was conducted with the aim of performing a cohort analysis of our patients with acute pancreatitis to find out the incidence, sites, and species of Candida involvement; and to evaluate the risk factors, severity, and course of illness of such patients. A total of 335 patients with acute pancreatitis were investigated for a possible Candida infection of the pancreas from January 2000 to May 2003. The clinical records of all those patients who were positive for Candida spp. isolation from pancreatic tissue were analyzed. The clinical records of 32 more cases, randomly selected from the patients who were investigated for candidal pancreatitis but were negative for Candida spp., were also analyzed in order to compare their findings with those patients with a true Candida infection of the pancreas. A true or possible Candida infection was observed in 41 (12.2%) of those 335 patients and Candida tropicalis was the most common isolate (43.9%). Candida spp. were isolated from pancreatic necrotic tissue in 22 (6.6%) patients (true infection). A possible Candida infection (positive drain fluid effluents at least twice, without any Candida isolation from pre/per operative samples from pancreas) was seen in 19 (5.7%) patients. Candida was also isolated exclusively from the blood in another 19 patients with a clinical diagnosis of acute pancreatitis. A risk factor analysis showed that patients with severe injury to the pancreas, on prophylactic fluconazole, and after surgical intervention were significantly more prone to develop a Candida infection. Patients with a Candida superinfection also had a significantly increased hospital stay and higher mortality. This study thus emphasizes the important role of Candida infection in patients with acute pancreatitis and demonstrates the need for early attention.
In silico analysis of cacao (Theobroma cacao L.) genes that involved in pathogen and disease responses

NASA Astrophysics Data System (ADS)

Agung, Muhammad Budi; Budiarsa, I. Made; Suwastika, I. Nengah

2017-02-01

Cocoa beans are one of the main commodities Indonesia supplies to the world market, but production still suffers from yield losses caused by pathogens and disease. Developing robust cacao plants that are genetically resistant to pathogen and disease attack is an ideal solution to this problem. The aim of this study was to identify, through in silico analysis, Theobroma cacao genes in the cacao genome database that are homologous to pathogen- and disease-response genes in other plants. A basic information survey and gene identification were performed in GenBank and The Arabidopsis Information Resource database. The in silico analysis comprised protein BLAST, a homology test of each gene's protein candidates, and identification of homologous genes in the Cacao Genome Database using the "Theobroma cacao cv. Matina 1-6 v1.1" genome as the data source. The analysis identified Thecc1EG011959t1 (EDS1), Thecc1EG006803t1 (EDS5), Thecc1EG013842t1 (ICS1), and Thecc1EG015614t1 (BG_PPAP) in the Cacao Genome Database as Theobroma cacao genes homologous to plant resistance genes; they are highly likely to have functions similar to those of their homologues.
Candida baotianmanensis sp. nov. and Candida pseudoviswanathii sp. nov., two ascosporic yeast species isolated from the gut of beetles.

PubMed

Ren, Yong-Cheng; Xu, Long-Long; Zhang, Lin; Hui, Feng-Li

2015-10-01

Four yeast strains were isolated from the gut of beetles collected on Baotianman Mountain and People's Park of Nanyang in Henan Province, China. These strains produced unconjugated asci with one or two ellipsoidal to elongate ascospores in a persistent ascus. Phylogenetic analysis of the D1/D2 domains of the LSU rRNA gene sequences indicated that the isolates represent two novel sexual species in the Candida/Lodderomyces clade. Candida baotianmanensis sp. nov. was located in a statistically well-supported branch together with Candida maltosa. Candida pseudoviswanathii sp. nov. formed a subclade with its closest relative Candida viswanathii supported by a strong bootstrap value. The two novel species were distinguished from their most closely related described species, Candida maltosa and Candida viswanathii, in the D1/D2 LSU rRNA gene and internal transcribed spacer (ITS) sequences and in phenotypic traits. The type strain of Candida baotianmanensis sp. nov. is NYNU 14719T ( = CBS 13915T = CICC 33052T), and the type strain of Candida pseudoviswanathii sp. nov. is NYNU 14772T ( = CBS 13916T = CICC 33053T). The MycoBank numbers for Candida baotianmanensis sp. nov. and Candida pseudoviswanathii sp. nov. are MB 812621 and MB 812622.
Candida asparagi sp. nov., Candida diospyri sp. nov. and Candida qinlingensis sp. nov., novel anamorphic, ascomycetous yeast species.

PubMed

Lu, Hui-Zhong; Jia, Jian-Hua; Wang, Qi-Ming; Bai, Feng-Yan

2004-07-01

Among ascomycetous yeasts that were isolated from several nature reserve areas in China, three anamorphic strains isolated from soil (QL 5-5T) and fruit (QL 21-2T and SN 15-1T) were revealed, by conventional characterization and molecular phylogenetic analysis based on internal transcribed spacer and large subunit (26S) rRNA gene D1/D2 region sequencing, to represent three novel species in the genus Candida. Candida qinlingensis sp. nov. (type strain, QL 5-5T=AS 2.2524T=CBS 9768T) was related closely to a teleomorphic species, Williopsis pratensis. The close relatives of Candida diospyri sp. nov. (type strain, QL 21-2T=AS 2.2525T=CBS 9769T) are Candida friedrichii and Candida membranifaciens. Candida asparagi sp. nov. (type strain, SN 15-1T=AS 2.2526T=CBS 9770T) forms a clade with Candida fructus.
ActiveDriverDB: human disease mutations and genome variation in post-translational modification sites of proteins

PubMed Central

Krassowski, Michal; Paczkowska, Marta; Cullion, Kim; Huang, Tina; Dzneladze, Irakli; Ouellette, B F Francis; Yamada, Joseph T; Fradet-Turcotte, Amelie

2018-01-01

Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations. Millions of single nucleotide variants (SNVs) in human genomes are known and thousands are associated with disease. An estimated 21% of disease-associated amino acid substitutions corresponding to missense SNVs are located in protein sites of post-translational modifications (PTMs), chemical modifications of amino acids that extend protein function. ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs. We integrated >385,000 published PTM sites with ∼3.6 million substitutions from The Cancer Genome Atlas (TCGA), the ClinVar database of disease genes, and human genome sequencing projects. The database includes site-specific interaction networks of proteins, upstream enzymes such as kinases, and drugs targeting these enzymes. We also predicted network-rewiring impact of mutations by analyzing gains and losses of kinase-bound sequence motifs. ActiveDriverDB provides detailed visualization, filtering, browsing and searching options for studying PTM-associated mutations. Users can upload mutation datasets interactively and use our application programming interface in pipelines. Integrative analysis of mutations and PTMs may help decipher molecular mechanisms of phenotypes and disease, as exemplified by case studies of TP53, BRCA2 and VHL. The open-source database is available at https://www.ActiveDriverDB.org. PMID:29126202

PGG.Population: a database for understanding the genomic diversity and genetic ancestry of human populations

PubMed Central

Zhang, Chao; Gao, Yang; Liu, Jiaojiao; Xue, Zhe; Lu, Yan; Deng, Lian; Tian, Lei; Feng, Qidi

2018-01-01

Owing to recent advances in next-generation sequencing technologies, a growing number of studies focus on delineating genetic variations that are associated with complex human traits and diseases. However, identifying and prioritizing disease-associated causal variants relies on understanding the distribution of genetic variations within and among populations. The PGG.Population database documents 7122 genomes representing 356 global populations from 107 countries and provides essential information for researchers to understand human genomic diversity and genetic ancestry. These data and information can facilitate the design of research studies and the interpretation of results of both evolutionary and medical studies involving human populations. The database is carefully maintained and updated whenever new data become available. We included miscellaneous functions and a user-friendly graphical interface for visualization of genomic diversity, population relationships (genetic affinity), ancestral makeup, footprints of natural selection, and population history. Moreover, PGG.Population provides a useful feature for users to analyze data and visualize results dynamically via online illustration. The long-term ambition of PGG.Population, together with the joint efforts of other researchers who contribute their data to our database, is to create a comprehensive repository of geographic and ethnic variation in the human genome, as well as a platform that will influence future medical practitioners and clinical investigators. PGG.Population is available at https://www.pggpopulation.org. PMID:29112749
HPMCD: the database of human microbial communities from metagenomic datasets and microbial reference genomes.

PubMed

Forster, Samuel C; Browne, Hilary P; Kumar, Nitin; Hunt, Martin; Denise, Hubert; Mitchell, Alex; Finn, Robert D; Lawley, Trevor D

2016-01-04

The Human Pan-Microbe Communities (HPMC) database (http://www.hpmcd.org/) provides a manually curated, searchable, metagenomic resource to facilitate investigation of human gastrointestinal microbiota. Over the past decade, the application of metagenome sequencing to elucidate the microbial composition and functional capacity present in the human microbiome has revolutionized many concepts in our basic biology. When sufficient high quality reference genomes are available, whole genome metagenomic sequencing can provide direct biological insights and high-resolution classification. The HPMC database provides species level, standardized phylogenetic classification of over 1800 human gastrointestinal metagenomic samples. This is achieved by combining a manually curated list of bacterial genomes from human faecal samples with over 21000 additional reference genomes representing bacteria, viruses, archaea and fungi with manually curated species classification and enhanced sample metadata annotation. A user-friendly, web-based interface provides the ability to search for (i) microbial groups associated with health or disease state, (ii) health or disease states and community structure associated with a microbial group, (iii) the enrichment of a microbial gene or sequence and (iv) enrichment of a functional annotation. The HPMC database enables detailed analysis of human microbial communities and supports research from basic microbiology and immunology to therapeutic development in human health and disease. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects

PubMed Central

Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

2014-01-01

CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234
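The kind of cross-sample filtering described above, for example finding variants shared by all affected family members and absent from unaffected ones, can be sketched with a small local database. The example below uses Python with SQLite purely for illustration; it is not CanvasDB's own R interface or schema, and the table name `variant_calls`, the function `shared_candidates`, and the toy variant calls are all hypothetical.

```python
import sqlite3

# Toy in-memory database; the schema is an assumption, not CanvasDB's.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variant_calls "
    "(sample TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT)"
)
calls = [  # made-up calls for three family members
    ("child", "1", 1000, "A", "G"),
    ("child", "2", 2000, "C", "T"),
    ("sibling", "1", 1000, "A", "G"),
    ("sibling", "3", 3000, "G", "A"),
    ("mother", "2", 2000, "C", "T"),
]
conn.executemany("INSERT INTO variant_calls VALUES (?, ?, ?, ?, ?)", calls)


def shared_candidates(conn, affected, unaffected):
    """Variants called in every affected sample and in no unaffected sample."""
    in_marks = ",".join("?" * len(affected))
    out_marks = ",".join("?" * len(unaffected))
    query = f"""
        SELECT chrom, pos, ref, alt
        FROM variant_calls
        GROUP BY chrom, pos, ref, alt
        HAVING COUNT(DISTINCT CASE WHEN sample IN ({in_marks})
                                   THEN sample END) = ?
           AND COUNT(DISTINCT CASE WHEN sample IN ({out_marks})
                                   THEN sample END) = 0
        ORDER BY chrom, pos
    """
    # Parameters follow the order of the placeholders in the query.
    params = [*affected, len(affected), *unaffected]
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    print(shared_candidates(conn, ["child", "sibling"], ["mother"]))
```

Grouping by variant and counting distinct samples lets a single query consider the calls from all samples at once, which is the idea behind the built-in filtering function the abstract describes.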
The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

PubMed Central

Caspi, Ron; Altman, Tomer; Dale, Joseph M.; Dreher, Kate; Fulcher, Carol A.; Gilham, Fred; Kaipa, Pallavi; Karthikeyan, Athikkattuvalasu S.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Paley, Suzanne; Popescu, Liviu; Pujar, Anuradha; Shearer, Alexander G.; Zhang, Peifen; Karp, Peter D.

2010-01-01

The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. With more than 1400 pathways, MetaCyc is the largest collection of metabolic pathways currently available. Pathway reactions are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes, and literature citations. BioCyc (BioCyc.org) is a collection of more than 500 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs also contain additional features, such as predicted operons, transport systems, and pathway hole-fillers. The BioCyc Web site offers several tools for the analysis of the PGDBs, including Omics Viewers that enable visualization of omics datasets on two different genome-scale diagrams and tools for comparative analysis. The BioCyc PGDBs generated by SRI are offered for adoption by any party interested in curation of metabolic, regulatory, and genome-related information about an organism. PMID:19850718
Molecular Genetics Information System (MOLGENIS): alternatives in developing local experimental genomics databases.

PubMed

Swertz, Morris A; De Brock, E O; Van Hijum, Sacha A F T; De Jong, Anne; Buist, Girbe; Baerends, Richard J S; Kok, Jan; Kuipers, Oscar P; Jansen, Ritsert C

2004-09-01

Genomic research laboratories need adequate infrastructure to support management of their data production and research workflow. But what makes infrastructure adequate? A lack of appropriate criteria makes any decision on buying or developing a system difficult. Here, we report on the decision process for the case of a molecular genetics group establishing a microarray laboratory. Five typical requirements for experimental genomics database systems were identified: (i) evolution ability to keep up with the fast developing genomics field; (ii) a suitable data model to deal with local diversity; (iii) suitable storage of data files in the system; (iv) easy exchange with other software; and (v) low maintenance costs. The computer scientists and the researchers of the local microarray laboratory considered alternative solutions for these five requirements and chose the following options: (i) use of automatic code generation; (ii) a customized data model based on standards; (iii) storage of datasets as black boxes instead of decomposing them in database tables; (iv) loosely linking to other programs for improved flexibility; and (v) a low-maintenance web-based user interface. Our team evaluated existing microarray databases and then decided to build a new system, Molecular Genetics Information System (MOLGENIS), implemented using code generation in a period of three months. This case can provide valuable insights and lessons to both software developers and a user community embarking on large-scale genomic projects. http://www.molgenis.nl
CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects.

PubMed

Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

2014-01-01

CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB. © The Author(s) 2014. Published by Oxford University Press.
Relational databases: a transparent framework for encouraging biology students to think informatically.

PubMed

Rice, Michael; Gladstone, William; Weir, Michael

2004-01-01

We discuss how relational databases constitute an ideal framework for representing and analyzing large-scale genomic data sets in biology. As a case study, we describe a Drosophila splice-site database that we recently developed at Wesleyan University for use in research and teaching. The database stores data about splice sites computed by a custom algorithm using Drosophila cDNA transcripts and genomic DNA and supports a set of procedures for analyzing splice-site sequence space. A generic Web interface permits the execution of the procedures with a variety of parameter settings and also supports custom structured query language queries. Moreover, new analytical procedures can be added by updating special metatables in the database without altering the Web interface. The database provides a powerful setting for students to develop informatic thinking skills. PMID:15592597
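To give a flavour of the custom structured query language queries mentioned above, the Python sketch below builds a toy splice-site table in SQLite and counts the dinucleotide motifs seen at each type of site. The table layout, the column names, the function `motif_counts`, and the rows are all invented for illustration; the abstract does not describe the actual schema of the Wesleyan database.

```python
import sqlite3

# Toy in-memory table; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE splice_sites (transcript TEXT, site_type TEXT, motif TEXT)"
)
rows = [
    ("tx1", "donor", "GT"),
    ("tx1", "acceptor", "AG"),
    ("tx2", "donor", "GT"),
    ("tx2", "acceptor", "AG"),
    ("tx3", "donor", "GC"),
    ("tx3", "acceptor", "AG"),
]
conn.executemany("INSERT INTO splice_sites VALUES (?, ?, ?)", rows)


def motif_counts(conn, site_type):
    """Count how often each motif occurs at the given type of splice site."""
    query = """
        SELECT motif, COUNT(*) AS n
        FROM splice_sites
        WHERE site_type = ?
        GROUP BY motif
        ORDER BY n DESC, motif
    """
    return conn.execute(query, (site_type,)).fetchall()


if __name__ == "__main__":
    print(motif_counts(conn, "donor"))
    print(motif_counts(conn, "acceptor"))
```

A student can change the `WHERE` and `GROUP BY` clauses to ask a different question of the same table, which is the sort of exploration a relational framework makes transparent.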
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.

PubMed

Tatusova, Tatiana

2016-01-01

The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. Entrez, a text-based search and retrieval system, provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
Characterization of Candida species isolated from cases of lower respiratory tract infection.

PubMed

Jha, B J; Dey, S; Tamang, M D; Joshy, M E; Shivananda, P G; Brahmadatan, K N

2006-01-01

The aims of this study were (1) to identify and characterize the Candida species isolated from lower respiratory tract infections and (2) to determine the rate of isolation of Candida species from sputum samples. This study was carried out in the Department of Microbiology, Manipal Teaching Hospital, Pokhara, Nepal from June 2002 to January 2003. A total of 462 sputum samples were collected from patients with suspected lower respiratory tract infection. The samples were Gram stained to assess the suitability of the specimen and cultured on Sabouraud's Dextrose Agar (SDA), blood agar and chocolate agar to identify potential lower respiratory tract pathogens. For the identification of Candida, sputum samples were processed by Gram stain, culture, germ tube test, chlamydospore production, and sugar fermentation and assimilation tests. For the identification of bacteria, Gram stain, culture, and biochemical tests were performed by standardized procedures. Out of 462 samples, 246 (53.24%) grew potential pathogens of the lower respiratory tract. Among them, Haemophilus influenzae, 61 (24.79%), and Streptococcus pneumoniae, 57 (23.17%), were the predominant bacterial pathogens. Candida species were isolated from 30 samples (12.2%). The majority of the Candida isolates were Candida albicans, 21 (70%), followed by Candida tropicalis, 4 (13.33%), Candida krusei, 3 (10%), Candida parapsilosis, 1 (3.33%), and Candida stellatoidea, 1 (3.33%). The highest rate of isolation of Candida was in patients between 71 and 80 years of age. Candida isolation from sputum samples is important, as shown in the present study, in which Candida species were the third most common pathogen isolated from patients with lower respiratory tract infection.
Candida species diversity and antifungal susceptibility patterns in oral samples of HIV/AIDS patients in Baja California, Mexico.

PubMed

Clark-Ordóñez, Isadora; Callejas-Negrete, Olga A; Aréchiga-Carvajal, Elva T; Mouriño-Pérez, Rosa R

2017-04-01

Candidiasis is the most common opportunistic fungal infection in HIV patients. The aims of this study were to identify the prevalence of carriers of Candida, Candida species diversity, and in vitro susceptibility to antifungal drugs. In 297 HIV/AIDS patients in Baja California, Mexico, Candida strains were identified by molecular methods (PCR-RFLP) from isolates of oral rinses of patients in Tijuana, Mexicali, and Ensenada. 56.3% of patients were colonized or infected with Candida. In Tijuana, there was a significantly higher percentage of carriers (75.5%). Out of the 181 strains that were isolated, 71.8% were Candida albicans and 28.2% were non-albicans species. The most common non-albicans species was Candida tropicalis (12.2%), followed by Candida glabrata (8.3%), Candida parapsilosis (2.2%), Candida krusei (1.7%), and Candida guilliermondii (1.1%). Candida dubliniensis was not isolated. Two associated species were found in 11 patients. In Mexicali and Ensenada, there was a lower proportion of Candida carriers compared to other regions in Mexico and worldwide; in Tijuana, however, a border town with many peculiarities, a higher carrier rate was found. In this population, only a high viral load was associated with oral Candida carriers. Other factors such as gender, use of antiretroviral therapy, CD4+ T-lymphocyte levels, time since diagnosis, and alcohol/tobacco consumption were not associated with Candida carriers. © The Author 2016. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
BeetleBase in 2010: Revisions to Provide Comprehensive Genomic Information for Tribolium castaneum

USDA-ARS's Scientific Manuscript database

BeetleBase (http://www.beetlebase.org) has been updated to provide more comprehensive genomic information for the red flour beetle Tribolium castaneum. The database contains genomic sequence scaffolds mapped to 10 linkage groups (genome assembly release Tcas_3.0), genetic linkage maps, the official ...
Yeast microbiota of natural cavities of manatees (Trichechus inunguis and Trichechus manatus) in Brazil and its relevance for animal health and management in captivity.

PubMed

Sidrim, José Júlio Costa; Carvalho, Vitor Luz; Castelo-Branco, Débora de Souza Collares Maia; Brilhante, Raimunda Sâmia Nogueira; Bandeira, Tereza de Jesus Pinheiro Gomes; Cordeiro, Rossana de Aguiar; Guedes, Gláucia Morgana de Melo; Barbosa, Giovanna Riello; Lazzarini, Stella Maris; Oliveira, Daniella Carvalho Ribeiro; de Meirelles, Ana Carolina Oliveira; Attademo, Fernanda Löffler Niemeyer; Freire, Augusto Carlos da Bôaviagem; Moreira, José Luciano Bezerra; Monteiro, André Jalles; Rocha, Marcos Fábio Gadelha

2015-10-01

The aim of this study was to characterize the yeast microbiota of natural cavities of manatees kept in captivity in Brazil. Sterile swabs from the oral cavity, nostrils, genital opening, and rectum of 50 Trichechus inunguis and 26 Trichechus manatus were collected. The samples were plated on Sabouraud agar with chloramphenicol and incubated at 25 °C for 5 days. The yeasts isolated were phenotypically identified by biochemical and micromorphological tests. Overall, 141 strains were isolated, of which 112 were from T. inunguis (Candida albicans, Candida parapsilosis sensu stricto, Candida orthopsilosis, Candida metapsilosis, Candida guilliermondii, Candida pelliculosa, Candida tropicalis, Candida glabrata, Candida famata, Candida krusei, Candida norvegensis, Candida ciferri, Trichosporon sp., Rhodotorula sp., Cryptococcus laurentii) and 29 were from T. manatus (C. albicans, C. tropicalis, C. famata, C. guilliermondii, C. krusei, Rhodotorula sp., Rhodotorula mucilaginosa, Rhodotorula minuta, Trichosporon sp.). This was the first systematic study to investigate the importance of yeasts as components of the microbiota of sirenians, demonstrating the presence of potentially pathogenic species, which highlights the importance of maintaining adequate artificial conditions for the health of captive manatees.
NGSmethDB 2017: enhanced methylomes and differential methylation.

PubMed

Lebrón, Ricardo; Gómez-Martín, Cristina; Carpena, Pedro; Bernaola-Galván, Pedro; Barturen, Guillermo; Hackenberg, Michael; Oliver, José L

2017-01-04

The 2017 update of NGSmethDB stores whole genome methylomes generated from short-read data sets obtained by bisulfite sequencing (WGBS) technology. To generate high-quality methylomes, stringent quality controls were integrated with third-party software, adding also a two-step mapping process to exploit the advantages of the new genome assembly models. The samples were all profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data-types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single-cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating sample comparative analyses. Besides conventional database dumps, track hubs were implemented, which improved database access, visualization in genome browsers and comparative analyses to third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user's desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ensembl core software resources: storage and programmatic access for DNA sequence and genome annotation.

PubMed

Ruffier, Magali; Kähäri, Andreas; Komorowska, Monika; Keenan, Stephen; Laird, Matthew; Longden, Ian; Proctor, Glenn; Searle, Steve; Staines, Daniel; Taylor, Kieron; Vullo, Alessandro; Yates, Andrew; Zerbino, Daniel; Flicek, Paul

2017-01-01

The Ensembl software resources are a stable infrastructure to store, access and manipulate genome assemblies and their functional annotations. The Ensembl 'Core' database and Application Programming Interface (API) was our first major piece of software infrastructure and remains at the centre of all of our genome resources. Since its initial design more than fifteen years ago, the number of publicly available genomic, transcriptomic and proteomic datasets has grown enormously, accelerated by continuous advances in DNA-sequencing technology. Initially intended to provide annotation for the reference human genome, we have extended our framework to support the genomes of all species as well as richer assembly models. Cross-referenced links to other informatics resources facilitate searching our database with a variety of popular identifiers such as UniProt and RefSeq. Our comprehensive and robust framework storing a large diversity of genome annotations in one location serves as a platform for other groups to generate and maintain their own tailored annotation. We welcome reuse and contributions: our databases and APIs are publicly available, all of our source code is released with a permissive Apache v2.0 licence at http://github.com/Ensembl and we have an active developer mailing list ( http://www.ensembl.org/info/about/contact/index.html ). http://www.ensembl.org. © The Author(s) 2017. Published by Oxford University Press.
Candida/Candida biofilms. First description of dual-species Candida albicans/C. rugosa biofilm.

PubMed

Martins, Carlos Henrique Gomes; Pires, Regina Helena; Cunha, Aline Oliveira; Pereira, Cristiane Aparecida Martins; Singulani, Junya de Lacorte; Abrão, Fariza; Moraes, Thais de; Mendes-Giannini, Maria José Soares

2016-04-01

Denture liners have physical properties that favour plaque accumulation and colonization by Candida species, irritating oral tissues and causing denture stomatitis. To isolate and determine the incidence of oral Candida species in dental prostheses, oral swabs were collected from the dental prostheses of 66 patients. All the strains were screened for their ability to form biofilms; both monospecies and dual-species combinations were tested. Candida albicans (63 %) was the most frequently isolated microorganism; Candida tropicalis (14 %), Candida glabrata (13 %), Candida rugosa (5 %), Candida parapsilosis (3 %), and Candida krusei (2 %) were also detected. The XTT assay showed that C. albicans SC5314 possessed a biofilm-forming ability significantly higher (p < 0.001) than non-albicans Candida strains after 6 h at 37 °C. The total C. albicans CFU from a dual-species biofilm was less than the total CFU of a monospecies C. albicans biofilm. In contrast to the profuse hyphae verified in monospecies C. albicans biofilms, micrographs showed that the C. albicans/non-albicans Candida biofilms consisted of sparse yeast forms and profuse budding yeast cells that generated a network. These results suggested that C. albicans and the tested Candida species could co-exist in biofilms displaying apparent antagonism. The study provides the first description of C. albicans/C. rugosa mixed biofilm. Copyright © 2016 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Atypical Squamous Cells in Liquid-Based Cervical Cytology: Microbiology, Inflammatory Infiltrate, and Human Papillomavirus-DNA Testing.

PubMed

Gomes de Oliveira, Geilson; Eleutério, Renata Mirian Nunes; Silveira Gonçalves, Ana Katherine; Giraldo, Paulo César; Eleutério, José

2018-01-01

The aim of this study was to assess the correlation between atypical squamous cells (ASC) and inflammatory infiltrate and vaginal microbiota using cervical liquid-based cytological (SurePath®) and high-risk human papillomavirus (HR-HPV) tests. A cross-sectional study was conducted using a 6-year database from a laboratory in Fortaleza (Brazil). Files from 1,346 ASC cases were divided into subgroups and results concerning inflammation and vaginal microorganisms diagnosed by cytology were compared with HR-HPV test results. An absence of specific microorganisms (ASM) was the most frequent finding (ASC of undetermined significance, ASC-US = 74%; ASC - cannot exclude high-grade squamous intraepithelial lesion, ASC-H = 68%), followed by bacterial vaginosis (ASC-US = 20%; ASC-H = 25%) and Candida spp. (ASC-US = 6%; ASC-H = 5%). Leukocyte infiltrate was present in 71% of ASC-US and 85% of ASC-H (p = 0.0040), and in these specific cases HR-HPV tests were positive for 65 and 64%, respectively. A positive HR-HPV test was relatively more frequent when a specific microorganism was present, and Candida spp. was associated with HR-HPV-positive results (p = 0.0156), while an ASM was associated with negative HR-HPV results (p = 0.0370). ASC-US is associated with an absence of inflammation or vaginosis, while ASC-H smears are associated with Trichomonas vaginalis and inflammatory infiltrate. A positive HR-HPV is associated with Candida spp. in ASC cytology. © 2017 S. Karger AG, Basel.
Oral tissues and orthodontic treatment: common side effects.

PubMed

Farronato, G; Giannini, L; Galbiati, G; Cannalire, P; Martinelli, G; Tubertini, I; Maspero, C

2013-01-01

The aim of this paper was to provide a literature review of the problems that can occur during orthodontic treatment. Using the PubMed database, we collected items that provide information on the direct consequences of the placement of an orthodontic appliance, covering the following topics: candida infections, the effects on soft tissues, the effects on periodontal tissues and the effects on hard tissues. The presence of appliances in the oral cavity increases the prevalence of people with candida; Candida albicans is the most frequently isolated species. The balance between the clearance of the microorganism, the colonization and the state of candidiasis depends both on the virulence of the fungus and on the competence of the host immune system. On soft tissues, cases of ulceration of the upper jaw by a rapid palatal expander and of pyogenic granuloma due to a quad helix appliance have been reported. Pyogenic granuloma is mostly observed on the vestibular gingiva, whereas the ulceration was found in patients suffering from type 1 diabetes mellitus because of the tissue modifications induced by this pathological condition. The more severe periodontal effects are those caused by incorrect use of orthodontic elastic separators. Finally, White Spot Lesions are the direct consequence of incorrect conditioning of the enamel when attaching the bracket. They represent a first stage of caries in the positioning area of the bracket. The orthodontist must intercept these issues so that they do not compromise the success of the treatment.
EUCANEXT: an integrated database for the exploration of genomic and transcriptomic data from Eucalyptus species

PubMed Central

Nascimento, Leandro Costa; Salazar, Marcela Mendes; Lepikson-Neto, Jorge; Camargo, Eduardo Leal Oliveira; Parreiras, Lucas Salera; Carazzolle, Marcelo Falsarella

2017-01-01

Tree species of the genus Eucalyptus are the most valuable and widely planted hardwoods in the world. Given the economic importance of Eucalyptus trees, much effort has been made towards the generation of specimens with superior forestry properties that can deliver high-quality feedstocks, customized to the industry's needs for both cellulosic (paper) and lignocellulosic biomass production. In line with these efforts, large sets of molecular data have been generated by several scientific groups, providing invaluable information that can be applied in the development of improved specimens. In order to fully explore the potential of available datasets, the development of a public database that provides integrated access to genomic and transcriptomic data from Eucalyptus is needed. EUCANEXT is a database that analyses and integrates publicly available Eucalyptus molecular data, such as the E. grandis genome assembly and predicted genes, ESTs from several species and digital gene expression from 26 RNA-Seq libraries. The database has been implemented on a Fedora Linux machine running MySQL and Apache, while Perl CGI was used for the web interfaces. EUCANEXT provides a user-friendly web interface for easy access and analysis of publicly available molecular data from Eucalyptus species. This integrated database allows for complex searches by gene name, keyword or sequence similarity and is publicly accessible at http://www.lge.ibi.unicamp.br/eucalyptusdb. Through EUCANEXT, users can perform complex analyses to identify genes related to traits of interest using RNA-Seq libraries and tools for differential expression analysis. Moreover, the entire bioinformatics pipeline described here, including the database schema and Perl scripts, is readily available and can be applied to any genomic and transcriptomic project, regardless of the organism. Database URL: http://www.lge.ibi.unicamp.br/eucalyptusdb PMID:29220468
Creation of a Genome-Wide Metabolic Pathway Database for Populus trichocarpa Using a New Approach for Reconstruction and Curation of Metabolic Pathways for Plants

PubMed Central

Zhang, Peifen; Dreher, Kate; Karthikeyan, A.; Chi, Anjo; Pujar, Anuradha; Caspi, Ron; Karp, Peter; Kirkup, Vanessa; Latendresse, Mario; Lee, Cynthia; Mueller, Lukas A.; Muller, Robert; Rhee, Seung Yon

2010-01-01

Metabolic networks reconstructed from sequenced genomes or transcriptomes can help visualize and analyze large-scale experimental data, predict metabolic phenotypes, discover enzymes, engineer metabolic pathways, and study metabolic pathway evolution. We developed a general approach for reconstructing metabolic pathway complements of plant genomes. Two new reference databases were created and added to the core of the infrastructure: a comprehensive, all-plant reference pathway database, PlantCyc, and a reference enzyme sequence database, RESD, for annotating metabolic functions of protein sequences. PlantCyc (version 3.0) includes 714 metabolic pathways and 2,619 reactions from over 300 species. RESD (version 1.0) contains 14,187 literature-supported enzyme sequences from across all kingdoms. We used RESD, PlantCyc, and MetaCyc (an all-species reference metabolic pathway database), in conjunction with the pathway prediction software Pathway Tools, to reconstruct a metabolic pathway database, PoplarCyc, from the recently sequenced genome of Populus trichocarpa. PoplarCyc (version 1.0) contains 321 pathways with 1,807 assigned enzymes. Comparing PoplarCyc (version 1.0) with AraCyc (version 6.0, Arabidopsis [Arabidopsis thaliana]) showed comparable numbers of pathways distributed across all domains of metabolism in both databases, except for a higher number of AraCyc pathways in secondary metabolism and a 1.5-fold increase in carbohydrate metabolic enzymes in PoplarCyc. Here, we introduce these new resources and demonstrate the feasibility of using them to identify candidate enzymes for specific pathways and to analyze metabolite profiling data through concrete examples. These resources can be searched by text or BLAST, browsed, and downloaded from our project Web site (http://plantcyc.org). PMID:20522724

EU Laws on Privacy in Genomic Databases and Biobanking.

PubMed

Townend, David

2016-03-01

Both the European Union and the Council of Europe have a bearing on privacy in genomic databases and biobanking. In terms of legislation, the processing of personal data as it relates to the right to privacy is currently largely regulated in Europe by Directive 95/46/EC, which requires that processing be "fair and lawful" and follow a set of principles, meaning that the data be processed only for stated purposes, be sufficient for the purposes of the processing, be kept only for so long as is necessary to achieve those purposes, and be kept securely and only in an identifiable state for such time as is necessary for the processing. The European privacy regime does not require the de-identification (anonymization) of personal data used in genomic databases or biobanks, and alongside this practice informed consent as well as governance and oversight mechanisms provide for the protection of genomic data. © 2016 American Society of Law, Medicine & Ethics.
UCbase 2.0: ultraconserved sequences database (2014 update).

PubMed

Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

2014-01-01

UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it. © The Author(s) 2014. Published by Oxford University Press.
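The definition used above (stretches longer than 200 bases with 100% identity across human, mouse and rat) can be illustrated with a small scan over an alignment. This is a minimal sketch, not UCbase's method: it assumes the three sequences have already been aligned column by column (gaps written as `-`), and the function name `ultraconserved_runs` and the toy sequences are hypothetical.

```python
def ultraconserved_runs(aligned, min_len=201):
    """Return half-open (start, end) column ranges where every aligned
    sequence carries the same non-gap base for at least min_len columns.

    min_len defaults to 201 because the definition asks for more than
    200 bases. Comparison is case-sensitive.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None
    # Iterate one column past the end so a run touching the end is flushed.
    for i in range(length + 1):
        identical = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if identical:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment; a small threshold is used so the example stays readable.
human = "ACGTACGTAA"
mouse = "ACGTACCTAA"
rat = "ACGTACGTAA"
print(ultraconserved_runs([human, mouse, rat], min_len=4))
```

On real data the input would come from a whole-genome multiple alignment, and the default threshold of 201 columns would apply.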
Open Window: When Easily Identifiable Genomes and Traits Are in the Public Domain

PubMed Central

Angrist, Misha

2014-01-01

“One can't be of an enquiring and experimental nature, and still be very sensible.” - Charles Fort [1] As the costs of personal genetic testing “self-quantification” fall, publicly accessible databases housing people's genotypic and phenotypic information are gradually increasing in number and scope. The latest entrant is openSNP, which allows participants to upload their personal genetic/genomic and self-reported phenotypic data. I believe the emergence of such open repositories of human biological data is a natural reflection of inquisitive and digitally literate people's desires to make genomic and phenotypic information more easily available to a community beyond the research establishment. Such unfettered databases hold the promise of contributing mightily to science, science education and medicine. That said, in an age of increasingly widespread governmental and corporate surveillance, we would do well to be mindful that genomic DNA is uniquely identifying. Participants in open biological databases are engaged in a real-time experiment whose outcome is unknown. PMID:24647311
Mouse Genome Database: From sequence to phenotypes and disease models

PubMed Central

Richardson, Joel E.; Kadin, James A.; Smith, Cynthia L.; Blake, Judith A.; Bult, Carol J.

2015-01-01

Summary The Mouse Genome Database (MGD, www.informatics.jax.org) is the international scientific database for genetic, genomic, and biological data on the laboratory mouse to support the research requirements of the biomedical community. To accomplish this goal, MGD provides broad data coverage, serves as the authoritative standard for mouse nomenclature for genes, mutants, and strains, and curates and integrates many types of data from literature and electronic sources. Among the key data sets MGD supports are: the complete catalog of mouse genes and genome features, comparative homology data for mouse and vertebrate genes, the authoritative set of Gene Ontology (GO) annotations for mouse gene functions, a comprehensive catalog of mouse mutations and their phenotypes, and a curated compendium of mouse models of human diseases. Here, we describe the data acquisition process, specifics about MGD's key data areas, methods to access and query MGD data, and outreach and user help facilities. genesis 53:458–473, 2015. © 2015 The Authors. Genesis Published by Wiley Periodicals, Inc. PMID:26150326
Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

PubMed

Gerlt, John A

2017-08-22

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.
Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence–Function Space and Genome Context to Discover Novel Functions

PubMed Central

2017-01-01

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of “genomic enzymology” web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence–function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems. PMID:28826221
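The idea of a sequence similarity network mentioned above can be sketched generically: sequences are nodes, an edge joins two sequences whose pairwise similarity score meets a threshold, and the connected components of the resulting graph are the candidate clusters. The code below is an illustration under stated assumptions, not EFI-EST's implementation: the scores, the threshold values and the function name `similarity_clusters` are made up for the example.

```python
def similarity_clusters(pair_scores, threshold):
    """Group sequence IDs into connected components of the network that
    keeps only edges with score >= threshold (union-find based)."""
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk to the root
        # with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in pair_scores.items():
        root_a, root_b = find(a), find(b)  # also registers singletons
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    # Largest clusters first; ties broken alphabetically.
    return sorted(
        (sorted(members) for members in clusters.values()),
        key=lambda members: (-len(members), members),
    )


# Hypothetical pairwise scores between six sequences.
scores = {
    ("A", "B"): 90,
    ("B", "C"): 75,
    ("A", "C"): 20,
    ("D", "E"): 80,
    ("C", "D"): 10,
    ("F", "A"): 5,
}
print(similarity_clusters(scores, threshold=50))
print(similarity_clusters(scores, threshold=85))
```

Raising the threshold splits clusters into smaller groups, which is how such a network is typically explored: a permissive threshold shows the whole family, a strict one separates closely related subgroups.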
The emergence of commercial genomics: analysis of the rise of a biotechnology subsector during the Human Genome Project, 1990 to 2004

PubMed Central

2013-01-01

Background: Development of the commercial genomics sector within the biotechnology industry relied heavily on the scientific commons, public funding, and technology transfer between academic and industrial research. This study tracks financial and intellectual property data on genomics firms from 1990 through 2004, thus following these firms as they emerged in the era of the Human Genome Project and through the 2000 to 2001 market bubble. Methods: A database was created based on an early survey of genomics firms, which was expanded using three web-based biotechnology services, scientific journals, and biotechnology trade and technical publications. Financial data for publicly traded firms was collected through the use of four databases specializing in firm financials. Patent searches were conducted using firm names in the US Patent and Trademark Office website search engine and the DNA Patent Database. Results: A biotechnology subsector of genomics firms emerged in parallel to the publicly funded Human Genome Project. Trends among top firms show that hiring, capital improvement, and research and development expenditures continued to grow after a 2000 to 2001 bubble. The majority of firms are small businesses with great diversity in type of research and development, products, and services provided. Over half the public firms holding patents have the majority of their intellectual property portfolio in DNA-based patents. Conclusions: These data allow estimates of investment, research and development expenditures, and jobs that paralleled the rise of genomics as a sector within biotechnology between 1990 and 2004. PMID:24050173
The need for high-quality whole-genome sequence databases in microbial forensics.

PubMed

Sjödin, Andreas; Broman, Tina; Melefors, Öjar; Andersson, Gunnar; Rasmusson, Birgitta; Knutsson, Rickard; Forsman, Mats

2013-09-01

Microbial forensics is an important part of a strengthened capability to respond to biocrime and bioterrorism incidents to aid in the complex task of distinguishing between natural outbreaks and deliberate acts. The goal of a microbial forensic investigation is to identify and criminally prosecute those responsible for a biological attack, and it involves a detailed analysis of the weapon--that is, the pathogen. The recent development of next-generation sequencing (NGS) technologies has greatly increased the resolution that can be achieved in microbial forensic analyses. It is now possible to identify, quickly and in an unbiased manner, previously undetectable genome differences between closely related isolates. This development is particularly relevant for the most deadly bacterial diseases that are caused by bacterial lineages with extremely low levels of genetic diversity. Whole-genome analysis of pathogens is envisaged to be increasingly essential for this purpose. In a microbial forensic context, whole-genome sequence analysis is the ultimate method for strain comparisons as it is informative during identification, characterization, and attribution--all 3 major stages of the investigation--and at all levels of microbial strain identity resolution (ie, it resolves the full spectrum from family to isolate). Given these capabilities, one bottleneck in microbial forensics investigations is the availability of high-quality reference databases of bacterial whole-genome sequences. To be of high quality, databases need to be curated and accurate in terms of sequences, metadata, and genetic diversity coverage. The development of whole-genome sequence databases will be instrumental in successfully tracing pathogens in the future.
Database Resources of the BIG Data Center in 2018.

PubMed

2018-01-04

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Application of Genetic/Genomic Approaches to Allergic Disorders

PubMed Central

Baye, Tesfaye M.; Martin, Lisa J.; Khurana Hershey, Gurjit K.

2010-01-01

Completion of the human genome project and rapid progress in genetics and bioinformatics have enabled the development of large public databases, which include genetic and genomic data linked to clinical health data. With the massive amount of information available, clinicians and researchers have the unique opportunity to complement and integrate their daily practice with the existing resources to clarify the underlying etiology of complex phenotypes such as allergic diseases. The genome itself is now often utilized as a starting point for many studies and multiple innovative approaches have emerged applying genetic/genomic strategies to key questions in the field of allergy and immunology. There have been several successes, which have uncovered new insights into the biologic underpinnings of allergic disorders. Herein, we will provide an in depth review of genomic approaches to identifying genes and biologic networks involved in allergic diseases. We will discuss genetic and phenotypic variation, statistical approaches for gene discovery, public databases, functional genomics, clinical implications, and the challenges that remain. PMID:20638111
TRACTOR_DB: a database of regulatory networks in gamma-proteobacterial genomes

PubMed Central

González, Abel D.; Espinosa, Vladimir; Vasconcelos, Ana T.; Pérez-Rueda, Ernesto; Collado-Vides, Julio

2005-01-01

Experimental data on the Escherichia coli transcriptional regulatory system has been used in the past years to predict new regulatory elements (promoters, transcription factors (TFs), TFs' binding sites and operons) within its genome. As more genomes of gamma-proteobacteria are being sequenced, the prediction of these elements in a growing number of organisms has become more feasible, as a step towards the study of how different bacteria respond to environmental changes at the level of transcriptional regulation. In this work, we present TRACTOR_DB (TRAnscription FaCTORs' predicted binding sites in prokaryotic genomes), a relational database that contains computational predictions of new members of 74 regulons in 17 gamma-proteobacterial genomes. For these predictions we used a comparative genomics approach regarding which several proof-of-principle articles for large regulons have been published. TRACTOR_DB may be currently accessed at http://www.bioinfo.cu/Tractor_DB, http://www.tractor.lncc.br/ or at http://www.cifn.unam.mx/Computational_Genomics/tractorDB. Contact Email id is tractor@cifn.unam.mx. PMID:15608293
SorghumFDB: sorghum functional genomics database with multidimensional network analysis.

PubMed

Tian, Tian; You, Qi; Zhang, Liwei; Yi, Xin; Yan, Hengyu; Xu, Wenying; Su, Zhen

2016-01-01

Sorghum (Sorghum bicolor [L.] Moench) has excellent agronomic traits and biological properties, such as heat and drought-tolerance. It is a C4 grass and potential bioenergy-producing plant, which makes it an important crop worldwide. With the sorghum genome sequence released, it is essential to establish a sorghum functional genomics data mining platform. We collected genomic data and some functional annotations to construct a sorghum functional genomics database (SorghumFDB). SorghumFDB integrated knowledge of sorghum gene family classifications (transcription regulators/factors, carbohydrate-active enzymes, protein kinases, ubiquitins, cytochrome P450, monolignol biosynthesis related enzymes, R-genes and organelle-genes), detailed gene annotations, miRNA and target gene information, orthologous pairs in the model plants Arabidopsis, rice and maize, gene loci conversions and a genome browser. We further constructed a dynamic network of multidimensional biological relationships, comprised of the co-expression data, protein-protein interactions and miRNA-target pairs. We took effective measures to combine the network, gene set enrichment and motif analyses to determine the key regulators that participate in related metabolic pathways, such as the lignin pathway, which is a major biological process in bioenergy-producing plants. Database URL: http://structuralbiology.cau.edu.cn/sorghum/index.html. © The Author(s) 2016. Published by Oxford University Press.
The Global Genome Biodiversity Network (GGBN) Data Standard specification.

PubMed

Droege, G; Barker, K; Seberg, O; Coddington, J; Benson, E; Berendsohn, W G; Bunk, B; Butler, C; Cawsey, E M; Deck, J; Döring, M; Flemons, P; Gemeinholzer, B; Güntsch, A; Hollowell, T; Kelbert, P; Kostadinov, I; Kottmann, R; Lawlor, R T; Lyal, C; Mackenzie-Dodds, J; Meyer, C; Mulcahy, D; Nussbeck, S Y; O'Tuama, É; Orrell, T; Petersen, G; Robertson, T; Söhngen, C; Whitacre, J; Wieczorek, J; Yilmaz, P; Zetzsche, H; Zhang, Y; Zhou, X

2016-01-01

Genomic samples of non-model organisms are becoming increasingly important in a broad range of studies from developmental biology, biodiversity analyses, to conservation. Genomic sample definition, description, quality, voucher information and metadata all need to be digitized and disseminated across scientific communities. This information needs to be concise and consistent in today's ever-increasing bioinformatic era, for complementary data aggregators to easily map databases to one another. In order to facilitate exchange of information on genomic samples and their derived data, the Global Genome Biodiversity Network (GGBN) Data Standard is intended to provide a platform based on a documented agreement to promote the efficient sharing and usage of genomic sample material and associated specimen information in a consistent way. The new data standard presented here builds upon existing standards commonly used within the community, extending them with the capability to exchange data on tissue, environmental and DNA samples as well as sequences. The GGBN Data Standard will reveal and democratize the hidden contents of biodiversity biobanks, for the convenience of everyone in the wider biobanking community. Technical tools exist for data providers to easily map their databases to the standard. Database URL: http://terms.tdwg.org/wiki/GGBN_Data_Standard. © The Author(s) 2016. Published by Oxford University Press.
Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

PubMed

Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S

2016-12-01

Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.
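The abstract above describes networking individual gene clusters into gene cluster families but does not state the algorithm or the similarity cutoff. The following is only a minimal sketch of one common way to do such grouping: treat every pair of clusters whose similarity meets a threshold as linked, and take the connected components as families. The function name `cluster_families`, the cluster labels, the scores and the threshold are all hypothetical and are not taken from the study.

```python
def cluster_families(similarities, threshold):
    """Group items into families by single linkage.

    `similarities` maps (item_a, item_b) pairs to a similarity score.
    Two items end up in the same family when a chain of pairs with
    score >= threshold connects them (connected components).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        root_a, root_b = find(a), find(b)  # registers both items
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    families = {}
    for item in list(parent):
        families.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in families.values())


# Hypothetical pairwise similarities between five gene clusters.
example = {
    ("bgc1", "bgc2"): 0.9,
    ("bgc2", "bgc3"): 0.8,
    ("bgc3", "bgc4"): 0.2,
    ("bgc4", "bgc5"): 0.7,
}
print(cluster_families(example, threshold=0.5))
```

With this illustrative input, bgc1 to bgc3 form one family and bgc4 and bgc5 another, because the only link between the two groups falls below the threshold.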
Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters

PubMed Central

Schorn, Michelle A.; Alanjary, Mohammad M.; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R.; Ziemert, Nadine

2016-01-01

Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites. PMID:27902408
Gramene database: navigating plant comparative genomics resources

USDA-ARS's Scientific Manuscript database

Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationship...
The relationship of Candida colonization of the oral and vaginal mucosae of mothers and oral mucosae of their newborns at birth.

PubMed

Al-Rusan, Rund M; Darwazeh, Azmi M G; Lataifeh, Isam M

2017-04-01

Vaginal Candida colonization is common during pregnancy. Vaginal Candida may transmit vertically to the mouth of newborns during labor. The aim of this study was to assess and compare oral Candida colonization between vaginally born newborns and cesarean-born newborns and to investigate the association of the mother's vaginal and oral Candida colonization and the newborn's oral colonization at the time of delivery. Culture swabs were collected from the oral and vaginal mucosae of 100 pregnant women and from the oral mucosa of their 100 full-term newborns. Fifty (50%) of the mothers gave birth vaginally and the other 50 (50%) by cesarean section. The prevalence of oral and vaginal Candida in pregnant mothers was 49% and 40%, respectively. Oral Candida colonization in newborns was 7%. Oral Candida was isolated from 5 of 50 (10%) in the vaginally born group and from 2 of 50 (4%) in the cesarean-born group (P = .44). In the vaginally born group, oral Candida was isolated from 5 of 20 (25%) newborns of mothers with vaginal Candida colonization and from 0 of 30 (0.0%) newborns of mothers without it (P = .007). The mother's vaginal Candida may constitute an important source of oral Candida in the newborns, particularly in those delivered vaginally. Copyright © 2017 Elsevier Inc. All rights reserved.
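The two P values in the abstract above can be checked from the reported counts. The abstract does not name the statistical test, so the sketch below assumes a two-sided Fisher's exact test, a usual choice for small 2x2 tables; the function name `fisher_exact_two_sided` is illustrative only. Under that assumption the counts give values close to those reported.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(k):
        # Unnormalised probability of k positives in the first row.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    tail = sum(weight(k) for k in range(lowest, highest + 1)
               if weight(k) <= observed)
    return tail / comb(total, col1)


# Vaginally born vs cesarean-born: 5 of 50 vs 2 of 50 colonized.
p_delivery = fisher_exact_two_sided(5, 45, 2, 48)
# Vaginally born only: mothers with vs without vaginal Candida (5 of 20 vs 0 of 30).
p_maternal = fisher_exact_two_sided(5, 15, 0, 30)
print(round(p_delivery, 2), round(p_maternal, 3))
```

Integer binomial coefficients are compared before any division, so tables that are exactly as likely as the observed one are counted without floating-point tie problems.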
Species spectrum and antifungal susceptibility profile of vaginal isolates of Candida in Kuwait.

PubMed

Alfouzan, W; Dhar, R; Ashkanani, H; Gupta, M; Rachel, C; Khan, Z U

2015-03-01

The study was undertaken to determine the prevalence of vulvovaginal candidiasis (VVC) among patients with vaginitis, frequency of different Candida species, and their susceptibility profile. Over a six-month period, high vaginal swabs were cultured on Sabouraud's dextrose agar and isolates were identified by culture on CHROMagar Candida and Vitek2 yeast identification system and/or API 20C (BioMerieux, France). Antifungal susceptibility of the Candida isolates was determined by E-test against amphotericin B, flucytosine, fluconazole, voriconazole, posaconazole and caspofungin. One thousand seven hundred and fifty-two women with vaginitis were screened for the prevalence of Candida spp. Vaginal swab cultures of 231 (13.2%) women yielded Candida spp. The isolation rates of different species were as follows: Candida albicans (73.9%), Candida glabrata (19.8%), Candida kefyr (1.94%), Candida tropicalis (0.96%), Candida parapsilosis (0.96%), Candida krusei (0.96%), Candida guilliermondii (0.96%), and Saccharomyces cerevisiae (0.52%). All strains of C. albicans and non-C. albicans were susceptible to most of the antifungal agents tested. The high frequency with which C. albicans was recovered and its azole susceptibility support the continued use of azole agents for empirical therapy of uncomplicated VVC. However, a larger controlled study is required to determine the role of non-C. albicans in recurrent VVC. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Construction of Pará rubber tree genome and multi-transcriptome database accelerates rubber researches.

PubMed

Makita, Yuko; Kawashima, Mika; Lau, Nyok Sean; Othman, Ahmad Sofiman; Matsui, Minami

2018-01-19

Natural rubber is an economically important material. Currently the Pará rubber tree, Hevea brasiliensis, is the main commercial source. Little is known about rubber biosynthesis at the molecular level. Next-generation sequencing (NGS) technologies brought draft genomes of three rubber cultivars and a variety of RNA sequencing (RNA-seq) data. However, no current genome or transcriptome databases (DB) are organized by gene. A gene-oriented database is a valuable support for rubber research. Based on our original draft genome sequence of H. brasiliensis RRIM600, we constructed a rubber tree genome and transcriptome DB. Our DB provides genome information including gene functional annotations and multi-transcriptome data of RNA-seq, full-length cDNAs including PacBio Isoform sequencing (Iso-Seq), ESTs and genome-wide transcription start sites (TSSs) derived from CAGE technology. Using our original and publicly available RNA-seq data, we calculated co-expressed genes for identifying functionally related gene sets and/or genes regulated by the same transcription factor (TF). Users can access multi-transcriptome data through both a gene-oriented web page and a genome browser. For the gene searching system, we provide keyword search, sequence homology search and gene expression search; users can also select their expression threshold easily. The rubber genome and transcriptome DB provides rubber tree genome sequence and multi-transcriptomics data. This DB is useful for comprehensive understanding of the rubber transcriptome. This will assist both industrial and academic researchers for rubber and economically important close relatives such as R. communis, M. esculenta and J. curcas. The Rubber Transcriptome DB release 2017.03 is accessible at http://matsui-lab.riken.jp/rubber/.
Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome

PubMed Central

2010-01-01

Background: A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. Results and Discussion: BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. Conclusions: We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence. PMID:21092105

CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.

PubMed

Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H

2010-07-06

The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/
Multi-drug resistant oral Candida species isolated from HIV-positive patients in South Africa and Cameroon.

PubMed

Dos Santos Abrantes, Pedro Miguel; McArthur, Carole P; Africa, Charlene Wilma Joyce

2014-06-01

Candida species are a common cause of infection in immune-compromised HIV-positive individuals, who are usually treated with the antifungal drug, fluconazole, in public hospitals in Africa. However, information about the prevalence of drug resistance to fluconazole and other antifungal agents in Candida species is very limited. This study examined 128 Candida isolates from South Africa and 126 Cameroonian Candida isolates for determination of species prevalence and antifungal drug susceptibility. The isolates were characterized by growth on chromogenic and selective media and by their susceptibility to 9 antifungal drugs tested using the TREK™ YeastOne9 drug panel (Thermo Scientific, USA). Eighty-three percent (82.8%) of South African isolates were Candida albicans (106 isolates), 9.4% were Candida glabrata (12 isolates), and 7.8% were Candida dubliniensis (10 isolates). Of the Cameroonian isolates, 73.02% were C. albicans (92 isolates); 19.05% C. glabrata (24 isolates); 3.2% Candida tropicalis (4 isolates); 2.4% Candida krusei (3 isolates); 1.59% either Candida kefyr, Candida parapsilosis, or Candida lusitaniae (2 isolates); and 0.79% C. dubliniensis (1 isolate). Widespread C. albicans resistance to azoles was detected phenotypically in both populations. Differences in drug resistance were seen within C. glabrata found in both populations. Echinocandin drugs were more effective on isolates obtained from Cameroon than on those from South Africa. A multiple-drug resistant C. dubliniensis strain isolated from the South African samples was inhibited only by 5-flucytosine in vitro on the YO9 panel. Drug resistance among oral Candida species is common among African HIV patients in these 2 countries. Regional surveillance of Candida species drug susceptibility should be undertaken to ensure effective treatment for HIV-positive patients. Copyright © 2014 Elsevier Inc. All rights reserved.
Antifungal susceptibility patterns of colonized Candida species isolates from immunocompromised pediatric patients in five university hospitals.

PubMed

Badiee, Parisa; Choopanizadeh, Maral; Moghadam, Abdolkarim Ghadimi; Nasab, Ali Hossaini; Jafarian, Hadis; Shamsizadeh, Ahmad; Soltani, Jafar

2017-12-01

Colonization of Candida species is common in pediatric patients admitted to hematology-oncology wards. The aim of this study was to identify colonized Candida species and their susceptibility patterns in hematologic pediatric patients. Samples were collected from mouth, nose, urine and stool of the patients admitted to five university hospitals and cultured on Sabouraud dextrose agar. The isolates were identified by API 20 C AUX system and their susceptibility patterns were evaluated by CLSI M27-A3 and S4. From 650 patients, 320 (49.2%) were colonized with 387 Candida species. Candida albicans was the most prevalent isolated species, followed by Candida glabrata, Candida tropicalis, Candida famata, Candida kefyr and Candida krusei. The epidemiological cutoff value (ECV) for all Candida species to amphotericin B was ≤0.25 μg except C. krusei (4 μg). The resistance rate to fluconazole in this study in C. albicans was 4.9% with ECV 8 μg/ml, followed by C. tropicalis 8.8% with ECV 0.5 μg/ml. Voriconazole and posaconazole were effective antifungal agents for all Candida isolates. The ECV of C. albicans, Candida parapsilosis, C. tropicalis, C. glabrata and C. krusei for itraconazole were 0.5, 0.25, 0.5, 1 and 2 μg, respectively. The resistant and intermediate rates of Candida species to caspofungin in this study were 2.9%, 5.9%, 18.8%, 47.9%, 0.0% and 16.7% in C. tropicalis, C. glabrata and C. parapsilosis respectively. C. albicans was the most prevalent species in pediatric colonized patients. New azole agents like voriconazole and posaconazole are effective against non-albicans Candida species. The increase in isolates with intermediate susceptibility is an alarming sign of emerging resistance.
[Genetic mutation databases: stakes and perspectives for orphan genetic diseases].

PubMed

Humbertclaude, V; Tuffery-Giraud, S; Bareil, C; Thèze, C; Paulet, D; Desmet, F-O; Hamroun, D; Baux, D; Girardet, A; Collod-Béroud, G; Khau Van Kien, P; Roux, A-F; des Georges, M; Béroud, C; Claustres, M

2010-10-01

New technologies, which constantly become available for mutation detection and gene analysis, have contributed to an exponential rate of discovery of disease genes and variation in the human genome. The task of collecting and documenting this enormous amount of data in genetic databases represents a major challenge for the future of biological and medical science. The Locus Specific Databases (LSDBs) are so far the most efficient mutation databases. This review presents the main types of databases available for the analysis of mutations responsible for genetic disorders, as well as open perspectives for new therapeutic research or challenges for future medicine. Accurate and exhaustive collection of variations in human genomes will be crucial for research and personalized delivery of healthcare. Copyright © 2009 Elsevier Masson SAS. All rights reserved.
The epidemiology of Candida species associated with vulvovaginal candidiasis in an Iranian patient population.

PubMed

Mahmoudi Rad, M; Zafarghandi, S; Abbasabadi, B; Tavallaee, M

2011-04-01

Vulvovaginal candidiasis is a common infection among women worldwide. According to previous epidemiological studies, Candida albicans is the most common species of Candida. The prevalence of non-Candida species, however, is increasing. Identification of Candida species among the population will not only help health professionals to choose suitable antifungal treatments, but also prevent development of drug resistance. The aim of this study was to identify, using chromogenic agar medium, the Candida species associated with vulvovaginal candidiasis among a sample of the Iranian population. In a prospective cohort study during a two year period from March 2006 to March 2008, swab samples of vaginal discharge/secretion were taken from 200 patients admitted to the gynecology clinic of Mahdieh Hospital (Tehran, Iran) with a clinical presentation suggestive of vulvovaginal candidiasis. The isolates obtained were cultured on Sabouraud dextrose agar and chromogenic agar medium. Candida species were also identified by germ tube formation in serum, chlamydospore production on Corn Meal Agar and carbohydrate absorption using the API 20C-AUX kit. Participants were asked to complete a questionnaire investigating the risk factors associated with candidiasis. An assessment of the different species of recurrent and non-recurrent candidiasis was also made. Descriptive statistics, chi-square test, and t-test were used to analyze the data. A total of 191 isolates were obtained from 175 vaginal specimens. Candida albicans accounted for 67% of the strains including single and mixed infections. The other identified species were Candida glabrata (18.3%), Candida tropicalis (6.8%), Candida krusei (5.8%), Candida parapsilosis (1.6%), and Candida guilliermondii (0.5%) respectively. Mixed infection with two or more species of Candida was seen in 10.3% of patients. The most common mixed cause was the combination of Candida albicans and Candida glabrata. 
Participants who were sexually active and those who had orogenital sex were more likely to suffer recurrent vulvovaginal candidiasis. Candida albicans was the most common cause of recurrent and non-recurrent vulvovaginitis. The second most common species was Candida glabrata. This study suggests CHROMagar method as a convenient and cost effective yet reliable method to isolate the species of Candida especially in cases where more than one species is present. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
PGDD: a database of gene and genome duplication in plants

PubMed Central

Lee, Tae-Ho; Tang, Haibao; Wang, Xiyin; Paterson, Andrew H.

2013-01-01

Genome duplication (GD) has permanently shaped the architecture and function of many higher eukaryotic genomes. The angiosperms (flowering plants) are outstanding models in which to elucidate consequences of GD for higher eukaryotes, owing to their propensity for chromosomal duplication or even triplication in a few cases. Duplicated genome structures often require both intra- and inter-genome alignments to unravel their evolutionary history, also providing the means to deduce both obvious and otherwise-cryptic orthology, paralogy and other relationships among genes. The burgeoning sets of angiosperm genome sequences provide the foundation for a host of investigations into the functional and evolutionary consequences of gene and GD. To provide genome alignments from a single resource based on uniform standards that have been validated by empirical studies, we built the Plant Genome Duplication Database (PGDD; freely available at http://chibba.agtec.uga.edu/duplication/), a web service providing synteny information in terms of colinearity between chromosomes. At present, PGDD contains data for 26 plants including bryophytes and chlorophyta, as well as angiosperms with draft genome sequences. In addition to the inclusion of new genomes as they become available, we are preparing new functions to enhance PGDD. PMID:23180799
Comparison of the genomic sequence of the microminipig, a novel breed of swine, with the genomic database for conventional pig.

PubMed

Miura, Naoki; Kucho, Ken-Ichi; Noguchi, Michiko; Miyoshi, Noriaki; Uchiumi, Toshiki; Kawaguchi, Hiroaki; Tanimoto, Akihide

2014-01-01

The microminipig, which weighs less than 10 kg at an early stage of maturity, has been reported as a potential experimental model animal. Its extremely small size and other distinct characteristics suggest the possibility of a number of differences between the genome of the microminipig and that of conventional pigs. In this study, we analyzed the genomes of two healthy microminipigs using a next-generation sequencer SOLiD™ system. We then compared the obtained genomic sequences with a genomic database for the domestic pig (Sus scrofa). The mapping coverage of sequenced tag from the microminipig to conventional pig genomic sequences was greater than 96% and we detected no clear, substantial genomic variance from these data. The results may indicate that the distinct characteristics of the microminipig derive from small-scale alterations in the genome, such as Single Nucleotide Polymorphisms or translational modifications, rather than large-scale deletion or insertion polymorphisms. Further investigation of the entire genomic sequence of the microminipig with methods enabling deeper coverage is required to elucidate the genetic basis of its distinct phenotypic traits. Copyright © 2014 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
Frequency of Candida albicans in Patients with Funguria.

PubMed

Jamil, Sana; Jamil, Naz; Saad, Uzma; Hafiz, Saleem; Siddiqui, Sualleha

2016-02-01

To determine the frequency of Candida albicans in patients with funguria. Descriptive cross-sectional study. Department of Microbiology, Sindh Institute of Urology and Transplantation, from July to December 2012. Patients’ urine samples with fungus/Candida were included. Candida albicans was identified by the production of tubular structures (germ tubes) on microscopy as per standard procedure followed by inoculation on Chrom agar (Oxoid) and Corn Meal-Tween 80 agar (Oxoid). The identification of other non-albicans Candida species was also done both microscopically and macroscopically as per standard procedure. Out of the 289 isolates, 204 (70.6%) were from male patients and 85 (29.4%) were from female patients, with 165 (57.1%) from the out-patients and 124 (42.9%) from the in-patients. Five species of Candida were found to be prevalent including 87 (30.1%) Candida albicans, 176 (60.9%) Candida tropicalis, 14 (4.8%) Candida parapsilosis, 8 (2.8%) Candida glabrata and 4 (1.4%) Candida lusitaniae. Majority of patients with funguria were aged above 50 years (60.2%). In the present study, 30.1% patients with funguria had Candida albicans. The most frequently isolated species was Candida tropicalis (60.9%), followed by other non-albicans Candida. This study has shown the emergence of non-albicans Candida as a major cause of candiduria.
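The species percentages in the abstract above follow directly from the isolate counts. As a small worked check, the sketch below recomputes each share of the 289 isolates; the helper name `species_shares` is illustrative, and the counts are copied from the abstract.

```python
def species_shares(counts):
    """Return the total count and each species' share in percent (one decimal)."""
    total = sum(counts.values())
    shares = {species: round(100 * n / total, 1) for species, n in counts.items()}
    return total, shares


# Isolate counts as reported in the abstract.
funguria_counts = {
    "Candida albicans": 87,
    "Candida tropicalis": 176,
    "Candida parapsilosis": 14,
    "Candida glabrata": 8,
    "Candida lusitaniae": 4,
}

total, shares = species_shares(funguria_counts)
print(total)
for species, percent in shares.items():
    print(f"{species}: {percent}%")
```

The counts sum to the 289 isolates stated in the abstract, and the rounded shares agree with the reported 30.1%, 60.9%, 4.8%, 2.8% and 1.4%.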
Detection of Candida species in pregnant Chinese women with a molecular beacon method.

PubMed

Zhai, Yanhong; Liu, Jing; Zhou, Li; Ji, Tongzhen; Meng, Lingxin; Gao, Yang; Liu, Ran; Wang, Xiao; Li, Lin; Lu, Binghuai; Cao, Zheng

2018-04-20

Candida pathogens are commonly found in women and can cause vulvovaginal candidiasis (VVC), whose infection rate is further increased during pregnancy. We aimed to study the Candida prevalence and strain distribution in pregnant Chinese women with a molecular beacon assay. From March 2016 to February 2017, a total of 993 pregnant women attending routine antenatal visits at the Beijing Obstetrics and Gynecology Hospital were enrolled. For Candida detection and identification, a unique molecular beacon assay was presented and compared with a traditional phenotypic method. Antifungal susceptibility was tested with the following agents: 5-flucytosine, amphotericin B, fluconazole, itraconazole and voriconazole. The prevalence of Candida was found to be 21.8 % when using the molecular method and 15.0 % when using the phenotypic method. The distribution of the Candida spp. was listed in order of decreasing prevalence: Candida albicans (79.8 %), Candida glabrata (13.5 %), Candida parapsilosis (3.7 %), Candida krusei (2.2 %) and Candida tropicalis (1.1 %). We found that 90.7 % of the Candida detection results were consistent between the molecular and the phenotypic methods. In the cases where the sequencing analyses for the Candida isolates resulted in inconsistent identification, the molecular method showed higher sensitivity than the phenotypic method (96.0 vs 64.6 %). C. albicans, C. glabrata and C. parapsilosis were essentially susceptible to all five antifungal agents tested, whereas C. tropicalis and C. krusei were susceptible to voriconazole and amphotericin B. By exhibiting good sensitivity and specificity, the molecular assay may offer a fast and accurate Candida screening platform for pregnant women.
In vitro antifungal activity of different components of Centratherum anthelminticum and Ocimum sanctum seed oils and their synergism against oral pathogenic fungi

PubMed Central

H Gopalkrishna, Aparna; M, Seshagiri; Muddaiah, Sunil; R, Shashidara

2016-01-01

Background. Opportunistic fungal infections like candidiasis are common in the oral cavity. In recent years Candida species have shown resistance against a number of synthetic drugs. This study assessed the antifungal activity of Centratherum anthelminticum and Ocimum sanctum seed oils against six common pathogenic Candida strains. Synergistic activity of the major oil components was also studied. Methods. Antifungal activity of Centratherum anthelminticum and Ocimum sanctum seed oils was tested against six oral fungal pathogens, Candida albicans ATCC 90028, Candida krusei 6258, Candida tropicalis 13803, Candida parapsilosis 22019, Candida glabrata 90030 and Candida dubliniensis MYA 646, by disc diffusion and broth microdilution methods to determine the diameter of inhibition zone (DIZ) and minimum inhibitory concentration (MIC), respectively. The oil was extracted using Soxhlet apparatus from seeds, subjected to column chromatography (CC) and thin layer chromatography (TLC), and major components were separated and quantified. Results. All six Candida strains showed growth inhibition to a variable degree when tested with both seed oils. Both seed oils showed antifungal activity. For Centratherum anthelminticum seed oil, the maximum DIZ at 7 μL was recorded at 75.7 mm for Candida albicans ATCC 90028, and the least DIZ was 45.7 mm for Candida dubliniensis MYA 646. For Ocimum sanctum seed oil, the maximum DIZ at 7 μL was 61.0 mm for Candida krusei ATCC 6258 and the least DIZ was 46.7 mm for Candida tropicalis ATCC 13803. The mixtures of phospholipids and unsaponifiable matter exhibited MIC values at 1.25 μL for both oils, whereas the neutral lipids fraction and unsaponifiable matter exhibited similar MIC at 2.5 μL against Candida albicans and Candida krusei. Conclusion. Centratherum anthelminticum and Ocimum sanctum seed oils exhibited strong antifungal activity against six different species of Candida and this may be attributed to various active components in the oil and their synergistic activity. PMID:27429725
Virulence factors of Candida species isolated from patients with urinary tract infection and obstructive uropathy

PubMed Central

Alenzi, Faris Q.B.

2016-01-01

Objective: Fungal urinary tract infections due to Candida have increased significantly in recent years. Our research objective was to study Candida species in urine samples of patients with urinary tract infections (UTIs) associated with obstructive uropathy and to investigate the virulence factors of the isolated Candida. Methods: Patients were divided into two groups: Group I (cases): 50 patients with UTIs and obstructive uropathy. Group II (control): 50 patients with UTIs but with no functional or anatomical obstruction of their urinary tract. Clinical histories and physical examinations, together with laboratory investigations of urine samples, were carried out in all patients in this study. Midstream urine samples were examined microscopically and by fungal cell culture. The isolated Candida species were identified by analytical profile index (API). Candida virulence factors were determined for the isolated Candida. The susceptibility to fluconazole was evaluated. Results: This study revealed an overall isolation rate of 27% of Candida species among all patient groups. The rate was 36% in cases, and 18% in controls, a difference found to be statistically significant (P<0.05). By API, C. albicans was detected in 44% of Candida species in cases, and in 33% in controls. C. glabrata was detected in 28% of Candida species in cases, and in 22% in controls. C. tropicalis was detected in 17% of Candida species in cases, and in 22% in controls. Both C. krusei and C. kefyr were detected in 5.5% of Candida species in cases, and in 11% in controls. In terms of virulence factors the study showed that 11 out of 27 (40.5%) of Candida isolates were biofilm positive by tube adherence. Phospholipase activity was demonstrated in 12 out of 27 (44.5%) of Candida isolates. Secretory aspartic proteinase activity was demonstrated in 13 out of 27 (48%) of the Candida isolates. Conclusion: Candida is an important cause of UTIs and obstructive uropathy is a major predisposing factor. PMID:27022363
Genome misclassification of Klebsiella variicola and Klebsiella quasipneumoniae isolated from plants, animals and humans.

PubMed

Martínez-Romero, Esperanza; Rodríguez-Medina, Nadia; Beltrán-Rojel, Marilu; Silva-Sánchez, Jesús; Barrios-Camacho, Humberto; Pérez-Rueda, Ernesto; Garza-Ramos, Ulises

2018-01-01

Because K. variicola, K. quasipneumoniae and K. pneumoniae are closely related bacterial species, misclassification can occur due to mistakes either in normal biochemical tests or during submission to public databases. The objective of this work was to identify K. variicola and K. quasipneumoniae genomes misclassified in the GenBank database. Both rpoB phylogenies and average nucleotide identity (ANI) were used to identify a significant number of misclassified Klebsiella spp. genomes. Here we report an update of K. variicola and K. quasipneumoniae genomes correctly classified and a list of isolated genomes obtained from humans, plants, animals and insects, described originally as K. pneumoniae or K. variicola, but known now to be misclassified. This work helps to recognize the extensive presence of K. variicola and K. quasipneumoniae isolates in diverse sites and samples.
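As a rough illustration of how an ANI comparison can flag a mislabelled genome, the sketch below assigns a genome to the reference species with the highest ANI and compares that with the deposited label. It is not the authors' pipeline: the 95% species cutoff is a commonly cited convention assumed here, the ANI values are hypothetical, and all function names are invented for the example.

```python
# Assumed species-level ANI cutoff (a common convention, not stated in the abstract).
ANI_SPECIES_THRESHOLD = 95.0


def classify_by_ani(ani_to_references, threshold=ANI_SPECIES_THRESHOLD):
    """Return the reference species with the highest ANI, or None if below threshold."""
    best_species, best_ani = max(ani_to_references.items(), key=lambda kv: kv[1])
    return best_species if best_ani >= threshold else None


def is_misclassified(deposited_label, ani_to_references):
    """True when ANI supports a species different from the deposited label."""
    assigned = classify_by_ani(ani_to_references)
    return assigned is not None and assigned != deposited_label


# Hypothetical ANI values (percent) for one genome against three reference genomes.
example_ani = {
    "K. pneumoniae": 94.1,
    "K. variicola": 98.9,
    "K. quasipneumoniae": 93.4,
}

print(classify_by_ani(example_ani))
print(is_misclassified("K. pneumoniae", example_ani))
```

In practice the ANI values would come from a dedicated genome-comparison tool, and the abstract pairs ANI with rpoB phylogenies rather than relying on one measure alone.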
Inferring transposons activity chronology by TRANScendence - TEs database and de-novo mining tool.

PubMed

Startek, Michał Piotr; Nogły, Jakub; Gromadka, Agnieszka; Grzebelus, Dariusz; Gambin, Anna

2017-10-16

The constant progress in sequencing technology leads to ever increasing amounts of genomic data. In the light of current evidence, transposable elements (TEs for short) are becoming useful tools for learning about the evolution of the host genome. Therefore, software for genome-wide detection and analysis of TEs is of great interest. Here we describe a computational tool for mining, classifying and storing TEs from newly sequenced genomes. This is an online, web-based, user-friendly service, enabling users to upload their own genomic data and perform de-novo searches for TEs. The detected TEs are automatically analyzed, compared to reference databases, annotated, clustered into families, and stored in a TE repository. Also, the genome-wide nesting structure of the found elements is detected and analyzed by a new method for inferring the evolutionary history of TEs. We illustrate the functionality of our tool by performing a full-scale analysis of the TE landscape in the Medicago truncatula genome. TRANScendence is an effective tool for the de-novo annotation and classification of transposable elements in newly-acquired genomes. Its streamlined interface makes it well-suited for evolutionary studies.
WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata

DOE Office of Scientific and Technical Information (OSTI.GOV)

Putman, Tim E.; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian

With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don’t exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction.
WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata

DOE PAGES

Putman, Tim E.; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian; ...

2017-03-06

With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don’t exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction.
A 454 sequencing approach to dipteran mitochondrial genome research

USDA-ARS's Scientific Manuscript database

The availability of complete mitochondrial genome data for Diptera, one of the largest Metazoan orders, in public databases is limited. Herein, we generated the complete or nearly complete mitochondrial genomes for Cochliomyia hominivorax, Haematobia irritans, Phormia regina and Sarcophaga crassipa...
OperomeDB: A Database of Condition-Specific Transcription Units in Prokaryotic Genomes.

PubMed

Chetal, Kashish; Janga, Sarath Chandra

2015-01-01

Background. In prokaryotic organisms, a substantial fraction of adjacent genes are organized into operons: codirectionally organized genes that share a common promoter and terminator. Although several available operon databases provide information with varying levels of reliability, very few resources provide experimentally supported results. Therefore, we believe that the biological community could benefit from having a new operon prediction database with operons predicted using next-generation RNA-seq datasets. Description. We present operomeDB, a database which provides an ensemble of all the predicted operons for bacterial genomes using available RNA-sequencing datasets across a wide range of experimental conditions. Although several studies have recently confirmed that prokaryotic operon structure is dynamic with significant alterations across environmental and experimental conditions, there are no comprehensive databases for studying such variations across prokaryotic transcriptomes. Currently our database contains nine bacterial organisms and 168 transcriptomes for which we predicted operons. The user interface is simple and easy to use, in terms of visualization, downloading, and querying of data. In addition, because of its ability to load custom datasets, users can also compare their datasets with publicly available transcriptomic data of an organism. Conclusion. OperomeDB as a database should not only aid experimental groups working on transcriptome analysis of specific organisms but also enable studies related to computational and comparative operomics.
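The operon definition in the record above (adjacent, codirectional genes) can be illustrated with a small grouping routine. This is only a sketch of the structural definition under stated assumptions, not operomeDB's RNA-seq based prediction: the `Gene` type, the 50 bp maximum intergenic gap, and the example coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based start coordinate
    end: int     # inclusive end coordinate
    strand: str  # "+" or "-"


def candidate_operons(genes, max_gap=50):
    """Group adjacent same-strand genes separated by at most max_gap bases.

    Only groups of two or more genes are returned. max_gap is an assumed,
    illustrative cutoff rather than a value taken from the source.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    groups, current = [], []
    for gene in ordered:
        same_unit = (
            current
            and gene.strand == current[-1].strand
            and gene.start - current[-1].end <= max_gap
        )
        if same_unit:
            current.append(gene)
        else:
            if len(current) >= 2:
                groups.append([g.name for g in current])
            current = [gene]
    if len(current) >= 2:
        groups.append([g.name for g in current])
    return groups


# Hypothetical gene coordinates for illustration only.
example_genes = [
    Gene("a", 1, 100, "+"),
    Gene("b", 120, 300, "+"),
    Gene("c", 310, 500, "+"),
    Gene("d", 900, 1200, "+"),
    Gene("e", 1210, 1500, "-"),
    Gene("f", 1520, 1800, "-"),
]

operons = candidate_operons(example_genes)
print(operons)
```

A condition-specific resource would additionally require expression evidence that the grouped genes are co-transcribed, which a purely positional rule like this cannot supply.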
Solving the Problem: Genome Annotation Standards before the Data Deluge.

PubMed

Klimke, William; O'Donovan, Claire; White, Owen; Brister, J Rodney; Clark, Karen; Fedorov, Boris; Mizrachi, Ilene; Pruitt, Kim D; Tatusova, Tatiana

2011-10-15

The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is an historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries.
SmedGD 2.0: The Schmidtea mediterranea genome database

PubMed Central

Robb, Sofia M.C.; Gotting, Kirsten; Ross, Eric; Sánchez Alvarado, Alejandro

2016-01-01

Planarians have emerged as excellent models for the study of key biological processes such as stem cell function and regulation, axial polarity specification, regeneration, and tissue homeostasis among others. The most widely used organism for these studies is the free-living flatworm Schmidtea mediterranea. In 2007, the Schmidtea mediterranea Genome Database (SmedGD) was first released to provide a much needed resource for the small, but growing planarian community. SmedGD 1.0 has been a depository for genome sequence, a draft assembly, and related experimental data (e.g., RNAi phenotypes, in situ hybridization images, and differential gene expression results). We report here a comprehensive update to SmedGD (SmedGD 2.0) that aims to expand its role as an interactive community resource. The new database includes more recent, and up-to-date transcription data, provides tools that enhance interconnectivity between different genome assemblies and transcriptomes, including next generation assemblies for both the sexual and asexual biotypes of S. mediterranea. SmedGD 2.0 (http://smedgd.stowers.org) not only provides significantly improved gene annotations, but also tools for data sharing, attributes that will help both the planarian and biomedical communities to more efficiently mine the genomics and transcriptomics of S. mediterranea. PMID:26138588
Solving the Problem: Genome Annotation Standards before the Data Deluge

PubMed Central

Klimke, William; O'Donovan, Claire; White, Owen; Brister, J. Rodney; Clark, Karen; Fedorov, Boris; Mizrachi, Ilene; Pruitt, Kim D.; Tatusova, Tatiana

2011-01-01

The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is an historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries. PMID:22180819
