Cerqueira, Gustavo C; Arnaud, Martha B; Inglis, Diane O; Skrzypek, Marek S; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Binkley, Jonathan; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Sherlock, Gavin; Wortman, Jennifer R
2014-01-01
The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although those primarily comprise the addition and extension of untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.
USDA-ARS's Scientific Manuscript database
The availability of a representative gene ontology (GO) database is a prerequisite for a successful functional genomics study. Using online Blast2GO resources we constructed a GO database of Aspergillus flavus. Of the predicted total 13,485 A. flavus genes 8,987 were annotated with GO terms. The mea...
Han, Xiaolong; Chakrabortti, Alolika; Zhu, Jindong; Liang, Zhao-Xun; Li, Jinming
2016-08-15
Aspergillus westerdijkiae, a member of Aspergillus section Circumdati, produces ochratoxin A (OTA). It is responsible for the contamination of agricultural crops, fruits, and food commodities, as its secondary metabolite OTA poses a potential threat to animals and humans. As a member of the filamentous fungi family, its capacity for enzymatic catalysis and secondary metabolite production is valuable in industrial production and medicine. To understand the genetic factors underlying its pathogenicity, enzymatic degradation, and secondary metabolism, we analysed the whole genome of A. westerdijkiae and compared it with eight other sequenced Aspergillus species. We sequenced the complete genome of A. westerdijkiae and assembled approximately 36 Mb of its genomic DNA, in which we identified 10,861 putative protein-coding genes. We constructed a phylogenetic tree of A. westerdijkiae and eight other sequenced Aspergillus species and found that the sister group of A. westerdijkiae was the A. oryzae - A. flavus clade. By searching the associated databases, we identified 716 cytochrome P450 enzymes, 633 carbohydrate-active enzymes, and 377 proteases. By combining comparative analysis with Kyoto Encyclopaedia of Genes and Genomes (KEGG), Conserved Domains Database (CDD), and Pfam annotations, we predicted 228 potential carbohydrate-active enzymes related to plant polysaccharide degradation (PPD). We found a large number of secondary biosynthetic gene clusters, which suggested that A. westerdijkiae had a remarkable capacity to produce secondary metabolites. Furthermore, we obtained two more reliable and integrated gene sequences containing the reported portions of OTA biosynthesis and identified their respective secondary metabolite clusters. We also systematically annotated these two hybrid t1pks-nrps gene clusters involved in OTA biosynthesis.
These two clusters were separate in the genome, and one of them encoded a couple of GH3 and AA3 enzyme genes involved in sucrose and glucose metabolism. The genomic information obtained in this study is valuable for understanding the life cycle and pathogenicity of A. westerdijkiae. We identified numerous enzyme genes that are potentially involved in host invasion and pathogenicity, and we provided a preliminary prediction for each putative secondary metabolite (SM) gene cluster. In particular, for the OTA-related SM gene clusters, we delivered their components with domain and pathway annotations. This study sets the stage for experimental verification of the biosynthetic and regulatory mechanisms of OTA and for the discovery of new secondary metabolites.
What can comparative genomics tell us about species concepts in the genus Aspergillus?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rokas, Antonis; Payne, Gary; Federova, Natalie D.
2007-12-15
Understanding the nature of species' boundaries is a fundamental question in evolutionary biology. The availability of genomes from several species of the genus Aspergillus allows us for the first time to examine the demarcation of fungal species at the whole-genome level. Here, we examine four case studies, two of which involve intraspecific comparisons, whereas the other two deal with interspecific genomic comparisons between closely related species. These four comparisons reveal significant variation in the nature of species boundaries across Aspergillus. For example, comparisons between A. fumigatus and Neosartorya fischeri (the teleomorph of A. fischerianus) and between A. oryzae and A. flavus suggest that measures of sequence similarity and species-specific genes are significantly higher for the A. fumigatus - N. fischeri pair. Importantly, the values obtained from the comparison between A. oryzae and A. flavus are remarkably similar to those obtained from an intra-specific comparison of A. fumigatus strains, giving support to the proposal that A. oryzae represents a distinct ecotype of A. flavus and not a distinct species. We argue that genomic data can aid Aspergillus taxonomy by serving as a source of novel and unprecedented amounts of comparative data, as a resource for the development of additional diagnostic tools, and finally as a knowledge database about the biological differences between strains and species.
Genome-wide analysis of the Zn(II)2Cys6 zinc cluster-encoding gene family in Aspergillus flavus
USDA-ARS's Scientific Manuscript database
Proteins with a Zn(II)2Cys6 domain, Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys (hereafter, referred to as the C6 domain), form a subclass of zinc finger proteins found exclusively in fungi and yeast. Genome sequence databases of Saccharomyces cerevisiae and Candida albicans have provided an overvie...
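The cysteine spacing quoted in this abstract can be read as a sequence pattern. The following is only a minimal sketch, assuming that each X stands for any single residue and that a plain regular expression is an acceptable first-pass screen; the function name and the toy sequence are hypothetical and are not taken from the study.

```python
import re

# Pattern transcribed from the consensus quoted in the abstract:
# Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-9-Cys, with X read as any residue.
C6_MOTIF = re.compile(r"C.{2}C.{6}C.{5,12}C.{2}C.{6,9}C")

def find_c6_domains(protein_seq):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in C6_MOTIF.finditer(seq)]

# Synthetic sequence built to satisfy the spacing (not a real protein).
toy = ("MSA" + "C" + "AA" + "C" + "AAAAAA" + "C" + "AAAAA"
       + "C" + "AA" + "C" + "AAAAAA" + "C" + "GG")
hits = find_c6_domains(toy)
print(hits)
```

A match on spacing alone is not evidence of a functional zinc cluster; a screen like this would normally be followed by a domain search against a curated profile.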
The function and evolution of the Aspergillus genome
Gibbons, John G.; Rokas, Antonis
2012-01-01
Species in the filamentous fungal genus Aspergillus display a wide diversity of lifestyles and are of great importance to humans. The decoding of genome sequences from a dozen species that vary widely in their degree of evolutionary affinity has galvanized studies of the function and evolution of the Aspergillus genome in clinical, industrial, and agricultural environments. Here, we synthesize recent key findings that shed light on the architecture of the Aspergillus genome, on the molecular foundations of the genus’ astounding dexterity and diversity in secondary metabolism, and on the genetic underpinnings of virulence in Aspergillus fumigatus, one of the most lethal fungal pathogens. Many of these insights dramatically expand our knowledge of fungal and microbial eukaryote genome evolution and function and argue that Aspergillus constitutes a superb model clade for the study of functional and comparative genomics. PMID:23084572
What the Aspergillus genomes have told us.
Nierman, W C; May, G; Kim, H S; Anderson, M J; Chen, D; Denning, D W
2005-05-01
The sequencing and annotation of the genomes of the first strains of Aspergillus nidulans, Aspergillus oryzae, and Aspergillus fumigatus will be seen in retrospect as a transformational event in Aspergillus biology. With this event the entire genetic composition of A. nidulans, the sexual experimental model organism of the genus Aspergillus, A. oryzae, the food biotechnology organism which is the product of centuries of cultivation, and A. fumigatus, the most common causative agent of invasive aspergillosis, is now revealed to the extent that we are at present able to understand. Each genome exhibits a large set of genes common to the three as well as a much smaller set of genes unique to each. Moreover, these sequences serve as resources providing the major tool for expanding our understanding of the biology of each. Transcription profiling of A. fumigatus at high temperatures and comparative genomic hybridization between A. fumigatus and a closely related Aspergillus species provide microarray-based examples of the beginning of functional analysis of the genomes of these organisms going forward from the genome sequence.
Gil-Serna, Jessica; García-Díaz, Marta; González-Jaén, María Teresa; Vázquez, Covadonga; Patiño, Belén
2018-03-02
Ochratoxin A (OTA), which is produced by several Aspergillus and Penicillium species, is one of the most important mycotoxins due to its toxic properties and worldwide distribution. The knowledge of OTA biosynthetic genes and understanding of the mechanisms involved in their regulation are essential. In this work, we obtained a clear picture of biosynthetic genes organization in the main OTA-producing Aspergillus and Penicillium species (A. steynii, A. westerdijkiae, A. niger, A. carbonarius and P. nordicum) using complete genome sequences obtained in this work or previously available on databases. The results revealed a region containing five ORFs which predicted five proteins: halogenase, bZIP transcription factor, cytochrome P450 monooxygenase, non-ribosomal peptide synthetase and polyketide synthase in all the five species. Genetic synteny was conserved in both Penicillium and Aspergillus species although genomic location seemed to be different since the clusters presented different flanking regions (except for A. steynii and A. westerdijkiae); these observations support the hypothesis of the orthology of this genomic region and that it might have been acquired by horizontal transfer. New real-time RT-PCR assays for quantification of the expression of these OTA biosynthetic genes were developed. In all species, the five genes were consistently expressed in OTA-producing strains in permissive conditions. These protocols might favour future studies on the regulation of biosynthetic genes in order to develop new efficient control methods to avoid OTA entering the food chain. Copyright © 2018 Elsevier B.V. All rights reserved.
2013-01-01
Background Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research. Results We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation. Conclusions This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites. PMID:23617571
NASA Astrophysics Data System (ADS)
Dodda, Subba Reddy; Aich, Aparajita; Sarkar, Nibedita; Jain, Piyush; Jain, Sneha; Mondal, Sudipa; Aikat, Kaustav; Mukhopadhyay, Sudit S.
2018-03-01
Thermostable glucose tolerant β-glucosidase from Aspergillus species has attracted worldwide interest for its potential in industrial applications and bioethanol production. A strain of Aspergillus fumigatus (AfNITDGPKA3) identified by our laboratory from straw retting ground showed higher cellulase activity, specifically the β-glucosidase activity, compared to other contemporary strains. Though A. fumigatus has been known for high cellulase activity, detailed identification and characterization of the cellulase genes from its genome is yet to be done. In this work, we analyzed the cellulase genes from the genome sequence database of Aspergillus fumigatus (Af293). Genome analysis suggests two cellobiohydrolase, eleven endoglucanase and seventeen β-glucosidase genes present. β-Glucosidase genes belong to either Glycohydro1 (GH1 or Bgl1) or Glycohydro3 (GH3 or Bgl3) family. The sequence similarity suggests that Bgl1 and Bgl3 of A. fumigatus are phylogenetically close to those of A. fischeri and A. oryzae. The modelled structure of the Bgl1 predicts the (β/α)8 barrel type structure with deep and narrow active site, whereas Bgl3 shows the (α/β)8 barrel and (α/β)6 sandwich structure with shallow and open active site. Docking results suggest that amino acids Glu544, Glu466, Trp408, Trp567, Tyr44, Tyr222, Tyr770, Asp844, Asp537, Asn212, Asn217 of Bgl3 and Asp224, Asn242, Glu440, Glu445, Tyr367, Tyr365, Thr994, Trp435, Trp446 of Bgl1 are involved in the hydrolysis. Binding affinity analyses suggest that Bgl3 and Bgl1 enzymes are more active on the substrates like 4-methylumbelliferyl glycoside (MUG) and p-nitrophenyl-β-D-1,4-glucopyranoside (pNPG) than on cellobiose. Further docking with glucose suggests that Bgl1 is more glucose tolerant than Bgl3. Analysis of the Aspergillus fumigatus genome may help to identify a β-glucosidase enzyme with better properties and the structural information may help to develop an engineered recombinant enzyme.
Genomic Islands in Pathogenic Filamentous Fungus Aspergillus fumigatus
USDA-ARS's Scientific Manuscript database
We present the genome sequences of a new clinical isolate, CEA10, of an important human pathogen, Aspergillus fumigatus, and two closely related, but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of CEA10 with the recently sequen...
Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger.
Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J
2009-02-04
Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and another predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1d gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison of the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.
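Reverse (decoy) database searching is commonly used to estimate a false discovery rate by counting how many decoy matches pass the same score threshold as the target matches. The sketch below illustrates that general idea only; it is not the Average Peptide Scoring procedure used in the study, and the scores, the (score, is_decoy) layout and the function name are assumptions made for illustration.

```python
def estimate_fdr(matches, threshold):
    """Estimate FDR as decoy hits / target hits at or above a score threshold.

    matches: iterable of (score, is_decoy) pairs, one per peptide-spectrum match.
    Returns 0.0 when no target match passes the threshold.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0

# Toy scores (made up): True marks a match to the reversed database.
toy_matches = [(90, False), (85, False), (80, False), (75, True),
               (70, False), (60, True), (55, False)]

fdr_loose = estimate_fdr(toy_matches, 70)   # 1 decoy / 4 targets
fdr_strict = estimate_fdr(toy_matches, 80)  # 0 decoys / 3 targets
print(fdr_loose, fdr_strict)
```

Raising the threshold trades sensitivity for a lower estimated error rate, which is the trade-off implied by "an acceptable false discovery rate".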
Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.; ...
2018-01-09
The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories over model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.
Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger
Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J
2009-01-01
Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and another predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1d gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). Results 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. Conclusion This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison of the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method. PMID:19193216
Comparative Reannotation of 21 Aspergillus Genomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Salamov, Asaf; Riley, Robert; Kuo, Alan
2013-03-08
We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain errors of different kinds, e.g. missing genes, incorrect exon-intron structures, 'chimeras' that fuse 2 or more real genes, or models that split a real gene into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models (~2% per genome), supported by comparative analysis, additionally correcting ~18% of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3% increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.
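The select-and-iterate loop described in this abstract can be illustrated with a toy example. This is a rough sketch under strong assumptions: a gene model is reduced to a tuple of exon lengths, the distance function is invented for illustration, and the genome names and candidate models are hypothetical. It is not the authors' implementation.

```python
def structure_distance(a, b):
    """Toy distance between two gene models given as tuples of exon lengths.

    A difference in exon count is penalised much more heavily than a
    difference in total coding length (an arbitrary choice for this sketch).
    """
    return 1000 * abs(len(a) - len(b)) + abs(sum(a) - sum(b))

def refine_models(candidates, max_rounds=50):
    """Pick one model per genome for a single orthologous locus.

    candidates: {genome_name: [model, ...]} with at least one model each.
    Start from the first candidate in every genome, then repeatedly replace
    each choice with the candidate closest to the models currently chosen in
    the other genomes, until a full pass changes nothing.
    """
    chosen = {genome: models[0] for genome, models in candidates.items()}
    for _ in range(max_rounds):
        changed = False
        for genome, models in candidates.items():
            others = [m for g, m in chosen.items() if g != genome]
            best = min(models,
                       key=lambda m: sum(structure_distance(m, o) for o in others))
            if best != chosen[genome]:
                chosen[genome] = best
                changed = True
        if not changed:
            break
    return chosen

# Hypothetical locus: genome_A starts with a two-exon model that merges exons.
locus = {
    "genome_A": [(300, 450), (300, 200, 250)],
    "genome_B": [(310, 190, 260)],
    "genome_C": [(295, 205, 250)],
}
result = refine_models(locus)
print(result["genome_A"])
```

In this toy locus the three-exon candidate for genome_A replaces the initial two-exon model because it agrees with the exon structure seen in the other two genomes.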
USDA-ARS's Scientific Manuscript database
Aspergillus flavus and A. parasiticus fungi, carcinogen-mycotoxins producers, infect peanut seeds, causing considerable impact on both human health and the economy. Here we report 9 genome sequences of Aspergillus spp. isolated from peanut seeds. The information obtained will allow conducting biodiv...
Caspeta, Luis; Nielsen, Jens
2013-05-01
Recently genome sequence data have become available for Aspergillus and Pichia species of industrial interest. This has stimulated the use of systems biology approaches for large-scale analysis of the molecular and metabolic responses of Aspergillus and Pichia under defined conditions, which has resulted in much new biological information. Case-specific contextualization of this information has been performed using comparative and functional genomic tools. Genomics data are also the basis for constructing genome-scale metabolic models, and these models have helped in the contextualization of knowledge on the fundamental biology of Aspergillus and Pichia species. Furthermore, with the availability of these models, the engineering of Aspergillus and Pichia is moving from traditional approaches, such as random mutagenesis, to a systems metabolic engineering approach. Here we review the recent trends in systems biology of Aspergillus and Pichia species, highlighting the relevance of these developments for systems metabolic engineering of these organisms for the production of hydrolytic enzymes, biofuels and chemicals from biomass. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Characterization of recombinant terrelysin, a hemolysin of Aspergillus terreus.
Nayak, Ajay P; Blachere, Françoise M; Hettick, Justin M; Lukomski, Slawomir; Schmechel, Detlef; Beezhold, Donald H
2011-01-01
Fungal hemolysins are potential virulence factors. Some fungal hemolysins belong to the aegerolysin protein family that includes cytolysins capable of lysing erythrocytes and other cells. Here, we describe a hemolysin from Aspergillus terreus called terrelysin. We used the genome sequence database to identify the terrelysin sequence based on homology with other known aegerolysins. Aspergillus terreus mRNA was isolated, transcribed to cDNA and the open reading frame for terrelysin amplified by PCR using specific primers. Using the pASK-IBA6 cloning vector, we produced recombinant terrelysin (rTerrelysin) as a fusion product in Escherichia coli. The recombinant protein was purified and using MALDI-TOF MS determined to have a mass of 16,428 Da. Circular dichroism analysis suggests the secondary structure of the protein to be predominantly β-sheet. Results from thermal denaturation of rTerrelysin show that the protein maintained the β-sheet conformation up to 65°C. Polyclonal antibody to rTerrelysin recognized a protein of approximately 16.5 kDa in mycelial extracts from A. terreus.
Draft Genome Sequence of Aspergillus oryzae ATCC 12892
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deng, Shuang; Pomraning, Kyle R.; Bohutskyi, Pavlo
The draft genome sequence of Aspergillus oryzae ATCC 12892 is presented here. A. oryzae produces 3-nitropropionic acid, which has been investigated with regard to understanding the biosynthesis of nitroorganic compounds.
Weigt, S Samuel; Wang, Xiaoyan; Palchevskiy, Vyacheslav; Patel, Naman; Derhovanessian, Ariss; Shino, Michael Y; Sayah, David M; Lynch, Joseph P; Saggar, Rajan; Ross, David J; Kubak, Bernie M; Ardehali, Abbas; Palmer, Scott; Husain, Shahid; Belperio, John A
2018-06-01
Aspergillus colonization after lung transplant is associated with an increased risk of chronic lung allograft dysfunction (CLAD). We hypothesized that gene expression during Aspergillus colonization could provide clues to CLAD pathogenesis. We examined transcriptional profiles in 3- or 6-month surveillance bronchoalveolar lavage fluid cell pellets from recipients with Aspergillus fumigatus colonization (n = 12) and without colonization (n = 10). Among the Aspergillus colonized, we also explored profiles in those who developed CLAD (n = 6) or remained CLAD-free (n = 6). Transcription profiles were assayed with the HG-U133 Plus 2.0 microarray (Affymetrix). Differential gene expression was based on an absolute fold difference of 2.0 or greater and unadjusted P value less than 0.05. We used NIH Database for Annotation, Visualization and Integrated Discovery for functional analyses, with false discovery rates less than 5% considered significant. Aspergillus colonization was associated with differential expression of 489 probe sets, representing 404 unique genes. "Defense response" genes and genes in the "cytokine-cytokine receptor" Kyoto Encyclopedia of Genes and Genomes pathway were notably enriched in this list. Among Aspergillus colonized patients, CLAD development was associated with differential expression of 69 probe sets, representing 64 unique genes. This list was enriched for genes involved in "immune response" and "response to wounding", among others. Notably, both chitinase 3-like-1 and chitotriosidase were associated with progression to CLAD. Aspergillus colonization is associated with gene expression profiles related to defense responses including cytokine signaling. Epithelial wounding, as well as the innate immune response to chitin that is present in the fungal cell wall, may be key in the link between Aspergillus colonization and CLAD.
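The selection rule stated in this abstract (absolute fold difference of 2.0 or greater and unadjusted P value below 0.05) can be written as a short filter. The sketch below assumes linear-scale, positive group means and treats the fold difference symmetrically; the probe identifiers and values are made up, and the function name is hypothetical.

```python
def is_differential(mean_a, mean_b, p_value, min_fold=2.0, max_p=0.05):
    """Apply an absolute fold-difference and unadjusted P-value cutoff.

    mean_a, mean_b: positive, linear-scale (not log) group means.
    The fold difference is taken in whichever direction is >= 1.
    """
    fold = max(mean_a / mean_b, mean_b / mean_a)
    return fold >= min_fold and p_value < max_p

# Made-up probe sets: (colonized mean, non-colonized mean, unadjusted P).
toy_probes = {
    "probe_1": (200.0, 80.0, 0.01),   # 2.5-fold up, passes
    "probe_2": (100.0, 60.0, 0.001),  # below 2-fold, fails
    "probe_3": (50.0, 150.0, 0.20),   # 3-fold down, P too high
    "probe_4": (40.0, 100.0, 0.03),   # 2.5-fold down, passes
}
selected = [name for name, (a, b, p) in toy_probes.items()
            if is_differential(a, b, p)]
print(selected)
```

Because the P values are unadjusted, a list produced this way is a screening set; the abstract applies a separate false discovery rate cutoff at the functional-analysis stage.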
Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius
USDA-ARS's Scientific Manuscript database
The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...
A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.
Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong
2015-01-01
Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters that were possibly acquired by horizontal gene transfer were identified in Aspergillus for the first time, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli like A. niger and A. oryzae. These findings should help to understand the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic.
Coutinho, Pedro M; Andersen, Mikael R; Kolenova, Katarina; vanKuyk, Patricia A; Benoit, Isabelle; Gruben, Birgit S; Trejo-Aguilar, Blanca; Visser, Hans; van Solingen, Piet; Pakula, Tiina; Seiboth, Bernard; Battaglia, Evy; Aguilar-Osorio, Guillermo; de Jong, Jan F; Ohm, Robin A; Aguilar, Mariana; Henrissat, Bernard; Nielsen, Jens; Stålbrand, Henrik; de Vries, Ronald P
2009-03-01
The plant polysaccharide degradative potential of Aspergillus nidulans was analysed in detail and compared to that of Aspergillus niger and Aspergillus oryzae using a combination of bioinformatics, physiology and transcriptomics. Manual verification indicated that 28.4% of the A. nidulans ORFs analysed in this study do not contain a secretion signal, of which 40% may be secreted through a non-classical method. While significant differences were found between the species in the numbers of ORFs assigned to the relevant CAZy families, no significant difference was observed in growth on polysaccharides. Growth differences were observed between the Aspergilli and Podospora anserina, which has a more different genomic potential for polysaccharide degradation, suggesting that large genomic differences are required to cause growth differences on polysaccharides. Differences were also detected between the Aspergilli in the presence of putative regulatory sequences in the promoters of the ORFs of this study and correlation of the presence of putative XlnR binding sites to induction by xylose was detected for A. niger. These data demonstrate differences at genome content, substrate specificity of the enzymes and gene regulation in these three Aspergilli, which likely reflect their individual adaptation to their natural biotope.
A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus
Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong
2015-01-01
Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is derived almost exclusively from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics-based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, including the sterigmatocystin biosynthesis pathway that is commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were identified in Aspergillus for the first time and were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli such as A. niger and A. oryzae. These findings should help in understanding the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic. PMID:25706180
Umemura, Myco; Koike, Hideaki; Yamane, Noriko; Koyama, Yoshinori; Satou, Yuki; Kikuzato, Ikuya; Teruya, Morimi; Tsukahara, Masatoshi; Imada, Yumi; Wachi, Youji; Miwa, Yukino; Yano, Shuichi; Tamano, Koichi; Kawarabayasi, Yutaka; Fujimori, Kazuhiro E.; Machida, Masayuki; Hirano, Takashi
2012-01-01
Aspergillus oryzae has been utilized for over 1000 years in Japan for the production of various traditional foods, and a large number of A. oryzae strains have been isolated and/or selected for the effective fermentation of food ingredients. Characteristics of genetic alterations among the strains used are of particular interest in studies of A. oryzae. Here, we have sequenced the whole genome of an industrial fungal isolate, A. oryzae RIB326, by using a next-generation sequencing system and compared the data with those of A. oryzae RIB40, a wild-type strain sequenced in 2005. The aim of this study was to evaluate the mutation pressure on the non-syntenic blocks (NSBs) of the genome, which were previously identified through comparative genomic analysis of A. oryzae, Aspergillus fumigatus, and Aspergillus nidulans. We found that genes within the NSBs of RIB326 accumulate mutations more frequently than those within the SBs, regardless of their distance from the telomeres or of their expression level. Our findings suggest that the high mutation frequency of NSBs might contribute to maintaining the diversity of the A. oryzae genome. PMID:22912434
Joardar, Vinita; Abrams, Natalie F; Hostetler, Jessica; Paukstelis, Paul J; Pakala, Suchitra; Pakala, Suman B; Zafar, Nikhat; Abolude, Olukemi O; Payne, Gary; Andrianopoulos, Alex; Denning, David W; Nierman, William C
2012-12-12
The genera Aspergillus and Penicillium include some of the most beneficial as well as the most harmful fungal species such as the penicillin-producer Penicillium chrysogenum and the human pathogen Aspergillus fumigatus, respectively. Their mitochondrial genomic sequences may hold vital clues into the mechanisms of their evolution, population genetics, and biology, yet only a handful of these genomes have been fully sequenced and annotated. Here we report the complete sequence and annotation of the mitochondrial genomes of six Aspergillus and three Penicillium species: A. fumigatus, A. clavatus, A. oryzae, A. flavus, Neosartorya fischeri (A. fischerianus), A. terreus, P. chrysogenum, P. marneffei, and Talaromyces stipitatus (P. stipitatum). The accompanying comparative analysis of these and related publicly available mitochondrial genomes reveals wide variation in size (25-36 Kb) among these closely related fungi. The sources of genome expansion include group I introns and accessory genes encoding putative homing endonucleases, DNA and RNA polymerases (presumed to be of plasmid origin) and hypothetical proteins. The two smallest sequenced genomes (A. terreus and P. chrysogenum) do not contain introns in protein-coding genes, whereas the largest genome (T. stipitatus), contains a total of eleven introns. All of the sequenced genomes have a group I intron in the large ribosomal subunit RNA gene, suggesting that this intron is fixed in these species. Subsequent analysis of several A. fumigatus strains showed low intraspecies variation. This study also includes a phylogenetic analysis based on 14 concatenated core mitochondrial proteins. The phylogenetic tree has a different topology from published multilocus trees, highlighting the challenges still facing the Aspergillus systematics. 
The study expands the genomic resources available to fungal biologists by providing mitochondrial genomes with consistent annotations for future genetic, evolutionary and population studies. Despite the conservation of the core genes, the mitochondrial genomes of Aspergillus and Penicillium species examined here exhibit significant amount of interspecies variation. Most of this variation can be attributed to accessory genes and mobile introns, presumably acquired by horizontal gene transfer of mitochondrial plasmids and intron homing.
Genome sequences of three strains of Aspergillus flavus for the biological control of Aflatoxin
USDA-ARS's Scientific Manuscript database
The genomes of three strains of Aspergillus flavus with demonstrated utility for the biological control of aflatoxin were sequenced. These sequences were assembled with MIRA and annotated with Augustus using A. flavus strain 3357 (NCBI EQ963472) as a reference. Each strain had a genome of 36.3 to ...
Detection of alternative splice variants at the proteome level in Aspergillus flavus.
Chang, Kung-Yen; Georgianna, D Ryan; Heber, Steffen; Payne, Gary A; Muddiman, David C
2010-03-05
Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism by which the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequence. However, many protein databases include only a limited number of isoforms to keep redundancy minimal. As a result, the database search might not identify a target protein even with high-quality tandem MS data and an accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.
Survey of protein–DNA interactions in Aspergillus oryzae on a genomic scale
Wang, Chao; Lv, Yangyong; Wang, Bin; Yin, Chao; Lin, Ying; Pan, Li
2015-01-01
The genome-scale delineation of in vivo protein–DNA interactions is key to understanding genome function. Only ∼5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains >600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein–DNA interactions across the A. oryzae genome through coupling the DNase I digestion of intact nuclei with massively parallel sequencing and the analysis of cleavage patterns in protein–DNA interactions at single-nucleotide resolution. The resulting map identified overrepresented de novo TF-binding motifs from genomic footprints, and provided the detailed chromatin remodeling patterns and the distribution of digital footprints near transcription start sites. The TFBSs of 19 known Aspergillus TFs were also identified based on DNase I digestion data surrounding potential binding sites in conjunction with TF binding specificity information. We observed that the cleavage patterns of TFBSs were dependent on the orientation of TF motifs and independent of strand orientation, consistent with the DNA shape features of binding motifs with flanking sequences. PMID:25883143
Genome sequence of Aspergillus luchuensis NBRC 4314
Yamada, Osamu; Machida, Masayuki; Hosoyama, Akira; Goto, Masatoshi; Takahashi, Toru; Futagami, Taiki; Yamagata, Youhei; Takeuchi, Michio; Kobayashi, Tetsuo; Koike, Hideaki; Abe, Keietsu; Asai, Kiyoshi; Arita, Masanori; Fujita, Nobuyuki; Fukuda, Kazuro; Higa, Ken-ichi; Horikawa, Hiroshi; Ishikawa, Takeaki; Jinno, Koji; Kato, Yumiko; Kirimura, Kohtaro; Mizutani, Osamu; Nakasone, Kaoru; Sano, Motoaki; Shiraishi, Yohei; Tsukahara, Masatoshi; Gomi, Katsuya
2016-01-01
Awamori is a traditional distilled beverage made from steamed Thai-Indica rice in Okinawa, Japan. Two microbes are involved in brewing the liquor: the local kuro (black) koji mold Aspergillus luchuensis and the awamori yeast Saccharomyces cerevisiae. Whereas yeasts are used for ethanol fermentation throughout the world, a characteristic of Japanese fermentation industries is the use of Aspergillus molds as a source of enzymes for the maceration and saccharification of raw materials. Here we report the draft genome of a kuro (black) koji mold, A. luchuensis NBRC 4314 (RIB 2604). The total length of nonredundant sequences was nearly 34.7 Mb, comprising approximately 2,300 contigs with 16 telomere-like sequences. In total, 11,691 genes were predicted to encode proteins. Most of the housekeeping genes, such as those for transcription factors and the N- and O-glycosylation system, were conserved with respect to Aspergillus niger and Aspergillus oryzae. An alternative oxidase and an acid-stable α-amylase, related to citric acid production and fermentation at a low pH, as well as a unique glutamic peptidase were also found in the genome. Furthermore, key biosynthetic gene clusters of ochratoxin A and fumonisin B were absent when compared with the A. niger genome, supporting the safety of A. luchuensis for food and beverage production. This genome information will facilitate not only comparative genomics with industrial kuro-koji molds, but also molecular breeding of the molds to improve awamori fermentation. PMID:27651094
Masih, Aradhana; Singh, Pradeep K; Kathuria, Shallu; Agarwal, Kshitij; Meis, Jacques F; Chowdhary, Anuradha
2016-09-01
Aspergillus species cause a wide spectrum of clinical infections. Although Aspergillus fumigatus and Aspergillus flavus remain the most commonly isolated species in aspergillosis, in the last decade, rare and cryptic Aspergillus species have emerged in diverse clinical settings. The present study analyzed the distribution and in vitro antifungal susceptibility profiles of rare Aspergillus species in clinical samples from patients with suspected aspergillosis in 8 medical centers in India. Further, a matrix-assisted laser desorption ionization-time of flight mass spectrometry in-house database was developed to identify these clinically relevant Aspergillus species. β-Tubulin and calmodulin gene sequencing identified 45 rare Aspergillus isolates to the species level, except for a solitary isolate. They included 23 less common Aspergillus species belonging to 12 sections, mainly in Circumdati, Nidulantes, Flavi, Terrei, Versicolores, Aspergillus, and Nigri. Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) identified only 8 (38%) of the 23 rare Aspergillus isolates to the species level. Following the creation of an in-house database with the remaining 14 species not available in the Bruker database, the MALDI-TOF MS identification rate increased to 95%. Overall, high MICs of ≥2 μg/ml were noted for amphotericin B in 29% of the rare Aspergillus species, followed by voriconazole in 20% and isavuconazole in 7%, whereas MICs of >0.5 μg/ml for posaconazole were observed in 15% of the isolates. Regarding the clinical diagnoses in 45 patients with positive rare Aspergillus species cultures, 19 (42%) were regarded to represent colonization. In the remaining 26 patients, rare Aspergillus species were the etiologic agent of invasive, chronic, and allergic bronchopulmonary aspergillosis, allergic fungal rhinosinusitis, keratitis, and mycetoma. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Masih, Aradhana; Singh, Pradeep K.; Kathuria, Shallu; Agarwal, Kshitij
2016-01-01
Aspergillus species cause a wide spectrum of clinical infections. Although Aspergillus fumigatus and Aspergillus flavus remain the most commonly isolated species in aspergillosis, in the last decade, rare and cryptic Aspergillus species have emerged in diverse clinical settings. The present study analyzed the distribution and in vitro antifungal susceptibility profiles of rare Aspergillus species in clinical samples from patients with suspected aspergillosis in 8 medical centers in India. Further, a matrix-assisted laser desorption ionization–time of flight mass spectrometry in-house database was developed to identify these clinically relevant Aspergillus species. β-Tubulin and calmodulin gene sequencing identified 45 rare Aspergillus isolates to the species level, except for a solitary isolate. They included 23 less common Aspergillus species belonging to 12 sections, mainly in Circumdati, Nidulantes, Flavi, Terrei, Versicolores, Aspergillus, and Nigri. Matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) identified only 8 (38%) of the 23 rare Aspergillus isolates to the species level. Following the creation of an in-house database with the remaining 14 species not available in the Bruker database, the MALDI-TOF MS identification rate increased to 95%. Overall, high MICs of ≥2 μg/ml were noted for amphotericin B in 29% of the rare Aspergillus species, followed by voriconazole in 20% and isavuconazole in 7%, whereas MICs of >0.5 μg/ml for posaconazole were observed in 15% of the isolates. Regarding the clinical diagnoses in 45 patients with positive rare Aspergillus species cultures, 19 (42%) were regarded to represent colonization. In the remaining 26 patients, rare Aspergillus species were the etiologic agent of invasive, chronic, and allergic bronchopulmonary aspergillosis, allergic fungal rhinosinusitis, keratitis, and mycetoma. PMID:27413188
A multilocus database for the identification of Aspergillus and Penicillium species
USDA-ARS's Scientific Manuscript database
Identification of Aspergillus and Penicillium isolates using phenotypic methods is increasingly complex and difficult but genetic tools allow recognition and description of species formerly unrecognized or cryptic. We constructed a web-based taxonomic database using BIGSdb for the identification of ...
Singh, Nitin Kumar; Blachowicz, Adriana; Checinska, Aleksandra; Wang, Clay; Venkateswaran, Kasthuri
2016-07-14
Draft genome sequences of Aspergillus fumigatus strains (ISSFT-021 and IF1SW-F4), opportunistic pathogens isolated from the International Space Station (ISS), were assembled to facilitate comparison of the virulence characteristics of the ISS strains with those of clinical strains isolated on Earth. Copyright © 2016 Singh et al.
Genome Sequences of Three Strains of Aspergillus flavus for the Biological Control of Aflatoxin
Scheffler, Brian E.; Duke, Mary; Ballard, Linda; Abbas, Hamed K.; Grodowitz, Michael J.
2017-01-01
Aflatoxin is a carcinogenic contaminant of many commodities that are infected by Aspergillus flavus. Nonaflatoxigenic strains of A. flavus have been utilized as biological control agents. Here, we report the genome sequences from three biocontrol strains. This information will be useful in developing markers for postrelease monitoring of these fungi. PMID:29097466
Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Kusuya, Yoko; Takahashi, Hiroki; Yaguchi, Takashi
2017-04-26
Accurate identification of Aspergillus species is a very important subject. Mass spectral fingerprinting using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is generally employed for the rapid identification of fungal isolates. However, the results are based on simple mass spectral pattern-matching, with no peak assignment and no taxonomic input. We propose here a ribosomal subunit protein (RSP) typing technique using MALDI-TOF MS for the identification and discrimination of Aspergillus species. The results are concluded to be phylogenetic in that they reflect the molecular evolution of housekeeping RSPs. The amino acid sequences of RSPs of genome-sequenced strains of Aspergillus species were first verified and compared to compile a reliable biomarker list for the identification of Aspergillus species. In this process, we revealed that many amino acid sequences of RSPs (about 10-60%, depending on strain) registered in the public protein databases needed to be corrected or newly added. The verified RSPs were allocated to RSP types based on their mass. Peak assignments of RSPs of each sample strain as observed by MALDI-TOF MS were then performed to set RSP type profiles, which were then further processed by means of cluster analysis. The resulting dendrogram based on RSP types showed a relatively good concordance with the tree based on β-tubulin gene sequences. RSP typing was able to further discriminate the strains belonging to Aspergillus section Fumigati. The RSP typing method could be applied to identify Aspergillus species, even for species within section Fumigati. The discrimination power of RSP typing appears to be comparable to conventional β-tubulin gene analysis. This method would therefore be suitable for species identification and discrimination at the strain to species level. 
Because RSP typing can characterize the strains within section Fumigati, this method has potential as a powerful and reliable tool in the field of clinical microbiology.
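The RSP typing steps described in this abstract (allocate verified RSPs to types by mass, assign observed MALDI-TOF MS peaks to those types, then compare the resulting type profiles) can be illustrated with a minimal Python sketch. Everything specific below is an assumption for illustration only: the RSP names, the masses, the 500 ppm tolerance, and the simple mismatch-fraction distance are hypothetical stand-ins, since the abstract gives none of these details and does not specify how the cluster analysis was computed.

```python
from typing import Dict, List, Optional

# Hypothetical reference table: RSP name -> {type label -> theoretical mass in Da}.
# All names and masses are invented for illustration; they are not from the study.
REFERENCE: Dict[str, Dict[str, float]] = {
    "L29": {"A": 6500.0, "B": 6514.0},
    "S28": {"A": 7700.0},
}


def assign_rsp_types(
    peaks: List[float],
    reference: Dict[str, Dict[str, float]] = REFERENCE,
    tolerance_ppm: float = 500.0,  # assumed tolerance, not taken from the abstract
) -> Dict[str, Optional[str]]:
    """Return, for each RSP, the first type whose mass matches an observed peak."""
    profile: Dict[str, Optional[str]] = {}
    for rsp, types in reference.items():
        matched: Optional[str] = None
        for label, mass in types.items():
            window = mass * tolerance_ppm / 1e6  # absolute tolerance in Da
            if any(abs(peak - mass) <= window for peak in peaks):
                matched = label
                break
        profile[rsp] = matched  # None means no peak could be assigned
    return profile


def profile_distance(a: Dict[str, Optional[str]], b: Dict[str, Optional[str]]) -> float:
    """Fraction of RSPs typed in both profiles whose assigned types differ."""
    shared = [k for k in a if a[k] is not None and b.get(k) is not None]
    if not shared:
        return 1.0  # nothing comparable: treat as maximally distant
    return sum(a[k] != b[k] for k in shared) / len(shared)


strain_1 = assign_rsp_types([6500.9, 7703.0])  # {'L29': 'A', 'S28': 'A'}
strain_2 = assign_rsp_types([6513.5, 7700.2])  # {'L29': 'B', 'S28': 'A'}
print(profile_distance(strain_1, strain_2))    # 0.5
```

A pairwise distance matrix built this way could be passed to any standard hierarchical clustering routine to draw a dendrogram; which linkage method and distance measure the authors actually used is not stated in the abstract.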
Ge, Yongyi; Wang, Yuchen; Liu, YongXiang; Tan, Yumei; Ren, Xiuxiu; Zhang, Xinyu; Hyde, Kevin D; Liu, Yongfeng; Liu, Zuoyi
2016-06-07
Aspergillus cristatus is the dominant fungus involved in the fermentation of Chinese Fuzhuan brick tea. Aspergillus cristatus is a homothallic fungus that undergoes a sexual stage without asexual conidiation when cultured in hypotonic medium. The asexual stage is induced by a high salt concentration, which completely inhibits sexual development. The taxon is therefore appropriate for investigating the mechanisms of asexual and sexual reproduction in fungi. In this study, de novo genome sequencing and analysis of transcriptomes during culture under high- and low-osmolarity conditions were performed. These analyses facilitated investigation of the evolution of mating-type genes, which determine the mode of sexual reproduction, in A. cristatus, the response of the high-osmolarity glycerol (HOG) pathway to osmotic stimulation, and the detection of mycotoxins and evaluation of the relationship with the location of the encoding genes. The A. cristatus genome comprised 27.9 Mb and included 68 scaffolds, from which 10,136 protein-coding gene models were predicted. A phylogenetic analysis suggested a considerable phylogenetic distance between A. cristatus and A. nidulans. Comparison of the mating-type gene loci among Aspergillus species indicated that the mode in A. cristatus differs from those in other Aspergillus species. The components of the HOG pathway were conserved in the genome of A. cristatus. Differential gene expression analysis in A. cristatus using RNA-Seq demonstrated that the expression of most genes in the HOG pathway was unaffected by osmotic pressure. No gene clusters associated with the production of carcinogens were detected. A model of the mating-type locus in A. cristatus is reported for the first time. Aspergillus cristatus has evolved various mechanisms to cope with high osmotic stress. As a fungus associated with Fuzhuan tea, it is considered to be safe under low- and high-osmolarity conditions.
Nierman, William C; Yu, Jiujiang; Fedorova-Abrams, Natalie D; Losada, Liliana; Cleveland, Thomas E; Bhatnagar, Deepak; Bennett, Joan W; Dean, Ralph; Payne, Gary A
2015-04-16
Aflatoxin contamination of food and livestock feed results in significant annual crop losses internationally. Aspergillus flavus is the major fungus responsible for this loss. Additionally, A. flavus is the second leading cause of aspergillosis in immunocompromised human patients. Here, we report the genome sequence of strain NRRL 3357. Copyright © 2015 Nierman et al.
Raethong, Nachon; Wong-ekkabut, Jirasak; Laoteng, Kobkul; Vongsangnak, Wanwipa
2016-01-01
Aspergillus oryzae is widely used for the industrial production of enzymes. In A. oryzae metabolism, transporters appear to play crucial roles in controlling the flux of molecules for energy generation, nutrient delivery, and waste elimination in the cell. While the A. oryzae genome sequence is available, transporter annotation remains limited and thus the connectivity of metabolic networks is incomplete. In this study, we developed a metabolic annotation strategy to understand the relationship between sequence, structure, and function for the annotation of A. oryzae metabolic transporters. Sequence-based analysis with manual curation showed that 58 of the 12,096 total genes in the A. oryzae genome encoded metabolic transporters. Under consensus integrative databases, 55 unambiguous metabolic transporter genes were distributed into channels and pores (7 genes), electrochemical potential-driven transporters (33 genes), and primary active transporters (15 genes). To reveal the functional roles of the transporters, a combination of homology modeling and molecular dynamics simulation was implemented to assess the relationships from sequence to structure and from structure to function. Because of its role in the energy metabolism of A. oryzae, the H+-ATPase encoded by the AO090005000842 gene was selected as a representative case study of multilevel linkage annotation. Our developed strategy can be used for enhancing metabolic network reconstruction. PMID:27274991
Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Yaguchi, Takashi
2016-01-01
We have previously proposed a rapid identification method for bacterial strains based on the profiles of their ribosomal subunit proteins (RSPs), observed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). This method can perform phylogenetic characterization based on the mass of housekeeping RSP biomarkers, ideally calculated from amino acid sequence information registered in public protein databases. With the aim of extending its field of application to medical mycology, this study investigates the actual state of information of RSPs of eukaryotic fungi registered in public protein databases through the characterization of ribosomal protein fractions extracted from genome-sequenced Aspergillus fumigatus strains Af293 and A1163 as a model. In this process, we have found that the public protein databases harbor problems. The RSP names are in confusion, so we have provisionally unified them using the yeast naming system. The most serious problem is that many incorrect sequences are registered in the public protein databases. Surprisingly, more than half of the sequences are incorrect, due chiefly to mis-annotation of exon/intron structures. These errors could be corrected by a combination of in silico inspection by sequence homology analysis and MALDI-TOF MS measurements. We were also able to confirm conserved post-translational modifications in eleven RSPs. After these verifications, the masses of 31 expressed RSPs under 20,000 Da could be accurately confirmed. These RSPs have a potential to be useful biomarkers for identifying clinical isolates of A. fumigatus. PMID:27843740
Scientific Advances with Aspergillus Species that Are Used for Food and Biotech Applications.
Biesebeke, Rob Te; Record, Erik
2008-01-01
Yeast and filamentous fungi have been used for centuries in diverse biotechnological processes. Fungal fermentation technology is traditionally used in relation to food production, such as for bread, beer, cheese, sake and soy sauce. Last century, the industrial application of yeast and filamentous fungi expanded rapidly, with excellent examples such as purified enzymes and secondary metabolites (e.g. antibiotics), which are used in a wide range of food as well as non-food industries. Research on protein and/or metabolite secretion by fungal species has focused on identifying bottlenecks in (post-) transcriptional regulation of protein production, metabolic rerouting, morphology and the transit of proteins through the secretion pathway. In past years, genome sequencing of some fungi (e.g. Aspergillus oryzae, Aspergillus niger) has been completed. The available genome sequences have enabled identification of genes and functionally important regions of the genome. This has directed research to focus on a post-genomics era in which transcriptomics, proteomics and metabolomics methodologies will help to explore the scientific relevance and industrial application of fungal genome sequences.
An, Junghwa; Bechet, Arnaud; Berggren, Asa; Brown, Sarah K; Bruford, Michael W; Cai, Qingui; Cassel-Lundhagen, Anna; Cezilly, Frank; Chen, Song-Lin; Cheng, Wei; Choi, Sung-Kyoung; Ding, X Y; Fan, Yong; Feldheim, Kevin A; Feng, Z Y; Friesen, Vicki L; Gaillard, Maria; Galaraza, Juan A; Gallo, Leonardo; Ganeshaiah, K N; Geraci, Julia; Gibbons, John G; Grant, William S; Grauvogel, Zac; Gustafsson, S; Guyon, Jeffrey R; Han, L; Heath, Daniel D; Hemmilä, S; Hogan, J Derek; Hou, B W; Jakse, Jernej; Javornik, Branka; Kaňuch, Peter; Kim, Kyung-Kil; Kim, Kyung-Seok; Kim, Sang-Gyu; Kim, Sang-In; Kim, Woo-Jin; Kim, Yi-Kyung; Klich, Maren A; Kreiser, Brian R; Kwan, Ye-Seul; Lam, Athena W; Lasater, Kelly; Lascoux, M; Lee, Hang; Lee, Yun-Sun; Li, D L; Li, Shao-Jing; Li, W Y; Liao, Xiaolin; Liber, Zlatko; Lin, Lin; Liu, Shaoying; Luo, Xin-Hui; Ma, Y H; Ma, Yajun; Marchelli, Paula; Min, Mi-Sook; Moccia, Maria Domenica; Mohana, Kumara P; Moore, Marcelle; Morris-Pocock, James A; Park, Han-Chan; Pfunder, Monika; Ivan, Radosavljević; Ravikanth, G; Roderick, George K; Rokas, Antonis; Sacks, Benjamin N; Saski, Christopher A; Satovic, Zlatko; Schoville, Sean D; Sebastiani, Federico; Sha, Zhen-Xia; Shin, Eun-Ha; Soliani, Carolina; Sreejayan, N; Sun, Zhengxin; Tao, Yong; Taylor, Scott A; Templin, William D; Shaanker, R Uma; Vasudeva, R; Vendramin, Giovanni G; Walter, Ryan P; Wang, Gui-Zhong; Wang, Ke-Jian; Wang, Y Q; Wattier, Rémi A; Wei, Fuwen; Widmer, Alex; Woltmann, Stefan; Won, Yong-Jin; Wu, Jing; Xie, M L; Xu, Genbo; Xu, Xiao-Jun; Ye, Hai-Hui; Zhan, Xiangjiang; Zhang, F; Zhong, J
2010-03-01
This article documents the addition of 411 microsatellite marker loci and 15 pairs of Single Nucleotide Polymorphism (SNP) sequencing primers to the Molecular Ecology Resources Database. Loci were developed for the following species: Acanthopagrus schlegeli, Anopheles lesteri, Aspergillus clavatus, Aspergillus flavus, Aspergillus fumigatus, Aspergillus oryzae, Aspergillus terreus, Branchiostoma japonicum, Branchiostoma belcheri, Colias behrii, Coryphopterus personatus, Cynoglossus semilaevis, Dendrobium officinale, Dysoxylum malabaricum, Metrioptera roeselii, Myrmeciza exsul, Ochotona thibetana, Neosartorya fischeri, Nothofagus pumilio, Onychodactylus fischeri, Phoenicopterus roseus, Salvia officinalis L., Scylla paramamosain, Silene latifo, Sula sula, and Vulpes vulpes. These loci were cross-tested on the following species: Aspergillus giganteus, Colias pelidne, Colias interior, Colias meadii, Colias eurytheme, Coryphopterus lipernes, Coryphopterus glaucofrenum, Coryphopterus eidolon, Gnatholepis thompsoni, Elacatinus evelynae, Dendrobium loddigesii, Dendrobium devonianum, Dysoxylum binectariferum, Nothofagus antarctica, Nothofagus dombeyii, Nothofagus nervosa, Nothofagus obliqua, Sula nebouxii, and Sula variegata. This article also documents the addition of 39 sequencing primer pairs and 15 allele specific primers or probes for Paralithodes camtschaticus. © 2010 Blackwell Publishing Ltd.
Sleiman, Sue; Halliday, Catriona L.; Chapman, Belinda; Brown, Mitchell; Nitschke, Joanne; Lau, Anna F.
2016-01-01
We developed an Australian database for the identification of Aspergillus, Scedosporium, and Fusarium species (n = 28) by matrix-assisted laser desorption ionization−time of flight mass spectrometry (MALDI-TOF MS). In a challenge against 117 isolates, species identification significantly improved when the in-house-built database was combined with the Bruker Filamentous Fungi Library compared with that for the Bruker library alone (Aspergillus, 93% versus 69%; Fusarium, 84% versus 42%; and Scedosporium, 94% versus 18%, respectively). PMID:27252460
Borsa, Barış Ata; Özgün, Gonca; Houbraken, Jos; Ökmen, Fırat
2015-01-01
The vast majority of vaginal fungal infections are caused by Candida species, whereas vaginitis cases caused by molds are extremely rare. Aspergillus protuberus was previously known as a member of Aspergillus section Versicolores, which can cause opportunistic infections in immunocompromised patients; however, it has recently been described as a separate species. Although members of Aspergillus section Versicolores have been isolated rarely in cases of pulmonary infections, eye infections, otomycosis, osteomyelitis and onychomycosis, to the best of our knowledge there is no published case of human infection caused by A. protuberus. In this report, the first case of persistent vaginitis due to A. protuberus in an immunocompetent patient is presented. A 42-year-old female patient was admitted to our hospital with complaints of pelvic pain, vaginal itching and discharge lasting one month. Her symptoms had persisted despite miconazole nitrate and clotrimazole therapies for probable candidal vaginitis. Fungal structures such as branched, septate hyphae together with conidial forms were seen on microscopic examination of the cervical smear. Thereafter, a vaginal discharge sample was taken for microbiological evaluation, and fungal structures with characteristics similar to those in the cervical smear were observed on microscopic examination. The preliminary result was then reported as Aspergillus spp. At the same time, the sample was plated on Sabouraud dextrose agar (SDA) in duplicate and incubated at room temperature and at 37°C. After 5 days, white, powdery and pure-looking fungal colonies were observed on the SDA plate incubated at room temperature, while the other medium remained sterile. The culture was submitted to the CBS-KNAW Fungal Biodiversity Center for further characterization. Phenotypic identification showed that the isolated strain belonged to the Aspergillus section Versicolores.
The strain was grown for 7 days on malt extract agar, and ITS regions were then amplified and sequenced from isolated DNA for genomic characterization. The obtained sequences were compared with the NCBI database and internal databases of the CBS-KNAW Fungal Biodiversity Centre and confirmed as Aspergillus section Versicolores. As a result of recent changes in the classification of fungi, analysis of partial β-tubulin and calmodulin sequences was also used to obtain a detailed and precise characterization. Eventually, the strain was identified as A. protuberus, which is a recently accepted species distinct from Aspergillus section Versicolores. As the patient could not be contacted after the preliminary report, detailed demographic information, the probable origin and route of transmission of the agent, and the prognosis of the infection remained obscure. In conclusion, the first case of vaginitis caused by A. protuberus is described in this report with the support of clinical, pathological, microbiological and molecular data.
A novel non-thermostable deuterolysin from Aspergillus oryzae.
Maeda, Hiroshi; Katase, Toru; Sakai, Daisuke; Takeuchi, Michio; Kusumoto, Ken-Ichi; Amano, Hitoshi; Ishida, Hiroki; Abe, Keietsu; Yamagata, Youhei
2016-09-01
Three putative deuterolysin (EC 3.4.24.29) genes (deuA, deuB, and deuC) were found in the Aspergillus oryzae genome database (http://www.bio.nite.go.jp/dogan/project/view/AO). One of these genes, deuA, corresponded to the previously reported NpII gene. DeuA and DeuB were overexpressed in recombinant A. oryzae and purified. The degradation profiles of both enzymes against protein substrates were similar, but DeuB showed wider substrate specificity against peptidyl MCA-substrates than DeuA. The enzymatic profiles of DeuB, except for thermostability, also resembled those of DeuA. Unlike thermostable DeuA, DeuB was inactivated by heat treatment above 80 °C. Transcription analysis in wild-type A. oryzae showed that only deuB was expressed in liquid culture, and the addition of a proteinous substrate upregulated its transcription. Furthermore, the addition of NaNO3 seems to eliminate the effect of the proteinous substrate on the transcription of deuB.
On the way toward systems biology of Aspergillus fumigatus infection.
Albrecht, Daniela; Kniemeyer, Olaf; Mech, Franziska; Gunzer, Matthias; Brakhage, Axel; Guthke, Reinhard
2011-06-01
Pathogenicity of Aspergillus fumigatus is multifactorial. Thus, global studies are essential for the understanding of the infection process. Therefore, a data warehouse was established where genome sequence, transcriptome and proteome data are stored. These data are analyzed for the elucidation of virulence determinants. The data analysis workflow starts with pre-processing, including imputation of missing values and normalization. The last step is the identification of differentially expressed genes/proteins as interesting candidates for further analysis, in particular for functional categorization and correlation studies. Sequence data and other prior knowledge extracted from databases are integrated to support the inference of gene regulatory networks associated with pathogenicity. This knowledge-assisted data analysis aims at establishing mathematical models with predictive strength to assist further experimental work. Recently, first steps were taken to extend the integrative data analysis and computational modeling by evaluating spatio-temporal data (movies) that monitor interactions of A. fumigatus morphotypes (e.g. conidia) with host immune cells. Copyright © 2011 Elsevier GmbH. All rights reserved.
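The workflow outlined in the abstract above (imputation of missing values, normalization, then selection of differentially expressed genes) can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the row-mean imputation, the log2 median-centering, the fold-change cutoff, and all gene names and intensity values are assumptions chosen to keep the example small.

```python
import math
from statistics import mean, median

def impute_missing(rows):
    """Replace None in each gene's row with the mean of that row's observed values."""
    out = {}
    for gene, values in rows.items():
        observed = [v for v in values if v is not None]
        fill = mean(observed)
        out[gene] = [fill if v is None else v for v in values]
    return out

def log2_median_center(rows):
    """Log2-transform every value, then subtract each sample's (column's) median."""
    logged = {g: [math.log2(v) for v in vals] for g, vals in rows.items()}
    n_samples = len(next(iter(logged.values())))
    medians = [median(vals[i] for vals in logged.values()) for i in range(n_samples)]
    return {g: [v - medians[i] for i, v in enumerate(vals)] for g, vals in logged.items()}

def differential(rows, control_idx, treated_idx, min_abs_log2fc=1.0):
    """Return genes whose mean log2 difference (treated - control) meets the cutoff."""
    hits = {}
    for gene, vals in rows.items():
        lfc = mean(vals[i] for i in treated_idx) - mean(vals[i] for i in control_idx)
        if abs(lfc) >= min_abs_log2fc:
            hits[gene] = lfc
    return hits

# Hypothetical intensities: samples 0-1 are controls, samples 2-3 are treated.
raw = {
    "geneA": [8, 8, 32, 32],
    "geneB": [16, None, 16, 16],   # one missing measurement
    "geneC": [32, 32, 8, 8],
}
normalized = log2_median_center(impute_missing(raw))
hits = differential(normalized, control_idx=[0, 1], treated_idx=[2, 3])
print(hits)
```

A real analysis would normally add a statistical test with multiple-testing correction instead of relying on a fold-change cutoff alone; the cutoff is used here only to keep the sketch short.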
Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae
Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim; Krogsgaard, Steen; Nielsen, Jens
2008-01-01
Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released, but hypothetical proteins accounted for more than 50% of the annotated genes. Considering the industrial importance of this fungus, it is therefore valuable to improve the annotation and further integrate genomic information with biochemical and physiological information available for this microorganism and other related fungi. Here we propose gene prediction by construction of an A. oryzae Expressed Sequence Tag (EST) library, sequencing and assembly. We enhanced the function assignment with an annotation strategy that we developed. The resulting improved annotation was used to reconstruct the metabolic network, leading to a genome-scale metabolic model of A. oryzae. Results: Using our assembled EST sequences, we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Notably, our annotation strategy resulted in the assignment of new putative functions to 1,469 hypothetical proteins already present in the A. oryzae genome database. Using the substantially improved annotated genome, we reconstructed the metabolic network of A. oryzae. This network contains 729 enzymes, 1,314 enzyme-encoding genes, 1,073 metabolites and 1,846 (1,053 unique) biochemical reactions. The metabolic reactions are compartmentalized into the cytosol, the mitochondria, the peroxisome and the extracellular space. Transport steps between the compartments and the extracellular space represent 281 reactions, of which 161 are unique. The metabolic model was validated and shown to correctly describe the phenotypic behavior of A. oryzae grown on different carbon sources.
Conclusion: A much-enhanced annotation of the A. oryzae genome was performed and a genome-scale metabolic model of A. oryzae was reconstructed. The model accurately predicted the growth and biomass yield on different carbon sources. The model serves as an important resource for gaining further insight into A. oryzae physiology. PMID:18500999
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize
2010-01-01
Background: Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. Results: In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. Conclusions: CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets.
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu. PMID:20946609
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize.
Kelley, Rowena Y; Gresham, Cathy; Harper, Jonathan; Bridges, Susan M; Warburton, Marilyn L; Hawkins, Leigh K; Pechanova, Olga; Peethambaran, Bela; Pechan, Tibor; Luthe, Dawn S; Mylroie, J E; Ankala, Arunkanth; Ozkan, Seval; Henry, W B; Williams, W P
2010-10-07
Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu.
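As a rough illustration of the kind of cross-dataset query described in the abstract above, the sketch below ranks genes by how many independent lines of evidence support them. It is not the CFRAS-DB schema: the table names, column names, gene identifiers and study labels are all hypothetical, and Python's built-in sqlite3 module stands in for MySQL so that the example is self-contained.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gene (gene_id TEXT PRIMARY KEY, description TEXT);
        CREATE TABLE evidence (
            gene_id TEXT REFERENCES gene(gene_id),
            evidence_type TEXT,   -- e.g. microarray, proteomics, QTL, SNP
            study TEXT
        );
    """)
    conn.executemany("INSERT INTO gene VALUES (?, ?)", [
        ("gene_1", "hypothetical candidate 1"),
        ("gene_2", "hypothetical candidate 2"),
        ("gene_3", "hypothetical candidate 3"),
    ])
    conn.executemany("INSERT INTO evidence VALUES (?, ?, ?)", [
        ("gene_1", "microarray", "study_a"),
        ("gene_1", "proteomics", "study_b"),
        ("gene_1", "QTL", "study_c"),
        ("gene_2", "microarray", "study_a"),
        ("gene_2", "microarray", "study_d"),
        ("gene_3", "QTL", "study_c"),
        ("gene_3", "SNP", "study_e"),
    ])
    return conn

def candidates(conn, min_types=2):
    """Genes supported by at least `min_types` distinct evidence types."""
    return conn.execute("""
        SELECT g.gene_id, COUNT(DISTINCT e.evidence_type) AS n_types
        FROM gene AS g
        JOIN evidence AS e ON e.gene_id = g.gene_id
        GROUP BY g.gene_id
        HAVING COUNT(DISTINCT e.evidence_type) >= ?
        ORDER BY n_types DESC, g.gene_id
    """, (min_types,)).fetchall()

conn = build_demo_db()
print(candidates(conn))
```

Counting distinct evidence types rather than rows means a gene reported twice by the same kind of experiment (gene_2 here) is not ranked above one supported by genuinely different approaches.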
Clinical utility and development of biomarkers in invasive aspergillosis.
Patterson, Thomas F
2011-01-01
The diagnosis of invasive aspergillosis remains very difficult, and there are limited treatment options for the disease. Pre-clinical models have been used to evaluate the diagnosis and treatment of Aspergillus infection and to assess the pathogenicity and virulence of the organism. Extensive efforts in Aspergillus research have significantly expanded the genomic information about this microorganism. The standardization of animal models of invasive aspergillosis can be used to enhance the evaluation of genomic information about the organism to improve the diagnosis and treatment of invasive aspergillosis. One approach to this process has been the award of a contract by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health to establish and standardize animal models of invasive aspergillosis for the development of new diagnostic technologies for both pulmonary and disseminated Aspergillus infection. This work utilizes molecular approaches for the genetic manipulation of Aspergillus strains that can be tested in animal-model systems to establish new diagnostic targets and tools. Studies have evaluated the performance characteristics of assays for cell-wall antigens of Aspergillus including galactomannan and beta-D-glucan, as well as for DNA targets in the organism, through PCR. New targets, such as proteomic and genomic approaches, and novel detection methods, such as point-of-care lateral-flow devices, have also been evaluated. The goal of this paper is to provide a framework for evaluating genomic targets in animal models to improve the diagnosis and treatment of invasive aspergillosis toward ultimately improving the outcomes for patients with this frequently fatal infection.
Sleiman, Sue; Halliday, Catriona L; Chapman, Belinda; Brown, Mitchell; Nitschke, Joanne; Lau, Anna F; Chen, Sharon C-A
2016-08-01
We developed an Australian database for the identification of Aspergillus, Scedosporium, and Fusarium species (n = 28) by matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS). In a challenge against 117 isolates, species identification significantly improved when the in-house-built database was combined with the Bruker Filamentous Fungi Library compared with that for the Bruker library alone (Aspergillus, 93% versus 69%; Fusarium, 84% versus 42%; and Scedosporium, 94% versus 18%, respectively). Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Kniemeyer, Olaf
2011-08-01
Fungal species of the genus Aspergillus play significant roles as model organisms in basic research, as "cell factories" for the production of organic acids, pharmaceuticals or industrially important enzymes and as pathogens causing superficial and invasive infections in animals and humans. The release of the genome sequences of several Aspergillus spp. has paved the way for global analyses of protein expression in Aspergilli, including the characterisation of proteins that have not yet been assigned any function. With the application of proteomic methods, particularly 2-D gel and LC-MS/MS-based methods, first insights into the composition of the proteome of Aspergilli under different growth and stress conditions could be gained. The identification of putative targets of global regulators has led to the improvement of industrially relevant Aspergillus strains, and previously undescribed Aspergillus antigens have already been discovered. Here, I review the recent proteome data generated for the species Aspergillus nidulans, Aspergillus fumigatus, Aspergillus niger, Aspergillus terreus, Aspergillus flavus and Aspergillus oryzae. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
de Vries, Ronald P; Riley, Robert; Wiebenga, Ad; Aguilar-Osorio, Guillermo; Amillis, Sotiris; Uchima, Cristiane Akemi; Anderluh, Gregor; Asadollahi, Mojtaba; Askin, Marion; Barry, Kerrie; Battaglia, Evy; Bayram, Özgür; Benocci, Tiziano; Braus-Stromeyer, Susanna A; Caldana, Camila; Cánovas, David; Cerqueira, Gustavo C; Chen, Fusheng; Chen, Wanping; Choi, Cindy; Clum, Alicia; Dos Santos, Renato Augusto Corrêa; Damásio, André Ricardo de Lima; Diallinas, George; Emri, Tamás; Fekete, Erzsébet; Flipphi, Michel; Freyberg, Susanne; Gallo, Antonia; Gournas, Christos; Habgood, Rob; Hainaut, Matthieu; Harispe, María Laura; Henrissat, Bernard; Hildén, Kristiina S; Hope, Ryan; Hossain, Abeer; Karabika, Eugenia; Karaffa, Levente; Karányi, Zsolt; Kraševec, Nada; Kuo, Alan; Kusch, Harald; LaButti, Kurt; Lagendijk, Ellen L; Lapidus, Alla; Levasseur, Anthony; Lindquist, Erika; Lipzen, Anna; Logrieco, Antonio F; MacCabe, Andrew; Mäkelä, Miia R; Malavazi, Iran; Melin, Petter; Meyer, Vera; Mielnichuk, Natalia; Miskei, Márton; Molnár, Ákos P; Mulé, Giuseppina; Ngan, Chew Yee; Orejas, Margarita; Orosz, Erzsébet; Ouedraogo, Jean Paul; Overkamp, Karin M; Park, Hee-Soo; Perrone, Giancarlo; Piumi, Francois; Punt, Peter J; Ram, Arthur F J; Ramón, Ana; Rauscher, Stefan; Record, Eric; Riaño-Pachón, Diego Mauricio; Robert, Vincent; Röhrig, Julian; Ruller, Roberto; Salamov, Asaf; Salih, Nadhira S; Samson, Rob A; Sándor, Erzsébet; Sanguinetti, Manuel; Schütze, Tabea; Sepčić, Kristina; Shelest, Ekaterina; Sherlock, Gavin; Sophianopoulou, Vicky; Squina, Fabio M; Sun, Hui; Susca, Antonia; Todd, Richard B; Tsang, Adrian; Unkles, Shiela E; van de Wiele, Nathalie; van Rossen-Uffink, Diana; Oliveira, Juliana Velasco de Castro; Vesth, Tammi C; Visser, Jaap; Yu, Jae-Hyuk; Zhou, Miaomiao; Andersen, Mikael R; Archer, David B; Baker, Scott E; Benoit, Isabelle; Brakhage, Axel A; Braus, Gerhard H; Fischer, Reinhard; Frisvad, Jens C; Goldman, Gustavo H; Houbraken, Jos; Oakley, Berl; Pócsi, István; Scazzocchio, 
Claudio; Seiboth, Bernhard; vanKuyk, Patricia A; Wortman, Jennifer; Dyer, Paul S; Grigoriev, Igor V
2017-02-14
The fungal genus Aspergillus is of critical importance to humankind. Species include those with industrial applications, important pathogens of humans, animals and crops, a source of potent carcinogenic contaminants of food, and an important genetic model. The genome sequences of eight aspergilli have already been explored to investigate aspects of fungal biology, raising questions about evolution and specialization within this genus. We have generated genome sequences for ten novel, highly diverse Aspergillus species and compared these in detail to sister and more distant genera. Comparative studies of key aspects of fungal biology, including primary and secondary metabolism, stress response, biomass degradation, and signal transduction, revealed both conservation and diversity among the species. Observed genomic differences were validated with experimental studies. This revealed several highlights, such as the potential for sex in asexual species, organic acid production genes being a key feature of black aspergilli, alternative approaches for degrading plant biomass, and indications for the genetic basis of stress response. A genome-wide phylogenetic analysis demonstrated in detail the relationship of the newly genome sequenced species with other aspergilli. Many aspects of biological differences between fungal species cannot be explained by current knowledge obtained from genome sequences. The comparative genomics and experimental study, presented here, allows for the first time a genus-wide view of the biological diversity of the aspergilli and in many, but not all, cases linked genome differences to phenotype. Insights gained could be exploited for biotechnological and medical applications of fungi.
Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuo, Alan; Grigoriev, Igor
2009-04-17
Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12 percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9 percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90 percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.
Diversity, Application, and Synthetic Biology of Industrially Important Aspergillus Fungi.
Park, Hee-Soo; Jun, Sang-Cheol; Han, Kap-Hoon; Hong, Seung-Beom; Yu, Jae-Hyuk
2017-01-01
The filamentous fungal genus Aspergillus consists of over 340 officially recognized species. A handful of these Aspergillus fungi are predominantly used for food fermentation and large-scale production of enzymes, organic acids, and bioactive compounds. These industrially important Aspergilli primarily belong to the two major Aspergillus sections, Nigri and Flavi. Aspergillus oryzae (section Flavi) is the most commonly used mold for the fermentation of soybeans, rice, grains, and potatoes. Aspergillus niger (section Nigri) is used in the industrial production of various enzymes and organic acids, including 99% (1.4 million tons per year) of citric acid produced worldwide. Better understanding of the genomes and the signaling mechanisms of key Aspergillus species can help identify novel approaches to enhance these commercially significant strains. This review summarizes the diversity, current applications, key products, and synthetic biology of Aspergillus fungi commonly used in industry. Copyright © 2017 Elsevier Inc. All rights reserved.
Draft genome sequence of an aflatoxigenic Aspergillus species, A. bombycis
USDA-ARS's Scientific Manuscript database
The genome of the A. bombycis Type strain was sequenced using a Personal Genome Machine, followed by annotation of its predicted genes. The genome size for A. bombycis was found to be approximately 37 Mb and contained 12,266 genes. This announcement introduces a sequenced genome for an aflatoxigenic...
Futagami, Taiki; Mori, Kazuki; Yamashita, Ayaka; Wada, Shotaro; Kajiwara, Yasuhiro; Takashita, Hideharu; Omori, Toshiro; Takegawa, Kaoru; Tashiro, Kosuke; Kuhara, Satoru; Goto, Masatoshi
2011-11-01
The filamentous fungus Aspergillus kawachii has traditionally been used for brewing the Japanese distilled spirit shochu. A. kawachii characteristically hyperproduces citric acid and a variety of polysaccharide glycoside hydrolases. Here the genome sequence of A. kawachii IFO 4308 was determined and annotated. Analysis of the sequence may provide insight into the properties of this fungus that make it superior for use in shochu production, leading to the further development of A. kawachii for industrial applications.
Gil-Serna, Jessica; Vázquez, Covadonga; González-Jaén, María Teresa; Patiño, Belén
2015-12-02
Aspergillus steynii is probably the most relevant species of section Circumdati producing ochratoxin A (OTA). This mycotoxin contaminates a wide range of commodities and is highly toxic to humans and animals. Little is known about the biosynthetic genes and their regulation in Aspergillus species. In this work, we identified and analysed three contiguous genes in A. steynii using 5'-RACE and genome walking approaches, which predicted a cytochrome P450 monooxygenase (p450ste), a non-ribosomal peptide synthetase (nrpsste) and a polyketide synthase (pksste). These three genes were contiguous within a 20742 bp long genomic DNA fragment. Their corresponding cDNAs were sequenced and their expression was analysed in three A. steynii strains using specific real-time RT-PCR assays under permissive conditions in in vitro cultures. OTA was also analysed in these cultures. Comparative analyses of predicted genomic, cDNA and amino acid sequences were performed with sequences of similar gene functions. All the results obtained in these analyses were consistent, pointed to the involvement of these three genes in OTA biosynthesis by A. steynii, and showed a co-ordinated expression pattern. This is the first time that a clustered organization of OTA biosynthetic genes has been reported in the genus Aspergillus. The results also suggest that this situation might be common in OTA-producing Aspergillus species and distinct from the one described for Penicillium species. Copyright © 2015 Elsevier B.V. All rights reserved.
Paul, Sujay; Zhang, Angel; Ludeña, Yvette; Villena, Gretty K; Yu, Fengan; Sherman, David H; Gutiérrez-Correa, Marcel
2017-06-10
Here, we report the complete genome sequence of a high alkaline cellulase producing Aspergillus fumigatus strain, LMB-35Aa, isolated from soil of the Peruvian Amazon rainforest. The genome is ∼27.5 Mb in size, comprises 228 scaffolds with an average GC content of 50%, and is predicted to contain a total of 8660 protein-coding genes, of which 6156 have a known function. It codes for 607 putative CAZyme families potentially involved in carbohydrate metabolism. Several important cellulose-degrading genes, such as endoglucanase A, endoglucanase B, endoglucanase D and beta-glucosidase, were also identified. The genome of A. fumigatus strain LMB-35Aa represents the first whole sequenced genome of a non-clinical, high cellulase producing A. fumigatus strain isolated from forest soil. Copyright © 2017 Elsevier B.V. All rights reserved.
Sun, Wei-Wen; Guo, Chun-Jun; Wang, Clay C C
2016-04-01
Genome sequencing of the fungus Aspergillus terreus uncovered a number of silent core structural biosynthetic genes encoding enzymes presumed to be involved in the production of cryptic secondary metabolites. There are five nonribosomal peptide synthetase (NRPS)-like genes with the predicted A-T-TE domain architecture within the A. terreus genome. Among the five genes, only the product of pgnA remains unknown. The Tet-on system is an inducible, tunable and metabolism-independent expression system originally developed for Aspergillus niger. Here we report the adoption of the Tet-on system as an effective gene activation tool in A. terreus. Application of this system in A. terreus allowed us to uncover the product of the cryptic NRPS-like gene, pgnA. Furthermore, expression of pgnA in the heterologous Aspergillus nidulans host suggested that the pgnA gene alone is necessary for phenguignardic acid (1) biosynthesis. Copyright © 2016 Elsevier Inc. All rights reserved.
Identification and functional analysis of the aspergillic acid gene cluster in Aspergillus flavus
USDA-ARS's Scientific Manuscript database
Aspergillus flavus can colonize important food staples and produces aflatoxins, toxic and carcinogenic secondary metabolites. In silico analysis of the A. flavus genome revealed 56 gene clusters encoding secondary metabolites. How many of these metabolites affect fungal development, surviv...
Alanio, A; Beretti, J-L; Dauphin, B; Mellado, E; Quesne, G; Lacroix, C; Amara, A; Berche, P; Nassif, X; Bougnoux, M-E
2011-05-01
New Aspergillus species have recently been described with the use of multilocus sequencing in refractory cases of invasive aspergillosis. The classical phenotypic identification methods routinely used in clinical laboratories failed to identify them adequately. Some of these Aspergillus species have specific patterns of susceptibility to antifungal agents, and misidentification may lead to inappropriate therapy. We developed a matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (MS)-based strategy to adequately identify Aspergillus species to the species level. A database including the reference spectra of 28 clinically relevant species from seven Aspergillus sections (five common and 23 unusual species) was engineered. The profiles of young and mature colonies were analysed for each reference strain, and species-specific spectral fingerprints were identified. The performance of the database was then tested on 124 clinical and 16 environmental isolates previously characterized by partial sequencing of the β-tubulin and calmodulin genes. One hundred and thirty-eight isolates of 140 (98.6%) were correctly identified. Two atypical isolates could not be identified, but no isolate was misidentified (specificity: 100%). The database, including species-specific spectral fingerprints of young and mature colonies of the reference strains, allowed identification regardless of the maturity of the clinical isolate. These results indicate that MALDI-TOF MS is a powerful tool for rapid and accurate identification of both common and unusual species of Aspergillus. It can give better results than morphological identification in clinical laboratories. © 2010 The Authors. Clinical Microbiology and Infection © 2010 European Society of Clinical Microbiology and Infectious Diseases.
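The identification rate reported in the abstract above follows directly from the counts it gives (138 correct out of 140 isolates). The short sketch below reproduces that arithmetic; the Wilson score interval is an addition for illustration only and is not reported in the source.

```python
import math

def identification_rate(correct, total):
    """Fraction of isolates identified correctly."""
    return correct / total

def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (illustrative only)."""
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

rate = identification_rate(138, 140)   # counts taken from the abstract
low, high = wilson_interval(138, 140)
print(f"{rate:.1%}")                   # prints 98.6%
```

With only two unidentified isolates, the interval is fairly wide relative to the point estimate, which is worth keeping in mind when comparing this figure with results from other databases.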
Enhanced production of fructosyltransferase in Aspergillus oryzae by genome shuffling.
Wang, Shenghai; Duan, Mengjie; Liu, Yalan; Fan, Sen; Lin, Xiaoshan; Zhang, Yi
2017-03-01
To breed Aspergillus oryzae strains with high fructosyltransferase (FTase) activity using intraspecific protoplast fusion via genome shuffling. A candidate library was developed using UV/LiCl mutagenesis of the conidia of A. oryzae SBB201. By screening for enzyme activity and cell biomass, two mutants (UV-11 and UV-76) were chosen for protoplast fusion and subsequent genome shuffling. After three rounds of genome recombination, a fusion mutant RIII-7 was obtained. Its FTase activity was 180 U g⁻¹, approximately double that of the original strain, and RIII-7 was genetically stable. In fermentation culture, FTase activity of the genome-shuffled strain reached a maximum of 353 U g⁻¹ using a substrate-feeding method, and this value was approximately 3.4 times higher than that of the original strain A. oryzae SBB201. Intraspecific protoplast fusion of A. oryzae significantly enhanced FTase activity and generated a potentially useful strain for industrial production.
Futagami, Taiki; Mori, Kazuki; Yamashita, Ayaka; Wada, Shotaro; Kajiwara, Yasuhiro; Takashita, Hideharu; Omori, Toshiro; Takegawa, Kaoru; Tashiro, Kosuke; Kuhara, Satoru; Goto, Masatoshi
2011-01-01
The filamentous fungus Aspergillus kawachii has traditionally been used for brewing the Japanese distilled spirit shochu. A. kawachii characteristically hyperproduces citric acid and a variety of polysaccharide glycoside hydrolases. Here the genome sequence of A. kawachii IFO 4308 was determined and annotated. Analysis of the sequence may provide insight into the properties of this fungus that make it superior for use in shochu production, leading to the further development of A. kawachii for industrial applications. PMID:22045919
2005-05-01
5 mM citric acid). A solution of β-gal [G5160 from Aspergillus oryzae (Aldrich), 95 μg in 10 μL of buffer] was added and the absorption (λ = 420 nm...At pH 4.5-the optimal pH for β-gal (5) Herschman, H. R. (2002) Non-invasive imaging of reporter derived from Aspergillus oryzae-16 showed very...instrumentation; 2. An understanding of the genome for several species and associated genomics, proteomics, etc.; 3. Novel pharmaceuticals providing high target
Genome-Wide Association Mapping of Aspergillus flavus and Aflatoxin Accumulation Resistance in Maize
Marilyn L. Warburton; Juliet D. Tang; Gary L. Windham; Leigh K. Hawkins; Seth C. Murray; Wenwei Xu; Debbie Boykin; Andy Perkins; W. Paul Williams
2015-01-01
Contamination of maize (Zea mays L.) with aflatoxin, produced by the fungus Aspergillus flavus Link, has severe health and economic consequences. Efforts to reduce aflatoxin accumulation in maize have focused on identifying and selecting germplasm with natural host resistance factors, and several maize lines with significantly...
Genome sequence and comparative analyses of atoxigenic Aspergillus flavus WRRL 1519
USDA-ARS's Scientific Manuscript database
Aflatoxins are fungal secondary metabolites that often contaminate foodstuffs and crops, the major producer of which is Aspergillus flavus. Use of non-aflatoxigenic strains of A. flavus to compete against aflatoxin-producing strains has emerged as one of the best management practices for reducing af...
USDA-ARS's Scientific Manuscript database
The genomes of the A. ochraceoroseus and A. rambellii type strains were sequenced using a personal genome machine, followed by annotation of their genes. The genome size for A. ochraceoroseus was found to be approximately 23 Mb and contained 7,837 genes, while the A. rambellii genome was found to be...
Leigh Hawkins; Marilyn Warburton; Juliet Tang; John Tomashek; Dafne Alves Oliveira; Oluwaseun Ogunola; J. Smith; W. Williams
2018-01-01
Many projects have identified candidate genes for resistance to aflatoxin accumulation or Aspergillus flavus infection and growth in maize using genetic mapping, genomics, transcriptomics and/or proteomics studies. However, only a small percentage of these candidates have been validated in field conditions, and their relative contribution to...
Evaluation of Aspergillus PCR Protocols for Testing Serum Specimens
White, P. Lewis; Mengoli, Carlo; Bretagne, Stéphane; Cuenca-Estrella, Manuel; Finnstrom, Niklas; Klingspor, Lena; Melchers, Willem J. G.; McCulloch, Elaine; Barnes, Rosemary A.; Donnelly, J. Peter; Loeffler, Juergen
2011-01-01
A panel of human serum samples spiked with various amounts of Aspergillus fumigatus genomic DNA was distributed to 23 centers within the European Aspergillus PCR Initiative to determine analytical performance of PCR. Information regarding specific methodological components and PCR performance was requested. The information provided was made anonymous, and meta-regression analysis was performed to determine any procedural factors that significantly altered PCR performance. Ninety-seven percent of protocols were able to detect a threshold of 10 genomes/ml on at least one occasion, with 83% of protocols reproducibly detecting this concentration. Sensitivity and specificity were 86.1% and 93.6%, respectively. Positive associations between sensitivity and the use of larger sample volumes, an internal control PCR, and PCR targeting the internal transcribed spacer (ITS) region were shown. Negative associations between sensitivity and the use of larger elution volumes (≥100 μl) and PCR targeting the mitochondrial genes were demonstrated. Most Aspergillus PCR protocols used to test serum generate satisfactory analytical performance. Testing serum requires less standardization, and the specific recommendations shown in this article will only improve performance. PMID:21940479
Aspergillus flavus: human pathogen, allergen and mycotoxin producer.
Hedayati, M T; Pasqualotto, A C; Warn, P A; Bowyer, P; Denning, D W
2007-06-01
Aspergillus infections have grown in importance in recent years. However, most of the studies have focused on Aspergillus fumigatus, the most prevalent species in the genus. In certain locales and hospitals, Aspergillus flavus is more common in air than A. fumigatus, for unclear reasons. After A. fumigatus, A. flavus is the second leading cause of invasive aspergillosis and it is the most common cause of superficial infection. Experimental invasive infections in mice show A. flavus to be 100-fold more virulent than A. fumigatus in terms of inoculum required. Particularly common clinical syndromes associated with A. flavus include chronic granulomatous sinusitis, keratitis, cutaneous aspergillosis, wound infections and osteomyelitis following trauma and inoculation. Outbreaks associated with A. flavus appear to be associated with single or closely related strains, in contrast to those associated with A. fumigatus. In addition, A. flavus produces aflatoxins, the most toxic and potent hepatocarcinogenic natural compounds ever characterized. Accurate species identification within the Aspergillus flavus complex remains difficult due to overlapping morphological and biochemical characteristics, and much taxonomic and population genetics work is necessary to better understand the species and related species. The flavus complex currently includes 23 species or varieties, including two sexual species, Petromyces alliaceus and P. albertensis. The genome of the highly related Aspergillus oryzae is completed and available; that of A. flavus is in the final stages of annotation. Our understanding of A. flavus lags far behind that of A. fumigatus. Studies of the genomics, taxonomy, population genetics, pathogenicity, allergenicity and antifungal susceptibility of A. flavus are all required.
Mäkelä, Miia R; Dilokpimol, Adiphol; Koskela, Salla M; Kuuskeri, Jaana; de Vries, Ronald P; Hildén, Kristiina
2018-04-26
Feruloyl esterases (FAEs) are accessory enzymes for plant biomass degradation, which catalyse hydrolysis of carboxylic ester linkages between hydroxycinnamic acids and plant cell-wall carbohydrates. They are a diverse group of enzymes evolved from, e.g. acetyl xylan esterases (AXEs), lipases and tannases, thus complicating their classification and prediction of function by sequence similarity. Recently, an increasing number of fungal FAEs have been biochemically characterized, owing to their potential in various biotechnological applications and multitude of candidate FAEs in fungal genomes. However, only part of the fungal FAEs are included in Carbohydrate Esterase family 1 (CE1) of the carbohydrate-active enzymes (CAZy) database. In this work, we performed a phylogenetic analysis that divided the fungal members of CE1 into five subfamilies of which three contained characterized enzymes with conserved activities. Conservation within one of the subfamilies was confirmed by characterization of an additional CE1 enzyme from Aspergillus terreus. Recombinant A. terreus FaeD (AtFaeD) showed broad specificity towards synthetic methyl and ethyl esters, and released ferulic acid from plant biomass substrates, demonstrating its true FAE activity and interesting features as potential biocatalyst. The subfamily division of the fungal CE1 members enables more efficient selection of candidate enzymes for biotechnological processes. © 2018 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Katayama, Takuya; Tanaka, Yuki; Okabe, Tomoya; Nakamura, Hidetoshi; Fujii, Wataru; Kitamoto, Katsuhiko; Maruyama, Jun-Ichi
2016-04-01
To develop a genome editing method using the CRISPR/Cas9 system in Aspergillus oryzae, the industrial filamentous fungus used in Japanese traditional fermentation and for the production of enzymes and heterologous proteins. To develop the CRISPR/Cas9 system as a genome editing technique for A. oryzae, we constructed plasmids expressing the gene encoding Cas9 nuclease and single guide RNAs for the mutagenesis of target genes. We introduced these into an A. oryzae strain and obtained transformants containing mutations within each target gene that exhibited expected phenotypes. The mutational rates ranged from 10 to 20%, and 1 bp deletions or insertions were the most commonly induced mutations. We developed a functional and versatile genome editing method using the CRISPR/Cas9 system in A. oryzae. This technique will contribute to the use of efficient targeted mutagenesis in many A. oryzae industrial strains.
Aspergillus niger Genomics: Past, Present and into the Future
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Scott E.
2006-09-01
Aspergillus niger is a filamentous ascomycete fungus that is ubiquitous in the environment and has been implicated in opportunistic infections of humans. In addition to its role as an opportunistic human pathogen, A. niger is economically important as a fermentation organism used for the production of citric acid. Industrial citric acid production by A. niger represents one of the most efficient, highest yield bioprocesses in use currently by industry. The genome size of A. niger is estimated to be between 35.5 and 38.5 megabases (Mb) divided among eight chromosomes/linkage groups that vary in size from 3.5 to 6.6 Mb. Currently, there are three independent A. niger genome projects, an indication of the economic importance of this organism. The rich amount of data resulting from these multiple A. niger genome sequences will be used for basic and applied research programs applicable to fermentation process development, morphology and pathogenicity.
USDA-ARS's Scientific Manuscript database
The genome of the filamentous fungus, Aspergillus flavus, has been shown to harbor as many as 55 putative secondary metabolic gene clusters including the one responsible for production of the toxic and carcinogenic, polyketide synthase (PKS)-derived family of secondary metabolites termed aflatoxins....
USDA-ARS's Scientific Manuscript database
Aspergillus flavus is a saprophytic fungus that infects corn, peanuts, tree nuts and other agriculturally important crops. Once the crop is infected the fungus has the potential to secrete one or more mycotoxins, the most carcinogenic of which is aflatoxin. Aflatoxin contaminated crops are deemed un...
Rapid genome resequencing of an atoxigenic strain of Aspergillus carbonarius
Cabañes, F. Javier; Sanseverino, Walter; Castellá, Gemma; ...
2015-03-13
In microorganisms, Ion Torrent sequencing technology has been proved to be useful in whole-genome sequencing of bacterial genomes (5 Mbp). In our study, for the first time we used this technology to perform a resequencing approach in a whole fungal genome (36 Mbp), a non-ochratoxin A producing strain of Aspergillus carbonarius. Ochratoxin A (OTA) is a potent nephrotoxin which is found mainly in cereals and their products, but it also occurs in a variety of common foods and beverages. Because this strain does not produce OTA, we focused some of the bioinformatics analyses on genes involved in OTA biosynthesis, using a reference genome of an OTA-producing strain of the same species. This study revealed that in the atoxigenic strain there is a high accumulation of nonsense and missense mutations in several genes. Importantly, a twofold increase in gene mutation ratio was observed in PKS- and NRPS-encoding genes which are suggested to be involved in OTA biosynthesis.
NASA Astrophysics Data System (ADS)
Yee, Chai Sin; Murad, Abdul Munir Abdul; Bakar, Farah Diba Abu
2013-11-01
Genes encoding endo-β-1,4-mannanases from Trichoderma virens UKM1 (manTV) and Aspergillus flavus UKM1 (manAF) were analysed with bioinformatic tools. In addition, the A. flavus NRRL 3357 genome database was screened for a β-mannosidase gene, which was also analysed (mndA-AF). These three genes were analysed to understand their gene properties. manTV and manAF consist of 1,332 and 1,386 nucleotides, encoding 443 and 461 amino acid residues, respectively. Both endo-β-1,4-mannanases belong to glycosyl hydrolase family 5 and contain a carbohydrate-binding module family 1 (CBM1). On the other hand, mndA-AF is a 2,745-bp gene that encodes a protein sequence of 914 amino acid residues. This β-mannosidase belongs to glycosyl hydrolase family 2. The predicted molecular weights of manTV, manAF and mndA-AF are 47.74 kDa, 49.71 kDa and 103 kDa, respectively. All three predicted protein sequences possess a signal peptide sequence and are highly conserved among other fungal β-mannanases and β-mannosidases.
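The nucleotide and residue counts reported above can be cross-checked with simple arithmetic. The sketch below assumes that each reported length is an intron-free coding sequence that includes its stop codon; the abstract does not state this, but the numbers are consistent with it. The function name and the 110 Da average residue mass are illustrative assumptions (the latter is a common rule of thumb, not the method used in the study).

```python
def residues_from_orf(orf_bp: int) -> int:
    """Residues encoded by an intron-free ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the final codon is the stop codon


# (ORF length in bp, residues, predicted mass in kDa) as reported in the abstract.
reported = {
    "manTV": (1332, 443, 47.74),
    "manAF": (1386, 461, 49.71),
    "mndA-AF": (2745, 914, 103.0),
}

AVG_RESIDUE_KDA = 0.110  # rule-of-thumb average residue mass (assumption)

for name, (orf_bp, residues, mass_kda) in reported.items():
    expected = residues_from_orf(orf_bp)
    rough_mass = residues * AVG_RESIDUE_KDA
    print(f"{name}: {orf_bp} bp -> {expected} aa (reported {residues}); "
          f"rough mass {rough_mass:.1f} kDa (reported {mass_kda} kDa)")
```

The rough mass is only an order-of-magnitude check; a composition-based calculation would be needed to reproduce the reported values.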
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize
USDA-ARS's Scientific Manuscript database
Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent af...
Aspergillus niger contains the cryptic phylogenetic species A. awamori.
Perrone, Giancarlo; Stea, Gaetano; Epifani, Filomena; Varga, János; Frisvad, Jens C; Samson, Robert A
2011-11-01
Aspergillus section Nigri is an important group of species for food and medical mycology, and biotechnology. The Aspergillus niger 'aggregate' represents its most complicated taxonomic subgroup containing eight morphologically indistinguishable taxa: A. niger, Aspergillus tubingensis, Aspergillus acidus, Aspergillus brasiliensis, Aspergillus costaricaensis, Aspergillus lacticoffeatus, Aspergillus piperis, and Aspergillus vadensis. Aspergillus awamori, first described by Nakazawa, has been compared taxonomically with other black aspergilli and recently it has been treated as a synonym of A. niger. Phylogenetic analyses of sequences generated from portions of three genes coding for the proteins β-tubulin (benA), calmodulin (CaM), and the translation elongation factor-1 alpha (TEF-1α) of a population of A. niger strains isolated from grapes in Europe revealed the presence of a cryptic phylogenetic species within this population, A. awamori. Morphological, physiological, ecological and chemical data overlap occurred between A. niger and the cryptic A. awamori, however the splitting of these two species was also supported by AFLP analysis of the full genome. Isolates in both phylospecies can produce the mycotoxins ochratoxin A and fumonisin B₂, and they also share the production of pyranonigrin A, tensidol B, funalenone, malformins, and naphtho-γ-pyrones. In addition, sequence analysis of four putative A. awamori strains from Japan, used in the koji industrial fermentation, revealed that none of these strains belong to the A. awamori phylospecies. Copyright © 2011 British Mycological Society. Published by Elsevier Ltd. All rights reserved.
USDA-ARS's Scientific Manuscript database
Aflatoxin contamination of peanut and other crops is a major concern for producers globally, and has been shown to be exacerbated by drought stress. Previous transcriptomic and proteomic examination of the responses of isolates of Aspergillus flavus to drought-related oxidative stress in vitro have ...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chiang, Yi Ming; Meyer, Kristen M; Praseuth, Michael
2010-12-06
The genome sequencing of the fungus Aspergillus niger, an industrial workhorse, uncovered a large cache of genes encoding enzymes thought to be involved in the production of secondary metabolites yet to be identified. Identification and structural characterization of many of these predicted secondary metabolites are hampered by their low concentration relative to the known A. niger metabolites such as the naphtho-γ-pyrone family of polyketides. We deleted a nonreducing PKS gene in A. niger strain ATCC 11414, a daughter strain of A. niger ATCC strain 1015 whose genome was sequenced by the DOE Joint Genome Institute. This PKS encoding gene is a predicted ortholog of alb1 from Aspergillus fumigatus which is responsible for production of YWA1, a precursor of fungal DHN melanin. Our results show that the A. niger alb1 PKS is responsible for the production of the polyketide precursor for DHN melanin biosynthesis. Deletion of alb1 eliminates the production of major metabolites, naphtho-γ-pyrones. The generation of an A. niger strain devoid of naphtho-γ-pyrones will greatly facilitate the elucidation of cryptic biosynthetic pathways in this organism.
Kanhayuwa, Lakkhana; Kotta-Loizou, Ioly; Özkan, Selin; Gunning, A. Patrick; Coutts, Robert H. A.
2015-01-01
We report the discovery and characterization of a double-stranded RNA (dsRNA) mycovirus isolated from the human pathogenic fungus Aspergillus fumigatus, Aspergillus fumigatus tetramycovirus-1 (AfuTmV-1), which reveals several unique features not found previously in positive-strand RNA viruses, including the fact that it represents the first dsRNA (to our knowledge) that is not only infectious as a purified entity but also as a naked dsRNA. The AfuTmV-1 genome consists of four capped dsRNAs, the largest of which encodes an RNA-dependent RNA polymerase (RdRP) containing a unique GDNQ motif normally characteristic of negative-strand RNA viruses. The third largest dsRNA encodes an S-adenosyl methionine–dependent methyltransferase capping enzyme and the smallest dsRNA a P-A-S–rich protein that apparently coats but does not encapsidate the viral genome as visualized by atomic force microscopy. A combination of a capping enzyme with a picorna-like RdRP in the AfuTmV-1 genome is a striking case of chimerism and the first example (to our knowledge) of such a phenomenon. AfuTmV-1 appears to be intermediate between dsRNA and positive-strand ssRNA viruses, as well as between encapsidated and capsidless RNA viruses. PMID:26139522
Allergens/Antigens, toxins and polyketides of important Aspergillus species.
Bhetariya, Preetida J; Madan, Taruna; Basir, Seemi Farhat; Varma, Anupam; Usha, Sarma P
2011-04-01
The medical, agricultural and biotechnological importance of the primitive eukaryotic microorganisms, the Fungi, was recognized as early as 1920. Among various groups of fungi, the Aspergillus species are studied in great detail using advances in genomics and proteomics to unravel biological and molecular mechanisms in these fungi. Aspergillus fumigatus, Aspergillus flavus, Aspergillus niger, Aspergillus parasiticus, Aspergillus nidulans and Aspergillus terreus are some of the important species relevant to human, agricultural and biotechnological applications. The potential of Aspergillus species to produce highly diversified complex biomolecules such as multifunctional proteins (allergens, antigens, enzymes) and polyketides is fascinating and demands greater insight into the understanding of these fungal species for application to human health. Recently a regulator gene for secondary metabolites, LaeA, has been identified. Gene mining based on LaeA has facilitated the discovery of new metabolites with antimicrobial activity, such as emericellamides, and antitumor activity, such as terrequinone A, from A. nidulans. An immunoproteomic approach was reported for the identification of a few novel allergens of A. fumigatus. In this context, the review is focused on recent developments in allergens, antigens, and the structural and functional diversity of the polyketide synthases that produce polyketides of pharmaceutical and biological importance. Possible antifungal drug targets for development of effective antifungal drugs and new strategies for development of molecular diagnostics are considered.
Diba, K.; Mirhendi, H.; Kordbacheh, P.; Rezaie, S.
2014-01-01
In this study we attempted to modify the PCR-RFLP method using the restriction enzyme MwoI for the identification of medically important Aspergillus species. Our subjects included nine standard Aspergillus species and 205 Aspergillus isolates of approved hospital-acquired infections and hospital indoor sources. First, Aspergillus isolates were identified to the species level using morphological methods. A 24-hour culture was performed for each isolate to harvest Aspergillus mycelia, and genomic DNA was then extracted using the phenol-chloroform method. PCR-RFLP using the single restriction enzyme MwoI was performed on the ITS regions of the rDNA gene. The electrophoresis data were analyzed and compared with those of the morphological identifications. The 205 Aspergillus isolates comprised 153 (75%) environmental and 52 (25%) clinical isolates. A. flavus was the most frequent isolate in our study (55%), followed by A. niger, 65 (31.7%); A. fumigatus, 18 (8.7%); and A. nidulans and A. parasiticus, 2 each (1% each). MwoI enabled us to discriminate eight medically important Aspergillus species, including A. fumigatus, A. niger and A. flavus, the most commonly isolated species. The PCR-RFLP method using the restriction enzyme MwoI is a rapid and reliable test for identification of at least the most medically important Aspergillus species. PMID:25242934
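The percentages in this abstract can be recomputed from the stated counts. The following is a minimal sketch, assuming that "2 (1% each)" means two isolates of each species; the dictionary labels are illustrative. The A. flavus count is not stated in the abstract, so it is left out.

```python
TOTAL_ISOLATES = 205

# Counts stated in the abstract; "2 (1% each)" is read here as two isolates per species.
counts = {
    "environmental": 153,
    "clinical": 52,
    "A. niger": 65,
    "A. fumigatus": 18,
    "A. nidulans": 2,
    "A. parasiticus": 2,
}

# Percentage of all 205 isolates, rounded to one decimal place.
percentages = {label: round(100 * n / TOTAL_ISOLATES, 1) for label, n in counts.items()}
for label, pct in percentages.items():
    print(f"{label}: {counts[label]}/{TOTAL_ISOLATES} = {pct}%")
```

Recomputed values can differ slightly from the reported ones (18/205, for example), presumably because of rounding or truncation in the original.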
Genome sequence of an aflatoxigenic pathogen of Argentinian peanut, Aspergillus arachidicola
USDA-ARS's Scientific Manuscript database
In this study we sequenced the genome of the A. arachidicola Type strain (CBS 117610) and found its genome size to be 38.9 Mb, and its number of predicted genes to be 12,091, which are values comparable to those in other sequenced Aspergilli. Of its predicted genes, 691 were identified as unique to ...
Bills, Gerald F; Yue, Qun; Chen, Li; Li, Yan; An, Zhiqiang; Frisvad, Jens C
2016-03-01
The invalidly published name Aspergillus sydowii var. mulundensis was proposed for a strain of Aspergillus that produced new echinocandin metabolites designated as the mulundocandins. Reinvestigation of this strain (Y-30462=DSMZ 5745) using phylogenetic, morphological, and metabolic data indicated that it is a distinct and novel species of Aspergillus sect. Nidulantes. The taxonomic novelty, Aspergillus mulundensis, is introduced for this historically important echinocandin-producing strain. The closely related A. nidulans FGSC A4 has one of the most extensively characterized secondary metabolomes of any filamentous fungus. Comparison of the full-genome sequences of DSMZ 5745 and FGSC A4 indicated that the two strains share 33 secondary metabolite biosynthetic gene clusters. These shared gene clusters represent ~45% of the total secondary metabolome of each strain, thus indicating a high level of intraspecific divergence in terms of secondary metabolism.
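The abstract gives the shared clusters both as a count (33) and as a fraction (~45%), which together imply an approximate cluster total per strain. The sketch below only works through that implication; the totals are not reported in the abstract, and because "~45%" is itself approximate the results should be read as rough estimates.

```python
shared_clusters = 33
shared_fraction = 0.45  # "~45%" in the abstract, so the result is approximate

# If 33 clusters are ~45% of a strain's clusters, the total is about 33 / 0.45.
implied_total = shared_clusters / shared_fraction
implied_unshared = implied_total - shared_clusters
print(f"~{implied_total:.0f} clusters per strain, ~{implied_unshared:.0f} not shared")
```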
Luis F. Larrondo; Marcela Avila; Loreto Salas; Dan Cullen; Rafael Vicuna
2003-01-01
Analysis of genomic clones encoding a putative laccase in homokaryon strains of Ceriporiopsis subvermispora led to the identification of an allelic variant of the previously described lcs-1 gene. A cDNA clone corresponding to this gene was expressed in Aspergillus nidulans and in Aspergillus niger. Enzyme assays and Western blots showed that both hosts secreted active...
USDA-ARS's Scientific Manuscript database
The fungus Aspergillus flavus is known for its ability to produce the toxic and carcinogenic aflatoxins in food and feed. While aflatoxins are of most concern, A. flavus is predicted to be capable of producing many more metabolites based on a study of its complete genome sequence. Some of these meta...
Kudo, Kanako; Watanabe, Akira; Ujiie, Seiryu; Shintani, Takahiro; Gomi, Katsuya
2015-12-01
By a global search of the genome database of Aspergillus oryzae, we found 23 genes encoding putative β-glucosidases, among which 10 genes with a signal peptide belonging to glycoside hydrolase family 3 (GH3) were overexpressed in A. oryzae using the improved glaA gene promoter. Consequently, crude enzyme preparations from three strains, each harboring the genes AO090038000223 (bglA), AO090103000127 (bglF), and AO090003001511 (bglJ), showed a substrate preference toward p-nitrophenyl-β-d-glucopyranoside (pNPGlc) and thus were purified to homogeneity and enzymatically characterized. All the purified enzymes (BglA, BglF, and BglJ) preferentially hydrolyzed aryl β-glycosides, including pNPGlc, rather than cellobiose, and these enzymes were proven to be aryl β-glucosidases. Although the specific activity of BglF toward all the substrates tested was significantly low, BglA and BglJ showed appreciably high activities toward pNPGlc and arbutin. The kinetic parameters of BglA and BglJ for pNPGlc suggested that both the enzymes had relatively higher hydrolytic activity toward pNPGlc among the fungal β-glucosidases reported. The thermal and pH stabilities of BglA were higher than those of BglJ, and BglA was particularly stable in a wide pH range (pH 4.5-10). In contrast, BglJ was the most heat- and alkaline-labile among the three β-glucosidases. Furthermore, BglA was more tolerant to ethanol than BglJ; as a result, it showed much higher hydrolytic activity toward isoflavone glycosides in the presence of ethanol than BglJ. This study suggested that the mining of novel β-glucosidases exhibiting higher activity from microbial genome sequences is of great use for the production of beneficial compounds such as isoflavone aglycones. Copyright © 2015 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Dan Cullen
2007-01-01
Few microbes compare with the filamentous fungus Aspergillus niger in its ability to produce prodigious amounts of useful chemicals and enzymes. This fungus is the principal source of citric acid for food, beverages and pharmaceuticals and of several important commercial enzymes, including glucoamylase, which is widely used for the conversion of starch to food syrups...
Zhao, Guozhong; Yao, Yunping; Hou, Lihua; Wang, Chunling; Cao, Xiaohong
2014-10-01
Aspergillus oryzae is used to produce traditional fermented foods and beverages. A. oryzae 3.042 produces a neutral protease and an alkaline protease but rarely an acid protease, which is unfavourable to soy-sauce fermentation. A. oryzae 100-8 was obtained by N(+) ion implantation mutagenesis of A. oryzae 3.042, and the protease secretions of these two strains are different. Sequencing the genome of A. oryzae 100-8 and comparing it to the genome of A. oryzae 3.042 revealed some differences, such as single nucleotide polymorphisms and nucleotide deletions or insertions. Some of these differences may reflect the ability of A. oryzae to secrete proteases. Transcriptional sequencing and analysis of the two strains during the same growth processes provided further insights into the genes and pathways involved in protease secretion.
Andersen, Mikael Rørdam
2014-11-01
Primary metabolism affects all phenotypical traits of filamentous fungi. Particular examples include reacting to extracellular stimuli, producing precursor molecules required for cell division and morphological changes as well as providing monomer building blocks for production of secondary metabolites and extracellular enzymes. In this review, all annotated genes from four Aspergillus species have been examined. In this process, it becomes evident that 80-96% of the genes (depending on the species) are still without verified function. A significant proportion of the genes with verified metabolic functions are assigned to secondary or extracellular metabolism, leaving only 2-4% of the annotated genes within primary metabolism. It is clear that primary metabolism has not received the same attention in the post-genomic era as many other research areas, despite its role at the very centre of cellular function. However, several methods can be employed to use the metabolic networks in tandem with comparative genomics to accelerate functional assignment of genes in primary metabolism. In particular, gaps in metabolic pathways can be used to assign functions to orphan genes. In this review, applications of this approach to Aspergillus genes will be examined, and it is proposed that, where feasible, this should be a standard part of functional annotation of fungal genomes. © The Author 2014. Published by Oxford University Press.
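The gap-based idea mentioned in this abstract can be illustrated with a toy sketch in Python. This is not the review's method: the pathway steps, gene identifiers, and the helper `find_gaps` are all hypothetical, and the sketch only shows how reactions without an assigned gene might be listed as targets for orphan-gene assignment.

```python
# Toy pathway-gap finder. Every step name and gene ID below is invented
# for illustration; none of them comes from the review.

def find_gaps(pathway_steps, gene_assignments):
    """Return the pathway steps that have no gene assigned to them."""
    return [step for step in pathway_steps if not gene_assignments.get(step)]

# Ordered steps of a hypothetical pathway.
pathway = ["step_A", "step_B", "step_C", "step_D"]

# Step -> genes with a verified function (hypothetical).
assigned = {
    "step_A": ["gene_0001"],
    "step_B": [],              # known step, no gene yet
    "step_D": ["gene_0042"],
}

gaps = find_gaps(pathway, assigned)
print(gaps)  # steps that would be searched for candidate orphan genes
```

In a real annotation effort the candidate genes for each gap would then be ranked by sequence similarity to characterized enzymes from other species, which this sketch leaves out.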
Pacheco-Arjona, Jose Ramon; Ramirez-Prado, Jorge Humberto
2014-01-01
The cell wall is a protective and versatile structure distributed in all fungi. The component responsible for its rigidity is chitin, a product of chitin synthase (Chsp) enzymes. There are seven classes of chitin synthase genes (CHS) and the number and type encoded in fungal genomes varies considerably from one species to another. Previous Chsp sequence analyses focused on their study as individual units, regardless of genomic context. The identification of blocks of conserved genes between genomes can provide important clues about the interactions and localization of chitin synthases. In the present study, we carried out an in silico search of all putative Chsp encoded in 54 full fungal genomes, encompassing 21 orders from five phyla. Phylogenetic studies of these Chsp were able to confidently classify 347 out of the 369 Chsp identified (94%). Patterns in the distribution of Chsp related to taxonomy were identified, the most prominent being related to the type of fungal growth. More importantly, a synteny analysis for genomic blocks centered on class IV Chsp (the most abundant and widely distributed Chsp class) identified a putative cell wall metabolism gene cluster in members of the genus Aspergillus, the first such association reported for any fungal genome. PMID:25148134
Recent advances in reconstructing microbial secondary metabolites biosynthesis in Aspergillus spp.
He, Yi; Wang, Bin; Chen, Wanping; Cox, Russell J; He, Jingren; Chen, Fusheng
High throughput genome sequencing has revealed a multitude of potential secondary metabolites biosynthetic pathways that remain cryptic. Pathway reconstruction coupled with genetic engineering via heterologous expression enables discovery of novel compounds, elucidation of biosynthetic pathways, and optimization of product yields. Apart from Escherichia coli and yeast, fungi, especially Aspergillus spp., are well known and efficient heterologous hosts. This review summarizes recent advances in heterologous expression of microbial secondary metabolite biosynthesis in Aspergillus spp. We also discuss the technological challenges and successes in regard to heterologous host selection and DNA assembly behind the reconstruction of microbial secondary metabolite biosynthesis. Copyright © 2018 Elsevier Inc. All rights reserved.
Genomics of peanut leaf-spot pathogens; and RNA-interference-mediated control of aflatoxins
USDA-ARS's Scientific Manuscript database
An overview update of the research done at the USDA-ARS National Peanut Research Laboratory will be presented, including the release of the Cercospora arachidicola genome, sequencing of Cercosporidium personatum, a workflow to study genetic diversity of aflatoxigenic Aspergillus, and progress on the us...
Ochratoxin A production by Penicillium thymicola.
Nguyen, Hai D T; McMullin, David R; Ponomareva, Ekaterina; Riley, Robert; Pomraning, Kyle R; Baker, Scott E; Seifert, Keith A
2016-08-01
Ochratoxin A (OTA) is a mycotoxin produced by some Aspergillus and Penicillium species that grow on economically important agricultural crops and food products. OTA is classified as a Group 2B carcinogen and is potently nephrotoxic, which is the basis for its regulation in some jurisdictions. Using high resolution mass spectrometry, OTA and ochratoxin B (OTB) were detected in liquid culture extracts of Penicillium thymicola DAOMC 180753 isolated from Canadian cheddar cheese. The genome of this strain was sequenced, assembled and annotated to probe for putative genes involved in OTA biosynthesis. Known OTA biosynthetic genes from Penicillium verrucosum or Penicillium nordicum, two related Penicillium species that produce OTA, were not found in P. thymicola. However, a gene cluster containing genes encoding a polyketide synthase (PKS) and a PKS-nonribosomal peptide synthase (NRPS) hybrid was located in the P. thymicola genome that showed a high degree of similarity to OTA biosynthetic enzymes of Aspergillus carbonarius and Aspergillus ochraceus. This is the first report of ochratoxin from P. thymicola and a new record of the species in Canada. Crown Copyright © 2016. Published by Elsevier Ltd. All rights reserved.
Mirhendi, H; Zarei, F; Motamedi, M; Nouripour-Sisakht, S
2016-03-01
This work aimed to identify the species distribution of common clinical and environmental isolates of black Aspergilli based on simple restriction fragment length polymorphism (RFLP) analysis of the β-tubulin gene. A total of 149 clinical and environmental strains of black Aspergilli were collected and subjected to preliminary morphological examination. Total genomic DNAs were extracted, and PCR was performed to amplify part of the β-tubulin gene. At first, 52 randomly selected samples were species-delineated by sequence analysis. In order to distinguish the most common species, PCR amplicons of 117 black Aspergillus strains were identified by simple PCR-RFLP analysis using the enzyme TasI. Among 52 sequenced isolates, 28 were Aspergillus tubingensis, 21 Aspergillus niger, and the three remaining isolates included Aspergillus uvarum, Aspergillus awamori, and Aspergillus acidus. All 100 environmental and 17 BAL samples subjected to TasI-RFLP analysis of the β-tubulin gene, fell into two groups, consisting of about 59% (n=69) A. tubingensis and 41% (n=48) A. niger. Therefore, the method successfully and rapidly distinguished A. tubingensis and A. niger as the most common species among the clinical and environmental isolates. Although tardy, the Ehrlich test was also able to differentiate A. tubingensis and A. niger according to the yellow color reaction specific to A. niger. A. tubingensis and A. niger are the most common black Aspergillus in both clinical and environmental isolates in Iran. PCR-RFLP using TasI digestion of β-tubulin DNA enables rapid screening for these common species. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
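The PCR-RFLP step described in this abstract can be mimicked in silico. The Python sketch below is only an illustration: it assumes that TasI recognizes AATT and cuts immediately 5' of the site, the function `digest` is a hypothetical helper, and the amplicon is an invented sequence rather than a real β-tubulin fragment.

```python
# In silico restriction digest sketch.
# Assumption: TasI recognizes AATT and cuts just before the first A.
# The amplicon below is made up; it is not a real beta-tubulin sequence.

def digest(seq, site="AATT", cut_offset=0):
    """Cut seq at every occurrence of site and return fragment lengths."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip zero-length fragments (for example, a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

amplicon = "GGGCCAATTGGCCGGAATTCCGG"  # hypothetical 23-nt example
print(digest(amplicon))  # fragment lengths, in order along the amplicon
```

Two species would be distinguished when their amplicons give different sets of fragment lengths, which is what the banding pattern on a gel reflects.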
Yin, Chao; Wang, Bin; He, Pan; Lin, Ying; Pan, Li
2014-05-15
Aspergillus niger is usually regarded as a beneficial species widely used in the biotechnological industry. Obtaining the genome sequence of the widely used aconidial A. niger SH2 strain is of great importance to understand its unusual production capability. In this study we assembled a high-quality genome sequence of A. niger SH2 with approximately 11,517 ORFs. A relatively high proportion of genes enriched for protein-expression-related FunCat items verifies its efficient capacity in protein production. Furthermore, genome-wide comparative analysis between A. niger SH2 and CBS513.88 reveals insights into unique properties of A. niger SH2. A. niger SH2 lacks the gene related to the initiation of asexual sporulation (PrpA), leading to its distinct aconidial phenotype. Frame shift mutations and non-synonymous SNPs in genes of cell wall integrity signaling, β-1,3-glucan synthesis and chitin synthesis influence its cell wall development, which is important for its hyphal fragmentation during industrial high-efficiency protein production. Copyright © 2014 Elsevier B.V. All rights reserved.
Shittu, Olufunke Bolatito; Adelaja, Oluwabunmi Molade; Obuotor, Tolulope Mobolaji; Sam-Wobo, Sam Olufemi; Adenaike, Adeyemi Sunday
2016-03-01
Aspergillosis has been identified as one of the hospital-acquired infections, but the contribution of water and in-house air as possible sources of Aspergillus infection in immunocompromised individuals such as HIV-TB patients has not been studied in any hospital setting in Nigeria. This study aimed to identify and investigate the genetic relationship between clinical and environmental Aspergillus sp. associated with HIV-TB co-infected patients. DNA extraction, purification, amplification and sequencing of Internal Transcribed Spacer (ITS) genes were performed using standard protocols. Similarity search using BLAST on NCBI was used for species identification and MEGA 5.0 was used for phylogenetic analysis. Analyses of sequenced ITS genes of fourteen (14) selected Aspergillus isolates identified in the GenBank database revealed Aspergillus niger (28.57%), A. tubingensis (7.14%), A. flavus (7.14%) and A. fumigatus (57.14%). Aspergillus species in the sputum of HIV patients were Aspergillus niger, A. fumigatus, A. tubingensis and A. flavus. Also, A. niger and A. fumigatus were identified from water and open air. Phylogenetic analysis of sequences yielded genetic relatedness between clinical and environmental isolates. Water and air in health care settings in Nigeria are important sources of Aspergillus sp. for HIV-TB patients.
FPD: A comprehensive phosphorylation database in fungi.
Bai, Youhuang; Chen, Bin; Li, Mingzhu; Zhou, Yincong; Ren, Silin; Xu, Qin; Chen, Ming; Wang, Shihua
2017-10-01
Protein phosphorylation, one of the most classic post-translational modifications, plays a critical role in diverse cellular processes including cell cycle, growth, and signal transduction pathways. However, the available information about phosphorylation in fungi is limited. Here, we provided a Fungi Phosphorylation Database (FPD) that comprises high-confidence in vivo phosphosites identified by MS-based proteomics in various fungal species. This comprehensive phosphorylation database contains 62 272 non-redundant phosphorylation sites in 11 222 proteins across eight organisms, including Aspergillus flavus, Aspergillus nidulans, Fusarium graminearum, Magnaporthe oryzae, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Cryptococcus neoformans. A fungi-specific phosphothreonine motif and several conserved phosphorylation motifs were discovered by comparatively analysing the pattern of phosphorylation sites in plants, animals, and fungi. Copyright © 2017 British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.
Huang, Xin; Li, Hao-ming
2009-08-05
Lovastatin is an effective drug for treatment of hyperlipidemia. This study aimed to clone the lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoded protein. According to the lovastatin synthase gene sequence from GenBank, primers were designed to amplify and clone the lovastatin biosynthesis regulatory gene lovE from Aspergillus terreus genomic DNA. Bioinformatic analysis of lovE and its encoded amino acid sequence was performed through internet resources and software such as DNAMAN. The target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terreus genomic DNA and the secondary and three-dimensional structures of the LovE protein were predicted. In the lovastatin biosynthesis process lovE is a regulatory gene and the LovE protein is a GAL4-like transcriptional factor.
Brown, T A; Davies, R W; Ray, J A; Waring, R B; Scazzocchio, C
1983-01-01
A 2830-bp segment of the mitochondrial genome of the fungus Aspergillus nidulans was sequenced and shown to contain two unidentified reading frames (URFs). These reading frames are 352 and 488 codons in length, and would specify unmodified proteins of mol. wts. 39,000 and 54,000, respectively. The derived amino acid sequences indicate that these genes are equivalent to the human mitochondrial URFs 1 and 4, with 39% amino acid homology for URF1 and 26% for URF4. Both URFs were shown by secondary structure predictions to code for predominantly beta-sheeted proteins with strong structural conservation between the fungal and human homologues. Counterparts of mammalian URFs have not previously been identified in non-mammalian genomes, and the discovery that A. nidulans possesses reading frames so closely homologous with URF1 and URF4 shows that these genes are of general functional importance in the mitochondria of diverse species. PMID:11894959
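The molecular weights quoted in this abstract can be checked with a back-of-the-envelope calculation. The Python sketch below assumes a rule-of-thumb average residue mass of about 110 Da, which is not stated in the abstract, and the function name `approx_mass_da` is hypothetical.

```python
# Rough protein mass from codon count.
# Assumption: an average amino acid residue weighs about 110 Da
# (a common rule of thumb, not a figure from the paper).
AVG_RESIDUE_MASS_DA = 110

def approx_mass_da(n_codons):
    """Estimate the mass in daltons of a protein with n_codons residues."""
    return n_codons * AVG_RESIDUE_MASS_DA

for codons in (352, 488):
    mass = approx_mass_da(codons)
    # Round to the nearest thousand, as the abstract does.
    print(codons, mass, round(mass, -3))
```

Under that assumption, both reading frames land on the rounded values reported for the unmodified proteins.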
Epidemiological and Genomic Landscape of Azole Resistance Mechanisms in Aspergillus Fungi
Hagiwara, Daisuke; Watanabe, Akira; Kamei, Katsuhiko; Goldman, Gustavo H.
2016-01-01
Invasive aspergillosis is a life-threatening mycosis caused by the pathogenic fungus Aspergillus. The predominant causal species is Aspergillus fumigatus, and azole drugs are the treatment of choice. Azole drugs approved for clinical use include itraconazole, voriconazole, posaconazole, and the recently added isavuconazole. However, epidemiological research has indicated that the prevalence of azole-resistant A. fumigatus isolates has increased significantly over the last decade. What is worse is that azole-resistant strains are likely to have emerged not only in response to long-term drug treatment but also because of exposure to azole fungicides in the environment. Resistance mechanisms include amino acid substitutions in the target Cyp51A protein, tandem repeat sequence insertions at the cyp51A promoter, and overexpression of the ABC transporter Cdr1B. Environmental azole-resistant strains harboring the combination of a tandem repeat sequence and a point mutation of the Cyp51A gene (TR34/L98H and TR46/Y121F/T289A) have become widely disseminated across the world within a short time period. The epidemiological data also suggest that the number of Aspergillus spp. other than A. fumigatus isolated has risen. Some non-fumigatus species intrinsically show low susceptibility to azole drugs, imposing the need for accurate identification and drug susceptibility testing in most clinical cases. Currently, our knowledge of azole resistance mechanisms in non-fumigatus Aspergillus species such as A. flavus, A. niger, A. tubingensis, A. terreus, A. fischeri, A. lentulus, A. udagawae, and A. calidoustus is limited. In this review, we present recent advances in our understanding of azole resistance mechanisms particularly in A. fumigatus. We then provide an overview of the genome sequences of non-fumigatus species, focusing on the proteins related to azole resistance mechanisms. PMID:27708619
Fidler, Gabor; Kocsube, Sandor; Leiter, Eva; Biro, Sandor; Paholcsek, Melinda
2017-08-01
We describe a high-resolution melting (HRM) analysis method that is rapid, reproducible, and able to identify reference strains and a further 40 clinical isolates of Aspergillus fumigatus (14), A. lentulus (3), A. terreus (7), A. flavus (8), A. niger (2), A. welwitschiae (4), and A. tubingensis (2). Asp1 and Asp2 primer sets were designed to amplify partial sequences of the Aspergillus benA (beta-tubulin) genes in a closed-, single-tube system. Human placenta DNA and further Aspergillus (3), Candida (9), Fusarium (6), and Scedosporium (2) nucleic acids from type strains and clinical isolates were also included in this study to evaluate cross reactivity with other relevant pathogens causing invasive fungal infections. The barcoding capacity of this method proved to be 100%, providing distinctive binomial scores (14, 34, 36, 35, 25, 15, 26) when tested among species, while the within-species distinction capacity of the assay proved to be 0% based on the aligned thermodynamic profiles of the Asp1, Asp2 melting clusters, allowing accurate species delimitation of all tested clinical isolates. The identification limit of this HRM assay was also estimated on Aspergillus reference gDNA panels, where it proved to be 10 to 10^2 genomic equivalents (GE), except for the A. fumigatus panel, where it was only 10^3. Furthermore, misidentification was not detected with human genomic DNA or with Candida, Fusarium, and Scedosporium strains. Our DNA barcoding assay introduced here provides results within a few hours, and it may possess further diagnostic utility when analyzing standard cultures supporting adequate therapeutic decisions. © The Author 2016. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Ashtiani, Nafiseh Mohebbi; Kachuei, Reza; Yalfani, Roozbeh; Harchegani, Asghar Beigi; Nosratabadi, Mohsen
2017-06-01
Aspergillus species are important in medicine, agriculture and various industries. The sections Fumigati, Flavi, and Nigri are the most important members of the Aspergillus genus. This study intended to identify and separate these three Aspergillus sections and to differentiate among them using specific primers. A bioinformatics study was initially performed to analyse the sequences of five genes, namely, beta-tubulin, calmodulin, the pre-rRNA processing protein Tsr1, the DNA-replication licensing factor Mcm7, and RNA polymerase II second largest subunit (RPB2) in the three Aspergillus sections using MEGA6 software and the NCBI database. Primers were designed to select genes for each of the Aspergillus sections being analysed. A total of 134 environmental and clinical Aspergillus species were isolated, purified and initially identified by colony morphology. Subsequently, DNA was extracted using the phenol-chloroform method, specific primers were synthesized, PCR was performed for DNA from all isolates, and the results were compared to morphological characteristics. Of the 134 isolates tested, 56 were Nigri, 32 were Fumigati, 32 were Flavi, and the rest (14 isolates) belonged to other sections. The beta-tubulin and calmodulin genes were found to be the most suitable for differentiating among these three groups; the beta-tubulin gene was used for molecular identification of Aspergillus section Fumigati, and the calmodulin gene for identifying sections Flavi and Nigri.
Characteristic clinical features of Aspergillus appendicitis: Case report and literature review.
Gjeorgjievski, Mihajlo; Amin, Mitual B; Cappell, Mitchell S
2015-11-28
This work aims to facilitate diagnosing Aspergillus appendicitis, which can be missed clinically due to its rarity, by proposing a clinical pentad for Aspergillus appendicitis based on literature review and one new case. The currently reported case of pathologically-proven Aspergillus appendicitis was identified by computerized search of pathology database at William Beaumont Hospital, 1999-2014. Prior cases were identified by computerized literature search. Among 10980 pathology reports of pathologically-proven appendicitis, one case of Aspergillus appendicitis was identified (rate = 0.01%). A young boy with profound neutropenia, recent chemotherapy, and acute myelogenous leukemia presented with right lower quadrant pain, pyrexia, and generalized malaise. Abdominal computed tomography scan showed a thickened appendiceal wall and periappendiceal inflammation, suggesting appendicitis. Emergent laparotomy showed an inflamed, thickened appendix, which was resected. The patient did poorly postoperatively with low-grade-fevers while receiving antibacterial therapy, but rapidly improved after initiating amphotericin therapy. Microscopic examination of a silver stain of the appendectomy specimen revealed fungi with characteristic Aspergillus morphology, findings confirmed by immunohistochemistry. Primary Aspergillus appendicitis is exceptionally rare, with only 3 previously reported cases. All three cases presented with (1)-neutropenia, (2)-recent chemotherapy, (3)-acute leukemia, and (4)-suspected appendicitis; (5)-the two prior cases initially treated with antibacterial therapy, fared poorly before instituting anti-Aspergillus therapy. The current patient satisfied all these five criteria. Based on these four cases, a clinical pentad is proposed for Aspergillus appendicitis: clinically-suspected appendicitis, neutropenia, recent chemotherapy, acute leukemia, and poor clinical response if treated solely by antibacterial/anti-candidial therapy. 
Patients presenting with this proposed pentad may benefit from testing for Aspergillus infection by silver-stains/immunohistochemistry and considering empirical anti-Aspergillus therapy pending a tissue diagnosis.
Vidal-Acuña, M Reyes; Ruiz-Pérez de Pipaón, Maite; Torres-Sánchez, María José; Aznar, Javier
2017-12-08
An expanded library of matrix assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) has been constructed using the spectra generated from 42 clinical isolates and 11 reference strains, including 23 different species from 8 sections (16 cryptic plus 7 noncryptic species). Out of a total of 379 strains of Aspergillus isolated from clinical samples, 179 strains were selected to be identified by sequencing of beta-tubulin or calmodulin genes. Protein spectra of 53 strains, cultured in liquid medium, were used to construct an in-house reference database in the MALDI-TOF MS. One hundred ninety strains (179 clinical isolates previously identified by sequencing and the 11 reference strains), cultured on solid medium, were blindly analyzed by the MALDI-TOF MS technology to validate the generated in-house reference database. A 100% correlation was obtained with both identification methods, gene sequencing and MALDI-TOF MS, and no discordant identification was obtained. The HUVR database provided species level (score of ≥2.0) identification in 165 isolates (86.84%) and for the remaining 25 (13.16%) a genus level identification (score between 1.7 and 2.0) was obtained. The routine MALDI-TOF MS analysis with the new database was then challenged with 200 Aspergillus clinical isolates grown on solid medium in a prospective evaluation. A species identification was obtained in 191 strains (95.5%), and only nine strains (4.5%) could not be identified at the species level. Among the 200 strains, A. tubingensis was the only cryptic species identified. We demonstrated the feasibility and usefulness of the new HUVR database in MALDI-TOF MS by the use of a standardized procedure for the identification of Aspergillus clinical isolates, including cryptic species, grown either on solid or liquid media. © The Author 2017. Published by Oxford University Press on behalf of The International Society for Human and Animal Mycology. All rights reserved. 
For permissions, please e-mail: journals.permissions@oup.com.
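The score thresholds quoted in this abstract (species level at a score of 2.0 or above, genus level between 1.7 and 2.0) can be written as a small Python helper. This is a sketch only: the function name, the handling of scores below 1.7, and treating 1.7 as inclusive are assumptions, not details given in the abstract.

```python
# Map a MALDI-TOF MS identification score to a reporting level.
# Thresholds 2.0 and 1.7 come from the abstract; the "unidentified"
# label and the inclusive lower bound of 1.7 are assumptions.

def identification_level(score):
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "unidentified"

def percent(part, total):
    """Percentage rounded to two decimals, as reported in the abstract."""
    return round(100.0 * part / total, 2)

print(identification_level(2.15))
print(identification_level(1.85))
print(percent(165, 190), percent(25, 190))
```

The last line reproduces the arithmetic behind the validation figures: 165 and 25 of the 190 strains.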
Molecular biological researches of Kuro-Koji molds, their classification and safety.
Yamada, Osamu; Takara, Ryo; Hamada, Ryoko; Hayashi, Risa; Tsukahara, Masatoshi; Mikami, Shigeaki
2011-09-01
To assess the position of Kuro-Koji molds in black Aspergillus, we performed sequence analysis of approximately 2500 nucleotides of partial gene fragments, such as histone 3, on a total of 57 Aspergillus strains, including Aspergillus kawachii NBRC 4308, 12 Kuro-Koji molds isolated from awamori breweries in Japan, Aspergillus niger ATCC 1015, and A. tubingensis ATCC10550. Sequence results showed that all black Aspergillus strains could be classified into 3 types, type N which includes A. niger ATCC 1015, type T which includes A. tubingensis ATCC 10550, and type L which includes A. kawachii NBRC 4308. Phylogenetic analysis showed these three types belong to different clusters. All 12 Kuro-Koji molds isolated from awamori breweries were classified as type L, thus we concluded type L represents the industrial Kuro-Koji molds. We found all type L strains lack the An15g07920 gene which is required for ochratoxin A biosynthesis in black Aspergillus. This sequence is present in the genome of A. niger CBS 513.88 and has homology to the polyketide synthase fragment of A. ochraceus which is involved in ochratoxin A biosynthesis. Based on the industrial importance and the safety of Kuro-Koji molds, we propose to classify the type L strains as Aspergillus luchuensis, as initially reported by Dr. Inui. Copyright © 2011 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
De Carolis, E; Posteraro, B; Lass-Flörl, C; Vella, A; Florio, A R; Torelli, R; Girmenia, C; Colozza, C; Tortorano, A M; Sanguinetti, M; Fadda, G
2012-05-01
Accurate species discrimination of filamentous fungi is essential, because some species have specific antifungal susceptibility patterns, and misidentification may result in inappropriate therapy. We evaluated matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) for species identification through direct surface analysis of the fungal culture. By use of culture collection strains representing 55 species of Aspergillus, Fusarium and Mucorales, a reference database was established for MALDI-TOF MS-based species identification according to the manufacturer's recommendations for microflex measurements and MALDI BioTyper 2.0 software. The profiles of young and mature colonies were analysed for each of the reference strains, and species-specific spectral fingerprints were obtained. To evaluate the database, 103 blind-coded fungal isolates collected in the routine clinical microbiology laboratory were tested. As a reference method for species designation, multilocus sequencing was used. Eighty-five isolates were unequivocally identified to the species level (≥99% sequence similarity); 18 isolates producing ambiguous results at this threshold were initially rated as identified to the genus level only. Further molecular analysis definitively assigned these isolates to the species Aspergillus oryzae (17 isolates) and Aspergillus flavus (one isolate), concordant with the MALDI-TOF MS results. Excluding nine isolates that belong to the fungal species not included in our reference database, 91 (96.8%) of 94 isolates were identified by MALDI-TOF MS to the species level, in agreement with the results of the reference method; three isolates were identified to the genus level. In conclusion, MALDI-TOF MS is suitable for the routine identification of filamentous fungi in a medical microbiology laboratory. © 2011 The Authors. Clinical Microbiology and Infection © 2011 European Society of Clinical Microbiology and Infectious Diseases.
Reeves, Emer P; Reiber, Kathrin; Neville, Claire; Scheibner, Olaf; Kavanagh, Kevin; Doyle, Sean
2006-07-01
Aspergillus fumigatus is an important human fungal pathogen. The Aspergillus fumigatus genome contains 14 nonribosomal peptide synthetase genes, potentially responsible for generating metabolites that contribute to organismal virulence. Differential expression of the nonribosomal peptide synthetase gene, pes1, in four strains of Aspergillus fumigatus was observed. The pattern of pes1 expression differed from that of a putative siderophore synthetase gene, sidD, and so is unlikely to be involved in iron acquisition. The Pes1 protein (expected molecular mass 698 kDa) was partially purified and identified by immunoreactivity, peptide mass fingerprinting (36% sequence coverage) and MALDI LIFT-TOF/TOF MS (four internal peptides sequenced). A pes1 disruption mutant (delta pes1) of Aspergillus fumigatus strain 293.1 was generated and confirmed by Southern and western analysis, in addition to RT-PCR. The delta pes1 mutant also showed significantly reduced virulence in the Galleria mellonella model system (P < 0.001) and increased sensitivity to oxidative stress (P = 0.002) in culture and during neutrophil-mediated phagocytosis. In addition, the mutant exhibited altered conidial surface morphology and hydrophilicity, compared to Aspergillus fumigatus 293.1. It is concluded that pes1 contributes to improved fungal tolerance against oxidative stress, mediated by the conidial phenotype, during the infection process.
[Progress in omics research of Aspergillus niger].
Sui, Yufei; Ouyang, Liming; Lu, Hongzhong; Zhuang, Yingping; Zhang, Siliang
2016-08-25
Aspergillus niger, an important industrial fermentation strain, is widely applied in the production of organic acids and industrial enzymes. With the development of diverse omics technologies, genome, transcriptome, proteome and metabolome data for A. niger are increasing continuously, heralding an era of big data for research on the A. niger fermentation process. Analysis of single-omics data, comparison across multiple omics, and integration of multi-omics data based on the genome-scale metabolic network model have greatly extended the systematic understanding of the efficient production mechanism of A. niger. This also provides possibilities for the rational global optimization of strain performance by genetic modification and process regulation. We reviewed and summarized progress in omics research of A. niger, and proposed the development direction of omics research on this cell factory.
Heo, Min Seok; Shin, Jong Hee; Choi, Min Ji; Park, Yeon Joon; Lee, Hye Soo; Koo, Sun Hoe; Lee, Won Gil; Kim, Soo Hyun; Shin, Myung Geun; Suh, Soon Pal; Ryang, Dong Wook
2015-11-01
We investigated the species distribution and amphotericin B (AMB) susceptibility of Korean clinical Aspergillus isolates by using two Etests and the CLSI broth microdilution method. A total of 136 Aspergillus isolates obtained from 11 university hospitals were identified by sequencing the internal transcribed spacer (ITS) and β-tubulin genomic regions. Minimal inhibitory concentrations (MICs) of AMB were determined in Etests using Mueller-Hinton agar (Etest-MH) and RPMI agar (Etest-RPG), and categorical agreement with the CLSI method was assessed by using epidemiological cutoff values. ITS sequencing identified the following six Aspergillus species complexes: Aspergillus fumigatus (42.6% of the isolates), A. niger (23.5%), A. flavus (17.6%), A. terreus (11.0%), A. versicolor (4.4%), and A. ustus (0.7%). Cryptic species identifiable by β-tubulin sequencing accounted for 25.7% (35/136) of the isolates. Of all 136 isolates, 36 (26.5%) had AMB MICs of ≥2 μg/mL by the CLSI method. The categorical agreement of Etest-RPG with the CLSI method was 98% for the A. fumigatus, A. niger, and A. versicolor complexes, 87% for the A. terreus complex, and 37.5% for the A. flavus complex. That of Etest-MH was ≤75% for the A. niger, A. flavus, A. terreus, and A. versicolor complexes but was higher for the A. fumigatus complex (98.3%). Aspergillus species other than A. fumigatus constitute about 60% of clinical Aspergillus isolates, and reduced AMB susceptibility is common among clinical isolates of Aspergillus in Korea. Molecular identification and AMB susceptibility testing by Etest-RPG may be useful for characterizing Aspergillus isolates of clinical relevance.
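Categorical agreement, as used in this abstract, can be sketched in Python as the share of isolates that two methods place on the same side of an epidemiological cutoff value (ECV). The MIC values and the ECV below are invented for illustration, the function name is hypothetical, and treating "MIC at or below the ECV" as wild type is an assumption about the convention used.

```python
# Categorical agreement between a reference method and a test method.
# Assumption: an isolate is "wild type" when its MIC is <= the ECV.
# All MICs and the ECV here are made-up example values (in ug/mL).

def categorical_agreement(ref_mics, test_mics, ecv):
    """Percent of isolates that both methods put in the same category."""
    pairs = list(zip(ref_mics, test_mics))
    same = sum((ref <= ecv) == (test <= ecv) for ref, test in pairs)
    return 100.0 * same / len(pairs)

reference = [0.5, 1, 2, 4]   # e.g. broth microdilution MICs (hypothetical)
etest = [0.5, 2, 2, 1]       # e.g. Etest MICs for the same isolates
print(categorical_agreement(reference, etest, ecv=2))
```

In this toy data only the last isolate is categorized differently by the two methods.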
The state of proteome profiling in the fungal genus Aspergillus.
Kim, Yonghyun; Nandakumar, M P; Marten, Mark R
2008-03-01
Aspergilli are an important genus of filamentous fungi that contribute to a multibillion dollar industry. Since the sequencing of many fungal genomes was recently completed, it would be advantageous to profile their proteomes to better understand the fungal cell factory. Here, we review proteomic data generated for the Aspergilli in recent years. Thus far, a combined total of 28 cell surface, 102 secreted and 139 intracellular proteins have been identified based on 10 different studies on Aspergillus proteomics. A summary proteome map highlighting identified proteins in major metabolic pathways is presented.
Analysis of Aspergillus nidulans metabolism at the genome-scale
David, Helga; Özçelik, İlknur Ş; Hofmann, Gerald; Nielsen, Jens
2008-01-01
Background Aspergillus nidulans is a member of a diverse group of filamentous fungi, sharing many of the properties of its close relatives with significance in the fields of medicine, agriculture and industry. Furthermore, A. nidulans has been a classical model organism for studies of developmental biology and gene regulation, and thus it has become one of the best-characterized filamentous fungi. It was the first Aspergillus species to have its genome sequenced, and automated gene prediction tools predicted 9,451 open reading frames (ORFs) in the genome, of which less than 10% were assigned a function. Results In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of relevant secondary metabolites, was reconstructed based on detailed metabolic reconstructions available for A. niger and Saccharomyces cerevisiae, and information on the genetics, biochemistry and physiology of A. nidulans. Thereby, it was possible to identify metabolic functions without a gene associated, and to look for candidate ORFs in the genome of A. nidulans by comparing its sequence to sequences of well-characterized genes in other species encoding the function of interest. A classification system, based on defined criteria, was developed for evaluating and selecting the ORFs among the candidates, in an objective and systematic manner. The functional assignments served as a basis to develop a mathematical model, linking 666 genes (both previously and newly annotated) to metabolic roles.
The model was used to simulate metabolic behavior and additionally to integrate, analyze and interpret large-scale gene expression data concerning a study on glucose repression, thereby providing a means of upgrading the information content of experimental data and getting further insight into this phenomenon in A. nidulans. Conclusion We demonstrate how pathway modeling of A. nidulans can be used as an approach to improve the functional annotation of the genome of this organism. Furthermore we show how the metabolic model establishes functional links between genes, enabling the upgrade of the information content of transcriptome data. PMID:18405346
Jin, Feng-Jie; Katayama, Takuya; Maruyama, Jun-Ichi; Kitamoto, Katsuhiko
2016-11-01
Genomic mapping of mutations using next-generation sequencing technologies has facilitated the identification of genes contributing to fundamental biological processes, including human diseases. However, few studies have used this approach to identify mutations contributing to heterologous protein production in industrial strains of filamentous fungi, such as Aspergillus oryzae. In a screening of A. oryzae strains that hyper-produce human lysozyme (HLY), we previously isolated an AUT1 mutant that showed higher production of various heterologous proteins; however, the underlying factors contributing to the increased heterologous protein production remained unclear. Here, using a comparative genomic approach performed with whole-genome sequences, we attempted to identify the genes responsible for the high-level production of heterologous proteins in the AUT1 mutant. The comparative sequence analysis led to the detection of a gene (AO090120000003), designated autA, which was predicted to encode an unknown cytoplasmic protein containing an alpha/beta-hydrolase fold domain. Mutation or deletion of autA was associated with higher production levels of HLY. Specifically, the HLY yields of the autA mutant and deletion strains were twofold higher than that of the control strain during the early stages of cultivation. Taken together, these results indicate that combining classical mutagenesis approaches with comparative genomic analysis facilitates the identification of novel genes involved in heterologous protein production in filamentous fungi.
LAMP-PCR detection of ochratoxigenic Aspergillus species collected from peanut kernel.
Al-Sheikh, H M
2015-01-30
Over the last decade, ochratoxin A (OTA) has been widely described and is ubiquitous in several agricultural products. Ochratoxins represent the second-most important mycotoxin group after aflatoxins. A total of 34 samples were surveyed from 3 locations, including Mecca, Madina, and Riyadh, Saudi Arabia, during 2012. Fungal contamination frequency was determined for surface-sterilized peanut seeds, which were seeded onto malt extract agar media. Aspergillus niger (35%), Aspergillus ochraceus (30%), and Aspergillus carbonarius (25%) were the most frequently observed Aspergillus species, while Aspergillus flavus and Aspergillus phoenicis isolates were only infrequently recovered and in small numbers (10%). OTA production was evaluated on yeast extract sucrose medium, which revealed that 57% of the isolates were A. niger and 60% of A. carbonarius isolates were OTA producers; 100% belonged to A. ochraceus. Only one isolate, morphologically identified as A. carbonarius, and 3 A. niger isolates unstably produced OTA. A polymerase chain reaction (PCR)-based identification and detection assay was used to identify A. ochraceus isolates. Using the primer sets OCRA1/OCRA2, 400-base pair PCR fragments were produced only when genomic DNA from A. ochraceus isolates was used. Recently, the loop-mediated isothermal amplification assay using recombinase polymerase amplification chemistry was used for A. carbonarius and A. niger DNA identification. As a non-gel-based technique, the amplification product was directly visualized in the reaction tube after adding calcein for naked-eye examination.
Mizutani, Osamu; Arazoe, Takayuki; Toshida, Kenji; Hayashi, Risa; Ohsato, Shuichi; Sakuma, Tetsushi; Yamamoto, Takashi; Kuwata, Shigeru; Yamada, Osamu
2017-03-01
Transcription activator-like effector nucleases (TALENs), which can generate DNA double-strand breaks at specific sites in the desired genome locus, have been used in many organisms as a tool for genome editing. In Aspergilli, including Aspergillus oryzae, however, the use of TALENs has not been validated. In this study, we performed genome editing of A. oryzae wild-type strain via error of nonhomologous end-joining (NHEJ) repair by transient expression of high-efficiency Platinum-Fungal TALENs (PtFg TALENs). Targeted mutations were observed as various mutation patterns. In particular, approximately half of the PtFg TALEN-mediated deletion mutants had deletions larger than 1 kb in the TALEN-targeting region. We also conducted PtFg TALEN-based genome editing in A. oryzae ligD disruptant (ΔligD) lacking the ligD gene involved in the final step of the NHEJ repair and found that mutations were still obtained as well as wild-type. In this case, the ratio of the large deletions reduced compared to PtFg TALEN-based genome editing in the wild-type. In conclusion, we demonstrate that PtFg TALENs are sufficiently functional to cause genome editing via error of NHEJ in A. oryzae. In addition, we reveal that genome editing using TALENs in A. oryzae tends to cause large deletions at the target region, which were partly suppressed by deletion of ligD. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Xu, Defeng; Pan, Li; Zhao, Haifeng; Zhao, Mouming; Sun, Jiaxin; Liu, Dongmei
2011-09-01
Acid protease is essential for degradation of proteins during soy sauce fermentation. To breed more suitable koji molds with high activity of acid protease, interspecific genome recombination between A. oryzae and A. niger was performed. Through stabilization with d-camphor and haploidization with benomyl, several stable fusants with higher activity of acid protease were obtained, showing different degrees of improvement in acid protease activity compared with the parental strain A. oryzae. In addition, analyses of mycelial morphology, expression profiles of extracellular proteins, esterase isoenzyme profiles, and random amplified polymorphic DNA (RAPD) were applied to identify the fusants through their phenotypic and genetic relationships. Morphology analysis of the mycelial shape of fusants indicated a phenotype intermediate between A. oryzae and A. niger. The profiles of extracellular proteins and esterase isoenzyme electrophoresis showed the occurrence of genome recombination during or after protoplast fusion. The dendrogram constructed from RAPD data revealed great heterogeneity, and genetic dissimilarity indices showed there were considerable differences between the fusants and their parental strains. This investigation suggests that genome recombination is a powerful tool for improvement of food-grade industrial strains. Furthermore, the presented strain improvement procedure will be applicable for widespread use for other industrial strains.
Park, Ju Heon; Shin, Jong Hee; Choi, Min Ji; Choi, Jin Un; Park, Yeon-Joon; Jang, Sook Jin; Won, Eun Jeong; Kim, Soo Hyun; Kee, Seung Jung; Shin, Myung Geun; Suh, Soon Pal
2017-01-01
We evaluated the ability of the Filamentous Fungi Library 1.0 of the MALDI-TOF MS Biotyper system to identify 345 clinical Aspergillus isolates from 11 Korean hospitals. Compared with results of the internal transcribed spacer region sequencing, the frequencies of correct identification at the species-complex level were 94.5% and 98.8% with cutoff values of 2.0 and 1.7, respectively. Compared with results of β-tubulin gene sequencing, the frequencies of correct identification at the species level were 96.0% (cutoff 2.0) and 100% (cutoff 1.7) for 303 Aspergillus isolates of five common, non-cryptic species, but only 4.8% (cutoff 1.7) and 0% (cutoff 2.0) for 42 Aspergillus isolates of six cryptic species (identifiable by β-tubulin or calmodulin sequencing). These results show that the MALDI Biotyper using the Filamentous Fungi Library version 1.0 enables reliable identification of the majority of common clinical Aspergillus isolates, although the database should be expanded to facilitate identification of cryptic species. Copyright © 2016 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andersen, Mikael R.; Salazar, Margarita; Schaap, Peter
2011-06-01
The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases and protein transporters.
An Integrated Molecular Database on Indian Insects.
Pratheepa, Maria; Venkatesan, Thiruvengadam; Gracy, Gandhi; Jalali, Sushil Kumar; Rangheswaran, Rajagopal; Antony, Jomin Cruz; Rai, Anil
2018-01-01
MOlecular Database on Indian Insects (MODII) is an online database linking several databases like Insect Pest Info, Insect Barcode Information System (IBIn), Insect Whole Genome sequence, Other Genomic Resources of National Bureau of Agricultural Insect Resources (NBAIR), Whole Genome sequencing of Honey bee viruses, Insecticide resistance gene database and Genomic tools. This database was developed with a holistic approach for collecting information about phenomic and genomic information of agriculturally important insects. This insect resource database is available online for free at http://cib.res.in.
Current status of genomics research on mycotoxigenic fungi
USDA-ARS's Scientific Manuscript database
Mold-produced secondary metabolites that are toxic and carcinogenic are termed mycotoxins. They are biosynthesized in a number of fungi, mainly from species in the Aspergillus, Fusarium and Penicillium genera. Mycotoxins contaminate agricultural commodities such as grains, fruits and nuts. Due to th...
The Sequenced Angiosperm Genomes and Genome Databases
Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng
2018-01-01
Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of human, animals, and the planet earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in the database reconstruction in the last few years, here we provide a comprehensive review for three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of databases and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular for specific group of researchers, while a timely-updated comprehensive database is more powerful for address of major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote the collaborative studies of important questions in plant biology. PMID:29706973
Genome mining and functional genomics for siderophore production in Aspergillus niger.
Franken, Angelique C W; Lechner, Beatrix E; Werner, Ernst R; Haas, Hubertus; Lokman, B Christien; Ram, Arthur F J; van den Hondel, Cees A M J J; de Weert, Sandra; Punt, Peter J
2014-11-01
Iron is an essential metal for many organisms, but the biologically relevant form of iron is scarce because of rapid oxidation resulting in low solubility. Simultaneously, excessive accumulation of iron is toxic. Consequently, iron uptake is a highly controlled process. In most fungal species, siderophores play a central role in iron handling. Siderophores are small iron-specific chelators that can be secreted to scavenge environmental iron or bind intracellular iron with high affinity. A second high-affinity iron uptake mechanism is reductive iron assimilation (RIA). As shown in Aspergillus fumigatus and Aspergillus nidulans, synthesis of siderophores in Aspergilli is predominantly under control of the transcription factors SreA and HapX, which are connected by a negative transcriptional feedback loop. Abolishing this fine-tuned regulation corroborates iron homeostasis, including heme biosynthesis, which could be biotechnologically of interest, e.g. the heterologous production of heme-dependent peroxidases. Aspergillus niger genome inspection identified orthologues of several genes relevant for RIA and siderophore metabolism, as well as sreA and hapX. Interestingly, genes related to synthesis of the common fungal extracellular siderophore triacetylfusarinine C were absent. Reverse-phase high-performance liquid chromatography (HPLC) confirmed the absence of triacetylfusarinine C, and demonstrated that the major secreted siderophores of A. niger are coprogen B and ferrichrome, which is also the dominant intracellular siderophore. In A. niger wild type grown under iron-replete conditions, the expression of genes involved in coprogen biosynthesis and RIA was low in the exponential growth phase but significantly induced during ascospore germination. Deletion of sreA in A. niger resulted in elevated iron uptake and increased cellular ferrichrome accumulation. 
Increased sensitivity toward phleomycin and high iron concentration reflected the toxic effects of excessive iron uptake. Moreover, SreA-deficiency resulted in increased accumulation of heme intermediates, but no significant increase in heme content. Together with the upregulation of several heme biosynthesis genes, these results reveal a complex heme regulatory mechanism. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Elusive Origins of the Extra Genes in Aspergillus oryzae
Khaldi, Nora; Wolfe, Kenneth H.
2008-01-01
The genome sequence of Aspergillus oryzae revealed unexpectedly that this species has approximately 20% more genes than its congeneric species A. nidulans and A. fumigatus. Where did these extra genes come from? Here, we evaluate several possible causes of the elevated gene number. Many gene families are expanded in A. oryzae relative to A. nidulans and A. fumigatus, but we find no evidence of ancient whole-genome duplication or other segmental duplications, either in A. oryzae or in the common ancestor of the genus Aspergillus. We show that the presence of divergent pairs of paralogs is a feature peculiar to A. oryzae and is not shared with A. nidulans or A. fumigatus. In phylogenetic trees that include paralog pairs from A. oryzae, we frequently find that one of the genes in a pair from A. oryzae has the expected orthologous relationship with A. nidulans, A. fumigatus and other species in the subphylum Eurotiomycetes, whereas the other A. oryzae gene falls outside this clade but still within the Ascomycota. We identified 456 such gene pairs in A. oryzae. Further phylogenetic analysis did not however indicate a single consistent evolutionary origin for the divergent members of these pairs. Approximately one-third of them showed phylogenies that are suggestive of horizontal gene transfer (HGT) from Sordariomycete species, and these genes are closer together in the A. oryzae genome than expected by chance, but no unique Sordariomycete donor species was identifiable. The postulated HGTs from Sordariomycetes still leave the majority of extra A. oryzae genes unaccounted for. One possible explanation for our observations is that A. oryzae might have been the recipient of many separate HGT events from diverse donors. PMID:18725939
Guo, Baozhu; Chen, Xiaoping; Dang, Phat; Scully, Brian T; Liang, Xuanqiang; Holbrook, C Corley; Yu, Jiujiang; Culbreath, Albert K
2008-01-01
Background Peanut (Arachis hypogaea L.) is an important crop economically and nutritionally, and is one of the most susceptible host crops to colonization of Aspergillus parasiticus and subsequent aflatoxin contamination. Knowledge from molecular genetic studies could help to devise strategies in alleviating this problem; however, few peanut DNA sequences are available in the public database. In order to understand the molecular basis of host resistance to aflatoxin contamination, a large-scale project was conducted to generate expressed sequence tags (ESTs) from developing seeds to identify resistance-related genes involved in defense response against Aspergillus infection and subsequent aflatoxin contamination. Results We constructed six different cDNA libraries derived from developing peanut seeds at three reproduction stages (R5, R6 and R7) from a resistant and a susceptible cultivated peanut genotypes, 'Tifrunner' (susceptible to Aspergillus infection with higher aflatoxin contamination and resistant to TSWV) and 'GT-C20' (resistant to Aspergillus with reduced aflatoxin contamination and susceptible to TSWV). The developing peanut seed tissues were challenged by A. parasiticus and drought stress in the field. A total of 24,192 randomly selected cDNA clones from six libraries were sequenced. After removing vector sequences and quality trimming, 21,777 high-quality EST sequences were generated. Sequence clustering and assembling resulted in 8,689 unique EST sequences with 1,741 tentative consensus EST sequences (TCs) and 6,948 singleton ESTs. Functional classification was performed according to MIPS functional catalogue criteria. The unique EST sequences were divided into twenty-two categories. A similarity search against the non-redundant protein database available from NCBI indicated that 84.78% of total ESTs showed significant similarity to known proteins, of which 165 genes had been previously reported in peanuts. 
There were differences in overall expression patterns in different libraries and genotypes. A number of sequences were expressed throughout all of the libraries, representing constitutive expressed sequences. In order to identify resistance-related genes with significantly differential expression, a statistical analysis to estimate the relative abundance (R) was used to compare the relative abundance of each gene transcripts in each cDNA library. Thirty six and forty seven unique EST sequences with threshold of R > 4 from libraries of 'GT-C20' and 'Tifrunner', respectively, were selected for examination of temporal gene expression patterns according to EST frequencies. Nine and eight resistance-related genes with significant up-regulation were obtained in 'GT-C20' and 'Tifrunner' libraries, respectively. Among them, three genes were common in both genotypes. Furthermore, a comparison of our EST sequences with other plant sequences in the TIGR Gene Indices libraries showed that the percentage of peanut EST matched to Arabidopsis thaliana, maize (Zea mays), Medicago truncatula, rapeseed (Brassica napus), rice (Oryza sativa), soybean (Glycine max) and wheat (Triticum aestivum) ESTs ranged from 33.84% to 79.46% with the sequence identity ≥ 80%. These results revealed that peanut ESTs are more closely related to legume species than to cereal crops, and more homologous to dicot than to monocot plant species. Conclusion The developed ESTs can be used to discover novel sequences or genes, to identify resistance-related genes and to detect the differences among alleles or markers between these resistant and susceptible peanut genotypes. Additionally, this large collection of cultivated peanut EST sequences will make it possible to construct microarrays for gene expression studies and for further characterization of host resistance mechanisms. It will be a valuable genomic resource for the peanut community. 
The 21,777 ESTs have been deposited to the NCBI GenBank database with accession numbers ES702769 to ES724546. PMID:18248674
MIPS: a database for genomes and protein sequences.
Mewes, H W; Heumann, K; Kaps, A; Mayer, K; Pfeiffer, F; Stocker, S; Frishman, D
1999-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of analysis tools to sequences of complete genomes, the systematic classification of protein sequences and the active support of sequence analysis and functional genomics projects. This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). MIPS provides access through its WWW server (http://www.mips.biochem.mpg.de) to a spectrum of generic databases, including the above mentioned as well as a database of protein families (PROTFAM), the MITOP database, and the all-against-all FASTA database. PMID:9847138
Draft Genome Sequencing and Comparative Analysis of Aspergillus sojae NBRC4239
Sato, Atsushi; Oshima, Kenshiro; Noguchi, Hideki; Ogawa, Masahiro; Takahashi, Tadashi; Oguma, Tetsuya; Koyama, Yasuji; Itoh, Takehiko; Hattori, Masahira; Hanya, Yoshiki
2011-01-01
We conducted genome sequencing of the filamentous fungus Aspergillus sojae NBRC4239 isolated from the koji used to prepare Japanese soy sauce. We used the 454 pyrosequencing technology and investigated the genome with respect to enzymes and secondary metabolites in comparison with other Aspergilli sequenced. Assembly of 454 reads generated a non-redundant sequence of 39.5-Mb possessing 13 033 putative genes and 65 scaffolds composed of 557 contigs. Of the 2847 open reading frames with Pfam domain scores of >150 found in A. sojae NBRC4239, 81.7% had a high degree of similarity with the genes of A. oryzae. Comparative analysis identified serine carboxypeptidase and aspartic protease genes unique to A. sojae NBRC4239. While A. oryzae possessed three copies of the α-amylase gene, A. sojae NBRC4239 possessed only a single copy. Comparison of 56 gene clusters for secondary metabolites between A. sojae NBRC4239 and A. oryzae revealed that 24 clusters were conserved, whereas 32 clusters differed between them, including a deletion of 18 508 bp containing mfs1, mao1, dmaT, and pks-nrps for cyclopiazonic acid (CPA) biosynthesis, which explains the lack of CPA production in A. sojae. The A. sojae NBRC4239 genome data will be useful to characterize functional features of the koji moulds used in Japanese industries. PMID:21659486
Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki
2014-08-01
Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Characterization of oxylipins and dioxygenase genes in the asexual fungus Aspergillus niger
2009-01-01
Background Aspergillus niger is an ascomycetous fungus that is known to reproduce through asexual spores, only. Interestingly, recent genome analysis of A. niger has revealed the presence of a full complement of functional genes related to sexual reproduction [1]. An example of such genes are the dioxygenase genes which in Aspergillus nidulans, have been shown to be connected to oxylipin production and regulation of both sexual and asexual sporulation [2-4]. Nevertheless, the presence of sex related genes alone does not confirm sexual sporulation in A. niger. Results The current study shows experimentally that A. niger produces the oxylipins 8,11-dihydroxy octadecadienoic acid (8,11-diHOD), 5,8-dihydroxy octadecadienoic acid (5,8-diHOD), lactonized 5,8-diHOD, 8-hydroxy octadecadienoic acid (8-HOD), 10-hydroxy octadecadienoic acid (10-HOD), small amounts of 8-hydroxy octadecamonoenoic acid (8-HOM), 9-hydroxy octadecadienoic acid (9-HOD) and 13-hydroxy octadecadienoic acid (13-HOD). Importantly, this study shows that the A. niger genome contains three putative dioxygenase genes, ppoA, ppoC and ppoD. Expression analysis confirmed that all three genes are indeed expressed under the conditions tested. Conclusion A. niger produces the same oxylipins and has similar dioxygenase genes as A. nidulans. Their presence could point towards the existence of sexual reproduction in A. niger or a broader role for the gene products in physiology, than just sexual development. PMID:19309517
Nakagawa, So; Takahashi, Mahoko Ueda
2016-01-01
In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp PMID:27242033
USDA-ARS's Scientific Manuscript database
Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...
MIPS: a database for genomes and protein sequences
Mewes, H. W.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Mayer, K.; Mokrejs, M.; Morgenstern, B.; Münsterkötter, M.; Rudd, S.; Weil, B.
2002-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project-specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz–Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91–93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155–158; Barker et al. (2001) Nucleic Acids Res., 29, 29–32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de). PMID:11752246
MIPS: analysis and annotation of proteins from whole genomes
Mewes, H. W.; Amid, C.; Arnold, R.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Münsterkötter, M.; Pagel, P.; Strack, N.; Stümpflen, V.; Warfsmann, J.; Ruepp, A.
2004-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein–protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354
Aspergillus Section Fumigati Typing by PCR-Restriction Fragment Polymorphism
Staab, Janet F.; Balajee, S. Arunmozhi; Marr, Kieren A.
2009-01-01
Recent studies have shown that there are multiple clinically important members of the Aspergillus section Fumigati that are difficult to distinguish on the basis of morphological features (e.g., Aspergillus fumigatus, A. lentulus, and Neosartorya udagawae). Identification of these organisms may be clinically important, as some species vary in their susceptibilities to antifungal agents. In a prior study, we utilized multilocus sequence typing to describe A. lentulus as a species distinct from A. fumigatus. The sequence data show that the gene encoding β-tubulin, benA, has high interspecies variability at intronic regions but is conserved among isolates of the same species. These data were used to develop a PCR-restriction fragment length polymorphism (PCR-RFLP) method that rapidly and accurately distinguishes A. fumigatus, A. lentulus, and N. udagawae, three major species within the section Fumigati that have previously been implicated in disease. Digestion of the benA amplicon with BccI generated unique banding patterns; the results were validated by screening a collection of clinical strains and by in silico analysis of the benA sequences of Aspergillus spp. deposited in the GenBank database. PCR-RFLP of benA is a simple method for the identification of clinically important, similar morphotypes of Aspergillus spp. within the section Fumigati. PMID:19403766
Aspergillus section Fumigati typing by PCR-restriction fragment polymorphism.
Staab, Janet F; Balajee, S Arunmozhi; Marr, Kieren A
2009-07-01
Recent studies have shown that there are multiple clinically important members of the Aspergillus section Fumigati that are difficult to distinguish on the basis of morphological features (e.g., Aspergillus fumigatus, A. lentulus, and Neosartorya udagawae). Identification of these organisms may be clinically important, as some species vary in their susceptibilities to antifungal agents. In a prior study, we utilized multilocus sequence typing to describe A. lentulus as a species distinct from A. fumigatus. The sequence data show that the gene encoding beta-tubulin, benA, has high interspecies variability at intronic regions but is conserved among isolates of the same species. These data were used to develop a PCR-restriction fragment length polymorphism (PCR-RFLP) method that rapidly and accurately distinguishes A. fumigatus, A. lentulus, and N. udagawae, three major species within the section Fumigati that have previously been implicated in disease. Digestion of the benA amplicon with BccI generated unique banding patterns; the results were validated by screening a collection of clinical strains and by in silico analysis of the benA sequences of Aspergillus spp. deposited in the GenBank database. PCR-RFLP of benA is a simple method for the identification of clinically important, similar morphotypes of Aspergillus spp. within the section Fumigati.
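The in silico validation step mentioned in the abstract can be illustrated with a minimal sketch of a forward-strand restriction digest. Everything here is an assumption made for illustration: the two amplicons are toy strings rather than real benA sequences, the recognition string and cut offset are placeholders that should be checked against the enzyme supplier's data for BccI, and a real analysis would also search the reverse strand.

```python
def cut_positions(seq, site, cut_offset):
    """Return sorted cut coordinates for every forward-strand match of `site`.

    `cut_offset` is counted from the first base of each match.
    Cuts that would fall outside the sequence are ignored.
    """
    seq = seq.upper()
    positions = set()
    start = seq.find(site)
    while start != -1:
        cut = start + cut_offset
        if 0 < cut < len(seq):
            positions.add(cut)
        start = seq.find(site, start + 1)
    return sorted(positions)


def fragment_lengths(seq, site, cut_offset):
    """Return the fragment lengths produced by cutting `seq` at every site."""
    cuts = [0] + cut_positions(seq, site, cut_offset) + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Hypothetical values: verify the recognition sequence and cut position
# against supplier documentation before using them for real data.
SITE = "CCATC"
OFFSET = 9

# Toy amplicons (not real benA sequences): one carries the site, one does not.
amplicon_a = "ATGC" * 3 + "CCATC" + "ATGC" * 5
amplicon_b = "ATGC" * 10

pattern_a = fragment_lengths(amplicon_a, SITE, OFFSET)
pattern_b = fragment_lengths(amplicon_b, SITE, OFFSET)
print(pattern_a, pattern_b)
```

Two sequences that differ in the number or position of sites yield different fragment-length lists, which is the property a PCR-RFLP typing scheme relies on.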
USDA-ARS's Scientific Manuscript database
Worldwide recognition that aflatoxin contamination of agricultural commodities by the fungus Aspergillus flavus is a global problem which has significantly benefitted from global collaboration for understanding the contaminating fungus as well as for developing and implementing solutions against the...
USDA-ARS's Scientific Manuscript database
The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...
Govindaraj, Mahalingam
2015-01-01
The number of sequenced crop genomes and associated genomic resources is growing rapidly with the advent of inexpensive next-generation sequencing methods. Databases have become an integral part of all aspects of scientific research, including basic and applied plant and animal sciences. The importance of databases keeps increasing as the volume of datasets from direct and indirect genomics, as well as other omics approaches, has kept expanding in recent years. The databases and associated web portals provide, at a minimum, a uniform set of tools and automated analysis across a wide range of crop plant genomes. This paper reviews some basic terms and considerations in dealing with crop plant database utilization in the advancing genomic era. The utilization of databases for variation analysis with other comparative genomics tools and data interpretation platforms is well described. The major focus of this review is to provide knowledge on platforms and databases for genome-based investigations of agriculturally important crop plants. The utilization of these databases in applied crop improvement programs is still being achieved widely; otherwise, the end for sequencing is not far away. PMID:25874133
Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; van de Vondervoort, Peter J.I.; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristian F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; van Dijck, Piet W.M.; Hofmann, Gerald; Lasure, Linda L.; Magnuson, Jon K.; Menke, Hildegard; Meijer, Martin; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; van Ooyen, Albert J.J.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hein; Tsang, Adrian; van den Brink, Johannes M.; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Grigoriev, Igor V.; Kubicek, Christian P.; Martinez, Diego; van Peij, Noël N.M.E.; Roubos, Johannes A.; Nielsen, Jens; Baker, Scott E.
2011-01-01
The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. PMID:21543515
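The SNPs/kb figures quoted above (average and maximum) are window-based densities. A minimal sketch of that calculation follows, assuming 0-based SNP coordinates on a single sequence and non-overlapping 1 kb windows; the positions are invented toy data, and the function name and windowing scheme are illustrative rather than the authors' pipeline.

```python
def snp_density_per_kb(snp_positions, seq_length, window=1000):
    """Return SNPs per kilobase for consecutive non-overlapping windows.

    A shorter final window is scaled to its actual length.
    """
    densities = []
    for start in range(0, seq_length, window):
        end = min(start + window, seq_length)
        count = sum(1 for p in snp_positions if start <= p < end)
        densities.append(count * 1000 / (end - start))
    return densities


# Toy SNP coordinates on a hypothetical 3 kb sequence.
positions = [10, 250, 990, 1500, 2100, 2200, 2300, 2999]
densities = snp_density_per_kb(positions, 3000)

mean_density = sum(densities) / len(densities)
max_density = max(densities)
print(densities, mean_density, max_density)
```

Reporting both the mean and the maximum, as the abstract does, separates genome-wide divergence from local hotspots.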
Standards for Clinical Grade Genomic Databases.
Yohe, Sophia L; Carter, Alexis B; Pfeifer, John D; Crawford, James M; Cushman-Vokoun, Allison; Caughron, Samuel; Leonard, Debra G B
2015-11-01
Context: Next-generation sequencing performed in a clinical environment must meet clinical standards, which requires reproducibility of all aspects of the testing. Clinical-grade genomic databases (CGGDs) are required to classify a variant and to assist in the professional interpretation of clinical next-generation sequencing. Applying quality laboratory standards to the reference databases used for sequence-variant interpretation presents a new challenge for validation and curation. Objective: To define CGGD and the categories of information contained in CGGDs and to frame recommendations for the structure and use of these databases in clinical patient care. Design: Members of the College of American Pathologists Personalized Health Care Committee reviewed the literature and existing state of genomic databases and developed a framework for guiding CGGD development in the future. Results: Clinical-grade genomic databases may provide different types of information. This work group defined 3 layers of information in CGGDs: clinical genomic variant repositories, genomic medical data repositories, and genomic medicine evidence databases. The layers are differentiated by the types of genomic and medical information contained and the utility in assisting with clinical interpretation of genomic variants. Conclusions: Clinical-grade genomic databases must meet specific standards regarding submission, curation, and retrieval of data, as well as the maintenance of privacy and security. These organizing principles for CGGDs should serve as a foundation for future development of specific standards that support the use of such databases for patient care.
Li, Ying; Wang, He; Zhao, Yu-Pei; Xu, Ying-Chun; Hsueh, Po-Ren
2017-01-01
We evaluated the accuracy of the Bruker Biotyper matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) system at identifying clinical isolates of Aspergillus species that were grown on agar media. A total of 381 non-duplicate Aspergillus isolates representing 21 different Aspergillus species identified by molecular analysis were included in this study. The Bruker Biotyper MALDI-TOF MS system was able to identify 30.2% (115/381) of the isolates to the species level (score values of ≥2.000) and 49.3% to the genus level (score values of 1.700–1.999). When the identification cutoff value was lowered from ≥2.000 to ≥1.700, the species-level identification rate increased to 79.5% with a slight rise of false identification from 2.6 to 5.0%. From another aspect, a correct species-level identification rate of 89% could be reached by the Bruker Biotyper MALDI-TOF MS system regardless of the score values obtained. The Bruker Biotyper MALDI-TOF MS system had a moderate performance in identification of Aspergillus directly inoculated on solid agar media. Continued expansion of the Bruker Biotyper MALDI-TOF MS database and adoption of alternative cutoff values for interpretation are required to improve the performance of the system for identifying highly diverse species of clinically encountered Aspergillus isolates. PMID:28706514
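The score cutoffs and percentages in this abstract can be restated as a small sketch. The two thresholds and the counts come from the abstract; the function name and the "none" label for scores below 1.700 are assumptions added for illustration.

```python
def classify_score(score, species_cutoff=2.000, genus_cutoff=1.700):
    """Map an identification score to the level it supports.

    Cutoffs follow the abstract; the "none" label for lower scores
    is a hypothetical name, not taken from the source.
    """
    if score >= species_cutoff:
        return "species"
    if score >= genus_cutoff:
        return "genus"
    return "none"


# Counts reported in the abstract: 115 of 381 isolates scored >= 2.000.
total_isolates = 381
species_level = 115
pct_species = round(100 * species_level / total_isolates, 1)

# Lowering the species cutoff to 1.700 folds the 1.700-1.999 band (49.3%)
# into the species-level group.
pct_lowered = round(30.2 + 49.3, 1)

print(pct_species, pct_lowered)
print(classify_score(1.85), classify_score(1.85, species_cutoff=1.700))
```

Passing `species_cutoff=1.700` reproduces the relaxed interpretation discussed in the abstract, under which a score of 1.85 counts as a species-level call.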
Private and Efficient Query Processing on Outsourced Genomic Databases.
Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian
2017-09-01
Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing genomic sequence is a time consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations, and thus, not available for public usage. Cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases in a centralized cloud server to ease the access of their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 Single Nucleotide Polymorphisms (SNPs) in a database of 20 000 records takes around 100 and 150 s, respectively.
Private and Efficient Query Processing on Outsourced Genomic Databases
Ghasemi, Reza; Al Aziz, Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian
2017-01-01
Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, sequencing genomic sequence is a time-consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations and thus not available for public usage. Cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases in a centralized cloud server to ease the access of their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider that may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 SNPs in a database of 20,000 records takes around 100 and 150 seconds, respectively. PMID:27834660
ReprDB and panDB: minimalist databases with maximal microbial representation.
Zhou, Wei; Gay, Nicole; Oh, Julia
2018-01-18
Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.
USDA-ARS's Scientific Manuscript database
StuA, first discovered in Aspergillus nidulans and a member of the APSES class of transcription factors, regulates several essential developmental stages in fungi such as virulence, sporulation and toxin production in phytopathogenic fungi. Fusarium verticillioides (Fv), a maize phytopathogen, produ...
Aspergillus, Penicillium and Talaromyces isolated from house dust samples collected around the world
Visagie, C.M.; Hirooka, Y.; Tanney, J.B.; Whitfield, E.; Mwange, K.; Meijer, M.; Amend, A.S.; Seifert, K.A.; Samson, R.A.
2014-01-01
As part of a worldwide survey of the indoor mycobiota, dust was collected from nine countries. Analyses of dust samples included the culture-dependent dilution-to-extinction method and the culture-independent 454-pyrosequencing. Of the 7 904 isolates, 2 717 isolates were identified as belonging to Aspergillus, Penicillium and Talaromyces. The aim of this study was to identify isolates to species level and describe the new species found. Secondly, we wanted to create a reliable reference sequence database to be used for next-generation sequencing projects. Isolates represented 59 Aspergillus species, including eight undescribed species, 49 Penicillium species of which seven were undescribed and 18 Talaromyces species including three described here as new. In total, 568 ITS barcodes were generated, and 391 β-tubulin and 507 calmodulin sequences, which serve as alternative identification markers. PMID:25492981
Tam, Emily W T; Chen, Jonathan H K; Lau, Eunice C L; Ngan, Antonio H Y; Fung, Kitty S C; Lee, Kim-Chung; Lam, Ching-Wan; Yuen, Kwok-Yung; Lau, Susanna K P; Woo, Patrick C Y
2014-04-01
Aspergillus nomius and Aspergillus tamarii are Aspergillus species that phenotypically resemble Aspergillus flavus. In the last decade, a number of case reports have identified A. nomius and A. tamarii as causes of human infections. In this study, using an internal transcribed spacer, β-tubulin, and calmodulin gene sequencing, only 8 of 11 clinical isolates reported as A. flavus in our clinical microbiology laboratory by phenotypic methods were identified as A. flavus. The other three isolates were A. nomius (n = 2) or A. tamarii (n = 1). The results corresponded with those of metabolic fingerprinting, in which the A. flavus, A. nomius, and A. tamarii strains were separated into three clusters based on ultra-high-performance liquid chromatography-tandem mass spectrometry (UHPLC MS) analysis. The first two patients with A. nomius infections had invasive aspergillosis and chronic cavitary and fibrosing pulmonary and pleural aspergillosis, respectively, whereas the third patient had A. tamarii colonization of the airway. Identification of the 11 clinical isolates and three reference strains by matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) showed that only six of the nine strains of A. flavus were identified correctly. None of the strains of A. nomius and A. tamarii was correctly identified. β-Tubulin or the calmodulin gene should be the gene target of choice for identifying A. flavus, A. nomius, and A. tamarii. To improve the usefulness of MALDI-TOF MS, the number of strains for each species in MALDI-TOF MS databases should be expanded to cover intraspecies variability.
Tam, Emily W. T.; Chen, Jonathan H. K.; Lau, Eunice C. L.; Ngan, Antonio H. Y.; Fung, Kitty S. C.; Lee, Kim-Chung; Lam, Ching-Wan; Yuen, Kwok-Yung
2014-01-01
Aspergillus nomius and Aspergillus tamarii are Aspergillus species that phenotypically resemble Aspergillus flavus. In the last decade, a number of case reports have identified A. nomius and A. tamarii as causes of human infections. In this study, using an internal transcribed spacer, β-tubulin, and calmodulin gene sequencing, only 8 of 11 clinical isolates reported as A. flavus in our clinical microbiology laboratory by phenotypic methods were identified as A. flavus. The other three isolates were A. nomius (n = 2) or A. tamarii (n = 1). The results corresponded with those of metabolic fingerprinting, in which the A. flavus, A. nomius, and A. tamarii strains were separated into three clusters based on ultra-high-performance liquid chromatography-tandem mass spectrometry (UHPLC MS) analysis. The first two patients with A. nomius infections had invasive aspergillosis and chronic cavitary and fibrosing pulmonary and pleural aspergillosis, respectively, whereas the third patient had A. tamarii colonization of the airway. Identification of the 11 clinical isolates and three reference strains by matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) showed that only six of the nine strains of A. flavus were identified correctly. None of the strains of A. nomius and A. tamarii was correctly identified. β-Tubulin or the calmodulin gene should be the gene target of choice for identifying A. flavus, A. nomius, and A. tamarii. To improve the usefulness of MALDI-TOF MS, the number of strains for each species in MALDI-TOF MS databases should be expanded to cover intraspecies variability. PMID:24452174
Xyloglucan breakdown by endo-xyloglucanase family 74 from Aspergillus fumigatus.
Damasio, André Ricardo de Lima; Rubio, Marcelo Ventura; Gonçalves, Thiago Augusto; Persinoti, Gabriela Felix; Segato, Fernando; Prade, Rolf Alexander; Contesini, Fabiano Jares; de Souza, Amanda Pereira; Buckeridge, Marcos Silveira; Squina, Fabio Marcio
2017-04-01
Xyloglucan is the most abundant hemicellulose in primary walls of spermatophytes except for grasses. Xyloglucan-degrading enzymes are important in lignocellulosic biomass hydrolysis because they remove xyloglucan, which is abundant in monocot-derived biomass. Fungal genomes encode numerous xyloglucanase genes, belonging to at least six glycoside hydrolase (GH) families. GH74 endo-xyloglucanases cleave xyloglucan backbones with unsubstituted glucose at the -1 subsite or prefer xylosyl-substituted residues in the -1 subsite. In this work, 137 GH74-related genes were detected by examining 293 Eurotiomycete genomes and Ascomycete fungi contained one or no GH74 xyloglucanase gene per genome. Another interesting feature is that the triad of tryptophan residues along the catalytic cleft was found to be widely conserved among Ascomycetes. The GH74 from Aspergillus fumigatus (AfXEG74) was chosen as an example to conduct comprehensive biochemical studies to determine the catalytic mechanism. AfXEG74 has no CBM and cleaves the xyloglucan backbone between the unsubstituted glucose and xylose-substituted glucose at specific positions, along the XX motif when linked to regions deprived of galactosyl branches. It resembles an endo-processive activity, which after initial random hydrolysis releases xyloglucan-oligosaccharides as major reaction products. This work provides insights on phylogenetic diversity and catalytic mechanism of GH74 xyloglucanases from Ascomycete fungi.
Thammarongtham, Chinae; Nookaew, Intawat; Vorapreeda, Tayvich; ...
2017-09-01
The selected robust fungus, Aspergillus oryzae strain BCC7051, is of interest for biotechnological production of lipid-derived products due to its capability to accumulate high amounts of intracellular lipids using various sugars and agro-industrial substrates. Here, we report the genome sequence of the oleaginous A. oryzae BCC7051. The obtained reads were de novo assembled into 25 scaffolds spanning 38,550,958 bp, with 11,456 predicted protein-coding genes. By synteny mapping, a large rearrangement was found in two scaffolds of A. oryzae BCC7051 as compared to the reference RIB40 strain. The genetic relationship between BCC7051 and other strains of A. oryzae in terms of aflatoxin production was investigated, indicating that A. oryzae BCC7051 was categorized as a group 2 nonaflatoxin-producing strain. Moreover, a comparative analysis of the structural genes focusing on the involvement in lipid metabolism among oleaginous yeast and fungi revealed the presence of multiple isoforms of metabolic enzymes responsible for fatty acid synthesis in BCC7051. The alternative routes of acetyl-CoA generation as oleaginous features and malate/citrate/pyruvate shuttle were also identified in this A. oryzae strain. The genome sequence generated in this work is a dedicated resource for expanding genome-wide study of microbial lipids at systems level, and developing the fungal-based platform for production of diversified lipids with commercial relevance.
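As a rough illustration of the assembly figures quoted in this abstract, the following Python sketch derives a mean scaffold length and a gene density from the reported totals. Only the three input numbers come from the abstract; the function name is hypothetical, and the derived values are plain averages, not statistics reported by the authors.

```python
def assembly_summary(total_bp: int, n_scaffolds: int, n_genes: int) -> dict:
    """Derive simple averages from assembly totals (illustrative helper, not from the paper)."""
    if total_bp <= 0 or n_scaffolds <= 0 or n_genes <= 0:
        raise ValueError("all inputs must be positive")
    return {
        # Average scaffold size; says nothing about the size distribution (e.g. N50).
        "mean_scaffold_bp": total_bp / n_scaffolds,
        # Predicted protein-coding genes per megabase of assembled sequence.
        "genes_per_mb": n_genes / (total_bp / 1_000_000),
        # Average assembled sequence per predicted gene.
        "mean_bp_per_gene": total_bp / n_genes,
    }


# Totals as quoted for A. oryzae BCC7051.
stats = assembly_summary(total_bp=38_550_958, n_scaffolds=25, n_genes=11_456)
for name, value in stats.items():
    print(f"{name}: {value:,.1f}")
```

These averages are only a coarse summary; comparing assemblies in practice would also call for contiguity metrics that cannot be derived from the totals given here.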
Zheng, Xiaomei; Zheng, Ping; Sun, Jibin; Kun, Zhang; Ma, Yanhe
2018-01-01
U6 promoters have been used for single guide RNA (sgRNA) transcription in the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein (CRISPR/Cas9) genome editing system. However, no available U6 promoters have been identified in Aspergillus niger, which is an important industrial platform for organic acid and protein production. Two CRISPR/Cas9 systems established in A. niger have recourse to the RNA polymerase II promoter or in vitro transcription for sgRNA synthesis, but these approaches generally increase cloning efforts and genetic manipulation. The validation of functional RNA polymerase III promoters is therefore an urgent need for A. niger. Here, we developed a novel CRISPR/Cas9 system in A. niger for sgRNA expression, based on one endogenous U6 promoter and two heterologous U6 promoters. The three tested U6 promoters enabled sgRNA transcription and the disruption of the polyketide synthase albA gene in A. niger. Furthermore, this system enabled highly efficient gene insertion at the targeted genome loci in A. niger using donor DNAs with homologous arms as short as 40 bp. This study demonstrated that both heterologous and endogenous U6 promoters were functional for sgRNA expression in A. niger. Based on this result, a novel and simple CRISPR/Cas9 toolbox was established in A. niger that will benefit future gene functional analysis and genome editing.
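The short-arm donor design mentioned in this abstract can be pictured with a minimal Python sketch: take 40 bp on each side of the intended cut position and place the insert between them. The function name, the toy sequence and the cut coordinate are all hypothetical, and the sketch ignores practical concerns such as protospacer disruption or selection markers; it is not the authors' cloning protocol.

```python
def build_donor(locus: str, cut_site: int, insert: str, arm_len: int = 40) -> str:
    """Return left arm + insert + right arm for a cut between locus[cut_site-1] and locus[cut_site].

    Illustrative only: assumes `locus` is a plain DNA string on the strand of interest.
    """
    if arm_len <= 0:
        raise ValueError("arm_len must be positive")
    if cut_site < arm_len or cut_site + arm_len > len(locus):
        raise ValueError("not enough flanking sequence for the requested arm length")
    left_arm = locus[cut_site - arm_len:cut_site]    # sequence immediately upstream of the cut
    right_arm = locus[cut_site:cut_site + arm_len]   # sequence immediately downstream of the cut
    return left_arm + insert + right_arm


# Toy example with a made-up 120-bp locus and a made-up 8-bp insert.
toy_locus = "ACGT" * 30
donor = build_donor(toy_locus, cut_site=60, insert="GGATCCAA")
print(len(donor))
```

Keeping the arm length as a parameter makes it easy to compare short arms with the longer arms traditionally used for homologous recombination in filamentous fungi.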
Kim, ChangKug; Park, DongSuk; Seol, YoungJoo; Hahn, JangHo
2011-01-01
The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage. PMID:21887015
McMullen, Allison R; Wallace, Meghan A; Pincus, David H; Wilkey, Kathy; Burnham, C A
2016-08-01
Invasive fungal infections have a high rate of morbidity and mortality, and accurate identification is necessary to guide appropriate antifungal therapy. With the increasing incidence of invasive disease attributed to filamentous fungi, rapid and accurate species-level identification of these pathogens is necessary. Traditional methods for identification of filamentous fungi can be slow and may lack resolution. Matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has emerged as a rapid and accurate method for identification of bacteria and yeasts, but a paucity of data exists on the performance characteristics of this method for identification of filamentous fungi. The objective of our study was to evaluate the accuracy of the Vitek MS for mold identification. A total of 319 mold isolates representing 43 genera recovered from clinical specimens were evaluated. Of these isolates, 213 (66.8%) were correctly identified using the Vitek MS Knowledge Base, version 3.0 database. When a modified SARAMIS (Spectral Archive and Microbial Identification System) database was used to augment the version 3.0 Knowledge Base, 245 (76.8%) isolates were correctly identified. Unidentified isolates were subcultured for repeat testing; 71/319 (22.3%) remained unidentified. Of the unidentified isolates, 69 were not in the database. Only 3 (0.9%) isolates were misidentified by MALDI-TOF MS (including Aspergillus amoenus [n = 2] and Aspergillus calidoustus [n = 1]) although 10 (3.1%) of the original phenotypic identifications were not correct. In addition, this methodology was able to accurately identify 133/144 (93.6%) Aspergillus sp. isolates to the species level. MALDI-TOF MS has the potential to expedite mold identification, and misidentifications are rare. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Juvvadi, Praveen Rao; Seshime, Yasuyo; Kitamoto, Katsuhiko
2005-12-01
Fungal secondary metabolites constitute a wide variety of compounds which either play a vital role in agricultural, pharmaceutical and industrial contexts, or have devastating effects on agriculture, animal and human affairs by virtue of their toxigenicity. Owing to their beneficial and deleterious characteristics, these complex compounds and the genes responsible for their synthesis have been the subjects of extensive investigation by microbiologists and pharmacologists. A majority of the fungal secondary metabolic genes are classified as type I polyketide synthases (PKS) which are often clustered with other secondary metabolism related genes. In this review we discuss the significance of our recent discovery of chalcone synthase (CHS) genes belonging to the type III PKS superfamily in an industrially important fungus, Aspergillus oryzae. CHS genes are known to play a vital role in the biosynthesis of flavonoids in plants. Comparative genome analyses revealed the unique character of A. oryzae with four CHS-like genes (csyA, csyB, csyC and csyD) amongst other Aspergilli (Aspergillus nidulans and Aspergillus fumigatus) which contained none of the CHS-like genes. Some other fungi such as Neurospora crassa, Fusarium graminearum, Magnaporthe grisea, Podospora anserina and Phanerochaete chrysosporium also contained putative type III PKSs, with a phylogenetic distinction from bacteria and plants. The enzymatically active nature of these newly discovered homologues is expected owing to the conservation in the catalytic residues across the different species of plants and fungi, and also by the fact that a majority of these genes (csyA, csyB and csyD) were expressed in A. oryzae. While this finding brings filamentous fungi closer to plants and bacteria, which until recently were the only ones considered to possess the type III PKSs, the presence of putative genes encoding other principal enzymes involved in the phenylpropanoid and flavonoid biosynthesis (viz., phenylalanine ammonia-lyase, cinnamic acid hydroxylase and p-coumarate CoA ligase) in the A. oryzae genome undoubtedly proves the extent of its metabolic diversity. Since many of these genes have not been identified earlier, knowledge of their corresponding products or activities remains undeciphered. In future, it is anticipated that these enzymes may be reasonable targets for metabolic engineering in fungi to produce agriculturally and nutritionally important metabolites.
Zwane, Eunice N; Rose, Shaunita H; van Zyl, Willem H; Rumbold, Karl; Viljoen-Bloom, Marinda
2014-06-01
The production of ferulic acid esterase involved in the release of ferulic acid side groups from xylan was investigated in strains of Aspergillus tubingensis, Aspergillus carneus, Aspergillus niger and Rhizopus oryzae. The highest activity on triticale bran as sole carbon source was observed with the A. tubingensis T8.4 strain, which produced a type A ferulic acid esterase active against methyl p-coumarate, methyl ferulate and methyl sinapate. The activity of the A. tubingensis ferulic acid esterase (AtFAEA) was inhibited twofold by glucose and induced twofold in the presence of maize bran. An initial accumulation of endoglucanase was followed by the production of endoxylanase, suggesting a combined action with ferulic acid esterase on maize bran. A genomic copy of the A. tubingensis faeA gene was cloned and expressed in A. niger D15#26 under the control of the A. niger gpd promoter. The recombinant strain has reduced protease activity and does not acidify the media, therefore promoting high-level expression of recombinant enzymes. It produced 13.5 U/ml FAEA after 5 days on autoclaved maize bran as sole carbon source, which was threefold higher than for the A. tubingensis donor strain. The recombinant AtFAEA was able to extract 50 % of the available ferulic acid from non-pretreated maize bran, making this enzyme suitable for the biological production of ferulic acid from lignocellulosic plant material.
Massi, Fernanda Pelisson; Sartori, Daniele; de Souza Ferranti, Larissa; Iamanaka, Beatriz Thie; Taniwaki, Marta Hiromi; Vieira, Maria Lucia Carneiro; Fungaro, Maria Helena Pelegrinelli
2016-03-16
Aspergillus niger "aggregate" is an informal taxonomic rank that represents a group of species from the section Nigri. Among A. niger "aggregate" species Aspergillus niger sensu stricto and its cryptic species Aspergillus welwitschiae (=Aspergillus awamori sensu Perrone) are proven as ochratoxin A and fumonisin B2 producing species. A. niger has been frequently found in tropical and subtropical foods. A. welwitschiae is a new species, which was recently dismembered from the A. niger taxon. These species are morphologically very similar and molecular data are indispensable for their identification. A total of 175 Brazilian isolates previously identified as A. niger collected from dried fruits, Brazil nuts, coffee beans, grapes, cocoa and onions were investigated in this study. Based on partial calmodulin gene sequences about one-half of our isolates were identified as A. welwitschiae. This new species was the predominant species in onions analyzed in Brazil. A. niger and A. welwitschiae differ in their ability to produce ochratoxin A and fumonisin B2. Among A. niger isolates, approximately 32% were OTA producers, but in contrast only 1% of the A. welwitschiae isolates revealed the ability to produce ochratoxin A. Regarding fumonisin B2 production, there was a higher frequency of FB2 producing isolates in A. niger (74%) compared to A. welwitschiae (34%). Because not all A. niger and A. welwitschiae strains produce ochratoxin A and fumonisin B2, in this study a multiplex PCR was developed for detecting the presence of essential genes involved in ochratoxin (polyketide synthase and radH flavin-dependent halogenase) and fumonisin (α-oxoamine synthase) biosynthesis in the genome of A. niger and A. welwitschiae isolates. The frequency of strains harboring the mycotoxin genes was markedly different between A. niger and A. welwitschiae. All OTA producing isolates of A. niger and A. welwitschiae showed in their genome the pks and radH genes, and 95.2% of the nonproducing isolates did not contain these genes. The α-oxoamine synthase gene was detected in 100% and 36% of the A. niger and A. welwitschiae isolates, respectively. The loss of ochratoxin A production in A. niger and A. welwitschiae is highly associated with gene deletions within the ochratoxin biosynthetic gene cluster. The loss of fumonisin production in A. welwitschiae is associated with gene deletions within the fumonisin biosynthetic gene cluster, but this is not the case with A. niger. Published by Elsevier B.V.
PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.
Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X
2017-01-01
Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade now. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality with/over many plant genome databases is provided within the transPLANT project.
High Throughput Sequence Analysis for Disease Resistance in Maize
USDA-ARS's Scientific Manuscript database
Preliminary results of a computational analysis of high throughput sequencing data from Zea mays and the fungus Aspergillus are reported. The Illumina Genome Analyzer was used to sequence RNA samples from two strains of Z. mays (Va35 and Mp313) collected over a time course as well as several specie...
Reilly, Morgann C.; Kim, Joonhoon; Lynn, Jed; ...
2018-01-06
Plant biomass, once reduced to its composite sugars, can be converted to fuel substitutes. One means of overcoming the recalcitrance of lignocellulose is pretreatment followed by enzymatic hydrolysis. However, currently available commercial enzyme cocktails are inhibited in the presence of residual pretreatment chemicals. Recent studies have identified a number of cellulolytic enzymes from bacteria that are tolerant to pretreatment chemicals such as ionic liquids. The challenge now is generation of these enzymes in copious amounts, an arena where fungal organisms such as Aspergillus niger have proven efficient. Fungal host strains still need to be engineered to increase production titers of heterologous protein over native enzymes, which has been a difficult task. Here, we developed a forward genetics screen coupled with whole-genome resequencing to identify specific lesions responsible for a protein hyper-production phenotype in A. niger. As a result, this strategy successfully identified novel targets, including a low-affinity glucose transporter, MstC, whose deletion significantly improved secretion of recombinant proteins driven by a glucoamylase promoter.
Comparative Chemistry of Aspergillus oryzae (RIB40) and A. flavus (NRRL 3357)
Rank, Christian; Klejnstrup, Marie Louise; Petersen, Lene Maj; Kildgaard, Sara; Frisvad, Jens Christian; Gotfredsen, Charlotte Held; Larsen, Thomas Ostenfeld
2012-01-01
Aspergillus oryzae and A. flavus are important species in industrial biotechnology and food safety and have been some of the first aspergilli to be fully genome sequenced. Bioinformatic analysis has revealed 99.5% gene homology between the two species pointing towards a large coherence in the secondary metabolite production. In this study we report on the first comparison of secondary metabolite production between the full genome sequenced strains of A. oryzae (RIB40) and A. flavus (NRRL 3357). Surprisingly, the overall chemical profiles of the two strains were mostly very different across 15 growth conditions. Contrary to previous studies we found the aflatrem precursor 13-desoxypaxilline to be a major metabolite from A. oryzae under certain growth conditions. For the first time, we additionally report A. oryzae to produce parasiticolide A and two new analogues hereof, along with four new alkaloids related to the A. flavus metabolites ditryptophenalines and miyakamides. Generally the secondary metabolite capability of A. oryzae presents several novel end products likely to result from the domestication process from A. flavus. PMID:24957367
Gibbons, John G.; Salichos, Leonidas; Slot, Jason C.; Rinker, David C.; McGary, Kriston L.; King, Jonas G.; Klich, Maren A.; Tabb, David L.; McDonald, W. Hayes; Rokas, Antonis
2012-01-01
The domestication of animals, plants and microbes fundamentally transformed the lifestyle and demography of the human species [1]. Although the genetic and functional underpinnings of animal and plant domestication are well understood, little is known about microbe domestication [2–6]. We systematically examined genome-wide sequence and functional variation between the domesticated fungus Aspergillus oryzae, whose saccharification abilities humans have harnessed for thousands of years to produce sake, soy sauce and miso from starch-rich grains, and its wild relative A. flavus, a potentially toxigenic plant and animal pathogen [7]. We discovered dramatic changes in the sequence variation and abundance profiles of genes and wholesale primary and secondary metabolic pathways between domesticated and wild relative isolates during growth on rice. Through selection by humans, our data suggest that an atoxigenic lineage of A. flavus gradually evolved into a “cell factory” for enzymes and metabolites involved in the saccharification process. These results suggest that whereas animal and plant domestication was largely driven by Neolithic “genetic tinkering” of developmental pathways, microbe domestication was driven by extensive remodeling of metabolism. PMID:22795693
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yeh, Hsu-Hua; Chiang, Yi Ming; Entwistle, Ruth
2012-04-10
Genome sequencing of Aspergillus species including A. nidulans has revealed that there are far more secondary metabolite biosynthetic gene clusters than secondary metabolites isolated from these organisms. This implies that these organisms can produce additional secondary metabolites that have not yet been elucidated. The A. nidulans genome contains twelve nonribosomal peptide synthetase (NRPS), one hybrid polyketide synthase/nonribosomal peptide synthetase (PKS/NRPS), and fourteen NRPS-like genes. The only NRPS-like gene in A. nidulans with a known product is tdiA, which is involved in terrequinone A biosynthesis. To attempt to identify the products of these NRPS-like genes, we replaced the native promoters of the NRPS-like genes with the inducible alcohol dehydrogenase (alcA) promoter. Our results demonstrated that induction of the single NRPS-like gene AN3396.4 led to the enhanced production of microperfuranone. Furthermore, heterologous expression of AN3396.4 in A. niger confirmed that only one NRPS-like gene, AN3396.4, is necessary for the production of microperfuranone.
Terbinafine Resistance Mediated by Salicylate 1-Monooxygenase in Aspergillus nidulans
Graminha, Marcia A. S.; Rocha, Eleusa M. F.; Prade, Rolf A.; Martinez-Rossi, Nilce M.
2004-01-01
Resistance to antifungal agents is a recurring and growing problem among patients with systemic fungal infections. UV-induced Aspergillus nidulans mutants resistant to terbinafine have been identified, and we report here the characterization of one such gene. A sib-selected, 6.6-kb genomic DNA fragment that encodes a salicylate 1-monooxygenase (salA) and a fatty acid synthase subunit (fasC) confers terbinafine resistance upon transformation of a sensitive strain. Subfragments carrying salA but not fasC confer terbinafine resistance. salA is present as a single-copy gene on chromosome VI and encodes a protein of 473 amino acids that is homologous to salicylate 1-monooxygenase, a well-characterized naphthalene-degrading enzyme in bacteria. salA transcript accumulation analysis showed terbinafine-dependent induction in the wild type and the UV-induced mutant Terb7, as well as overexpression in a strain containing the salA subgenomic DNA fragment, probably due to the multicopy effect caused by the transformation event. Additional naphthalene degradation enzyme-coding genes are present in fungal genomes, suggesting that resistance could follow degradation of the naphthalene ring contained in terbinafine. PMID:15328121
Zheng, Xiaomei; Zheng, Ping; Zhang, Kun; Cairns, Timothy C; Meyer, Vera; Sun, Jibin; Ma, Yanhe
2018-04-30
The CRISPR/Cas9 system is a revolutionary genome editing tool. However, in eukaryotes, search and optimization of a suitable promoter for guide RNA expression is a significant technical challenge. Here we used the industrially important fungus, Aspergillus niger, to demonstrate that the 5S rRNA gene, which is both highly conserved and efficiently expressed in eukaryotes, can be used as a guide RNA promoter. The gene editing system was established with 100% rates of precision gene modifications among dozens of transformants using short (40-bp) homologous donor DNA. This system was also applicable for generation of designer chromosomes, as evidenced by deletion of a 48 kb gene cluster required for biosynthesis of the mycotoxin fumonisin B1. Moreover, this system also facilitated simultaneous mutagenesis of multiple genes in A. niger. We anticipate that the use of the 5S rRNA gene as guide RNA promoter can broadly be applied for engineering highly efficient eukaryotic CRISPR/Cas9 toolkits. Additionally, the system reported here will enable development of designer chromosomes in model and industrially important fungi.
Recent updates and developments to plant genome size databases
Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel; Bennett, Michael D.
2014-01-01
Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols. PMID:24288377
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sorensen, Anette; Ahring, Birgitte K.; Lubeck, Mette
2012-08-20
A newly discovered fungal species, Aspergillus saccharolyticus, was found to produce a culture broth rich in beta-glucosidase activity. In the present work, the main beta-glucosidase of A. saccharolyticus responsible for the efficient hydrolytic activity was identified, isolated, and characterized. Ion exchange chromatography was used to fractionate the culture broth, yielding fractions with high beta-glucosidase activity and only one visible band on an SDS-PAGE gel. Mass spectrometry analysis of this band gave peptide matches to beta-glucosidases from aspergilli. Through a PCR approach using degenerate primers and genome walking, a 2919 base pair sequence encoding the 860 amino acid BGL1 polypeptide was determined. BGL1 of A. saccharolyticus has 91% and 82% identity with BGL1 from Aspergillus aculeatus and BGL1 from Aspergillus niger, respectively, both belonging to Glycoside hydrolase family 3. Homology modeling studies suggested beta-glucosidase activity with preserved retaining mechanism and a wider catalytic pocket compared to other beta-glucosidases. The bgl1 gene was heterologously expressed in Trichoderma reesei QM6a, purified, and characterized by enzyme kinetics studies. The enzyme can hydrolyze cellobiose, pNPG, and cellodextrins. The enzyme showed good thermostability, was stable at 50°C, and at 60°C it had a half-life of approximately 6 hours.
MIPS: a database for protein sequences and complete genomes.
Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D
1998-01-01
The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information was also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795
The Effects of Signal Erosion and Core Genome Reduction on the Identification of Diagnostic Markers
2016-09-20
diagnostics for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the...diverse B. pseudomallei/mallei strains were sequenced, assembled, and deposited in public databases (Supplemental Table 1); these genomes were...combined with 160 B. pseudomallei/mallei genome assemblies already in public databases. Most of the genomes (n=779) in this study were
Analysis of secreted proteins from Aspergillus flavus.
Medina, Martha L; Haynes, Paul A; Breci, Linda; Francisco, Wilson A
2005-08-01
MS/MS techniques in proteomics make possible the identification of proteins from organisms with little or no genome sequence information available. Peptide sequences are obtained from tandem mass spectra by matching peptide mass and fragmentation information to protein sequence information from related organisms, including unannotated genome sequence data. This peptide identification data can then be grouped and reconstructed into protein data. In this study, we have used this approach to study protein secretion by Aspergillus flavus, a filamentous fungus for which very little genome sequence information is available. A. flavus is capable of degrading the flavonoid rutin (quercetin 3-O-glycoside), as the only source of carbon via an extracellular enzyme system. In this continuing study, a proteomic analysis was used to identify secreted proteins from A. flavus when grown on rutin. The growth media glucose and potato dextrose were used to identify differentially expressed secreted proteins. The secreted proteins were analyzed by 1- and 2-DE and MS/MS. A total of 51 unique A. flavus secreted proteins were identified from the three growth conditions. Ten proteins were unique to rutin-, five to glucose- and one to potato dextrose-grown A. flavus. Sixteen secreted proteins were common to all three media. Fourteen identifications were of hypothetical proteins or proteins of unknown functions. To our knowledge, this is the first extensive proteomic study conducted to identify the secreted proteins from a filamentous fungus.
Nucleotide and amino acid variations of tannase gene from different Aspergillus strains.
Borrego-Terrazas, J A; Lara-Victoriano, F; Flores-Gallegos, A C; Veana, F; Aguilar, C N; Rodríguez-Herrera, R
2014-08-01
Tannase is an enzyme that catalyses the hydrolysis of ester bonds present in tannins. Most of the scientific reports about this biocatalysis focus on aspects related to tannase production and its recovery; on the other hand, reports assessing the molecular aspects of the tannase gene or protein are scarce. In the present study, a tannase gene fragment from several Aspergillus strains isolated from the Mexican semidesert was sequenced and compared with tannase amino acid sequences reported in NCBI database using bioinformatics tools. The genetic relationship among the different tannase sequences was also determined. A conserved region of 7 amino acids was found with the conserved motif GXSXG common to esterases, in which the active-site serine residue is located. In addition, in Aspergillus niger strains GH1 and PSH, we found an extra codon in the tannase sequences encoding glycine. The tannase gene belonging to semidesert fungal strains followed a neutral evolution path with the formation of 10 haplotypes, of which A. niger GH1 and PSH haplotypes are the oldest.
Genomics of Aspergillus oryzae: Learning from the History of Koji Mold and Exploration of Its Future
Machida, Masayuki; Yamada, Osamu; Gomi, Katsuya
2008-01-01
At a time when the notion of microorganisms did not exist, our ancestors empirically established methods for the production of various fermentation foods: miso (bean curd seasoning) and shoyu (soy sauce), both of which have been widely used and are essential for Japanese cooking, and sake, a magical alcoholic drink consumed at a variety of ritual occasions, are typical examples. A filamentous fungus, Aspergillus oryzae, is the key organism in the production of all these traditional foods, and its solid-state cultivation (SSC) has been confirmed to be the secret for the high productivity of secretory hydrolases vital for the fermentation process. Indeed, our genome comparison and transcriptome analysis uncovered mechanisms for effective degradation of raw materials in SSC: the extracellular hydrolase genes that have been found only in the A. oryzae genome but not in A. fumigatus are highly induced during SSC but not in liquid cultivation. Also, the temperature reduction process empirically adopted in the traditional soy-sauce fermentation processes has been found to be important to keep strong expression of the A. oryzae-specific extracellular hydrolases. One of the prominent potentials of A. oryzae is that it has been successfully applied to effective degradation of biodegradable plastic. Both cutinase, responsible for the degradation of plastic, and hydrophobin, which recruits cutinase on the hydrophobic surface to enhance degradation, have been discovered in A. oryzae. Genomic analysis in concert with traditional knowledge and technology will continue to be powerful tools in the future exploration of A. oryzae. PMID:18820080
Kato, Hiroki; Tsunematsu, Yuta; Yamamoto, Tsuyoshi; Namiki, Takuya; Kishimoto, Shinji; Noguchi, Hiroshi; Watanabe, Kenji
2016-07-01
To rapidly identify novel natural products and their associated biosynthetic genes from underutilized and genetically difficult-to-manipulate microbes, we developed a method that uses (1) chemical screening to isolate novel microbial secondary metabolites, (2) bioinformatic analyses to identify a potential biosynthetic gene cluster and (3) heterologous expression of the genes in a convenient host to confirm the identity of the gene cluster and the proposed biosynthetic mechanism. The chemical screen was achieved by searching known natural product databases with data from liquid chromatographic and high-resolution mass spectrometric analyses collected on the extract from a target microbe culture. Using this method, we were able to isolate two new meroterpenes, subglutinols C (1) and D (2), from an entomopathogenic filamentous fungus Metarhizium robertsii ARSEF 23. Bioinformatics analysis of the genome allowed us to identify a gene cluster likely to be responsible for the formation of subglutinols. Heterologous expression of three genes from the gene cluster encoding a polyketide synthase, a prenyltransferase and a geranylgeranyl pyrophosphate synthase in Aspergillus nidulans A1145 afforded an α-pyrone-fused uncyclized diterpene, the expected intermediate of the subglutinol biosynthesis, thereby confirming the gene cluster to be responsible for the subglutinol biosynthesis. These results indicate the usefulness of our methodology in isolating new natural products and identifying their associated biosynthetic gene cluster from microbes that are not amenable to genetic manipulation. Our method should facilitate the natural product discovery efforts by expediting the identification of new secondary metabolites and their associated biosynthetic genes from a wider source of microbes.
Susca, Antonia; Proctor, Robert H; Butchko, Robert A E; Haidukowski, Miriam; Stea, Gaetano; Logrieco, Antonio; Moretti, Antonio
2014-12-01
The ability to produce fumonisin mycotoxins varies among members of the black aspergilli. Previously, analyses of selected genes in the fumonisin biosynthetic gene (fum) cluster in black aspergilli from California grapes indicated that fumonisin-nonproducing isolates of Aspergillus welwitschiae lack six fum genes, but nonproducing isolates of Aspergillus niger do not. In the current study, analyses of black aspergilli from grapes from the Mediterranean Basin indicate that the genomic context of the fum cluster is the same in isolates of A. niger and A. welwitschiae regardless of fumonisin-production ability and that full-length clusters occur in producing isolates of both species and nonproducing isolates of A. niger. In contrast, the cluster has undergone an eight-gene deletion in fumonisin-nonproducing isolates of A. welwitschiae. Phylogenetic analyses suggest each species consists of a mixed population of fumonisin-producing and nonproducing individuals, and that existence of both production phenotypes may provide a selective advantage to these species. Differences in gene content of fum cluster homologues and phylogenetic relationships of fum genes suggest that the mutation(s) responsible for the nonproduction phenotype differs, and therefore arose independently, in the two species. Partial fum cluster homologues were also identified in genome sequences of four other black Aspergillus species. Gene content of these partial clusters and phylogenetic relationships of fum sequences indicate that non-random partial deletion of the cluster has occurred multiple times among the species. This in turn suggests that an intact cluster and fumonisin production were once more widespread among black aspergilli. Copyright © 2014 Elsevier Inc. All rights reserved.
Brassica ASTRA: an integrated database for Brassica genomic research.
Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David
2005-01-01
Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
The Ruby UCSC API: accessing the UCSC genome database using Ruby.
Mishima, Hiroyuki; Aerts, Jan; Katayama, Toshiaki; Bonnal, Raoul J P; Yoshiura, Koh-ichiro
2012-09-21
The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was, however, not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will help biologists query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help are provided via the website at http://rubyucscapi.userecho.com/.
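The "bin index" mentioned in this abstract refers to the hierarchical binning scheme used in UCSC database tables to speed up interval queries. The following is a minimal Python sketch of the standard scheme for coordinates below 512 Mb (0-based, half-open intervals); it is not code from the Ruby UCSC API, and the function names are illustrative.

```python
# Bin offsets for the five levels of the standard scheme, from the
# smallest bins (128 kb) up to the single bin covering 512 Mb.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 = 128 kb, size of the smallest bin
BIN_NEXT_SHIFT = 3    # each level up is 8 times larger


def bin_from_range(start, end):
    """Smallest bin that fully contains the interval [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("interval outside the range of the standard scheme")


def overlapping_bins(start, end):
    """All bins that can hold features overlapping [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    bins = []
    for offset in BIN_OFFSETS:
        bins.extend(range(offset + start_bin, offset + end_bin + 1))
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    return bins


if __name__ == "__main__":
    print(bin_from_range(0, 1))
    print(overlapping_bins(0, 1))
```

A query for an interval then only needs to scan rows whose `bin` value is in `overlapping_bins(start, end)` before applying the exact coordinate comparison, which is why a client library benefits from using the index when the table provides it.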
The Ruby UCSC API: accessing the UCSC genome database using Ruby
2012-01-01
Background The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was, however, not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. Results The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Conclusions Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will help biologists query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help are provided via the website at http://rubyucscapi.userecho.com/. PMID:22994508
Ma, Yazhen; Xu, Ting; Wan, Dongshi; Ma, Tao; Shi, Sheng; Liu, Jianquan; Hu, Quanjun
2015-03-17
Soil salinity is a significant factor that impairs plant growth and agricultural productivity, and numerous efforts are underway to enhance salt tolerance of economically important plants. Populus species are widely cultivated for diverse uses. In particular, they grow in different habitats, from salty soil to mesophytic environments, and are therefore used as a model genus for elucidating physiological and molecular mechanisms of stress tolerance in woody plants. The Salinity Tolerant Poplar Database (STPD) is an integrative database for salt-tolerant poplar genome biology. Currently the STPD contains the Populus euphratica genome and its related genetic resources. P. euphratica, with its preference for salty habitats, has become a valuable genetic resource for the exploitation of tolerance characteristics in trees. This database contains curated data including genomic sequence, genes and gene functional information, non-coding RNA sequences, transposable elements, simple sequence repeats and single nucleotide polymorphisms information of P. euphratica, gene expression data between P. euphratica and Populus tomentosa, and whole-genome alignments between Populus trichocarpa, P. euphratica and Salix suchowensis. The STPD provides useful searching and data mining tools, including GBrowse genome browser, BLAST servers and genome alignments viewer, which can be used to browse genome regions, identify similar sequences and visualize genome alignments. Datasets within the STPD can also be downloaded to perform local searches. A new Salinity Tolerant Poplar Database has been developed to assist studies of salt tolerance in trees and poplar genomics. The database will be continuously updated to incorporate new genome-wide data of related poplar species. This database will serve as an infrastructure for research on the molecular function of genes, comparative genomics, and evolution in closely related species as well as promote advances in molecular breeding within Populus. The STPD can be accessed at http://me.lzu.edu.cn/stpd/.
CyanoBase: the cyanobacteria genome database update 2010.
Nakao, Mitsuteru; Okamoto, Shinobu; Kohara, Mitsuyo; Fujishiro, Tsunakazu; Fujisawa, Takatomo; Sato, Shusei; Tabata, Satoshi; Kaneko, Takakazu; Nakamura, Yasukazu
2010-01-01
CyanoBase (http://genome.kazusa.or.jp/cyanobase) is the genome database for cyanobacteria, which are model organisms for photosynthesis. The database houses cyanobacteria species information, complete genome sequences, genome-scale experiment data, gene information, gene annotations and mutant information. In this version, we updated these datasets and improved the navigation and the visual display of the data views. In addition, a web service API now enables users to retrieve the data in various formats with other tools, seamlessly.
GenomeHubs: simple containerized setup of a custom Ensembl database and web server for any species
Kumar, Sujai; Stevens, Lewis; Blaxter, Mark
2017-01-01
As the generation and use of genomic datasets is becoming increasingly common in all areas of biology, the need for resources to collate, analyse and present data from one or more genome projects is becoming more pressing. The Ensembl platform is a powerful tool to make genome data and cross-species analyses easily accessible through a web interface and a comprehensive application programming interface. Here we introduce GenomeHubs, which provide a containerized environment to facilitate the setup and hosting of custom Ensembl genome browsers. This simplifies mirroring of existing content and import of new genomic data into the Ensembl database schema. GenomeHubs also provide a set of analysis containers to decorate imported genomes with results of standard analyses and functional annotations and support export to flat files, including EMBL format for submission of assemblies and annotations to International Nucleotide Sequence Database Collaboration. Database URL: http://GenomeHubs.org PMID:28605774
Susca, Antonia; Moretti, Antonio; Stea, Gaetano; Villani, Alessandra; Haidukowski, Miriam; Logrieco, Antonio; Munkvold, Gary
2014-10-01
Fumonisin contamination of maize is considered a serious problem in most maize-growing regions of the world, due to the widespread occurrence of these mycotoxins and their association with toxicosis in livestock and humans. Fumonisins are produced primarily by species of Fusarium that are common in maize grain, but also by some species of Aspergillus sect. Nigri, which can also occur on maize kernels as opportunistic pathogens. Understanding the origin of fumonisin contamination in maize is a key component in developing effective management strategies. Although some fungi in Aspergillus sect. Nigri are known to produce fumonisins, little is known about the species which are common in maize and whether they make a measurable contribution to fumonisin contamination of maize grain. In this work, we evaluated populations of Aspergillus sect. Nigri isolated from maize in USA and Italy, focusing on analysis of housekeeping genes, the fum8 gene and in vitro capability of producing fumonisins. DNA sequencing was used to identify Aspergillus strains belonging to sect. Nigri, in order to compare species composition between the two populations, which might influence specific mycotoxicological risks. Combined beta-tubulin/calmodulin sequences were used to genetically characterize 300 strains (199 from Italy and 101 from USA) which grouped into 4 clades: Aspergillus welwitschiae (syn. Aspergillus awamori, 14.7%), Aspergillus tubingensis (37.0%) and Aspergillus niger group 1 (6.7%) and group 2 (41.3%). Only one strain was identified as Aspergillus carbonarius. Species composition differed between the two populations; A. niger predominated among the USA isolates (69%), but comprised a smaller percentage (38%) of Italian isolates. Conversely, A. tubingensis and A. welwitschiae occurred at higher frequencies in the Italian population (42% and 20%, respectively) than in the USA population (27% and 5%). 
The evaluation of FB2 production on CY20S agar revealed 118 FB2-producing and 84 non-producing strains distributed among the clades: A. welwitschiae, A. niger group 1 and A. niger group 2, confirming the potential of Aspergillus sect. Nigri species to contribute to total fumonisin contamination of maize. A higher percentage of A. niger isolates (72.0%) produced FB2 compared to A. welwitschiae (36.6%). The percentage of FB2-producing A. niger strains was similar in the USA and Italian populations; however, the predominance of A. niger in the USA population suggests a higher potential for fumonisin production. Some strains with fum8 present in the genome did not produce FB2 in vitro, confirming the ineffectiveness of fum8 presence as a predictor of FB2 production. Copyright © 2014 Elsevier B.V. All rights reserved.
dBBQs: dataBase of Bacterial Quality scores.
Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David
2017-12-28
It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA and tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, the results of which can be used for further analysis. In addition, the search results are shown in an interactive JavaScript chart framework using DC.js. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data of the four measurements that were combined to make the quality score for each genome can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.
Plant Genome Resources at the National Center for Biotechnology Information
Wheeler, David L.; Smith-White, Brian; Chetvernin, Vyacheslav; Resenchuk, Sergei; Dombrowski, Susan M.; Pechous, Steven W.; Tatusova, Tatiana; Ostell, James
2005-01-01
The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez. A core Entrez database, Entrez Nucleotide, includes GenBank and is tightly linked to the NCBI Taxonomy database, the Entrez Protein database, and the scientific literature in PubMed. A suite of more specialized databases for genomes, genes, gene families, gene expression, gene variation, and protein domains dovetails with the core databases to make Entrez a powerful system for genomic research. Linked to the full range of Entrez databases is the NCBI Map Viewer, which displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes covered by the Map Viewer to be searched in tandem to produce a display of aligned maps from several species. PlantBLAST searches against the sequences shown in the Map Viewer allow BLAST alignments to be viewed within a genomic context. In addition, precomputed sequence similarities, such as those for proteins offered by BLAST Link, enable fluid navigation from unannotated to annotated sequences, quickening the pace of discovery. NCBI Web pages for plants, such as Plant Genome Central, complete the system by providing centralized access to NCBI's genomic resources as well as links to organism-specific Web pages beyond NCBI. PMID:16010002
CyanoBase: the cyanobacteria genome database update 2010
Nakao, Mitsuteru; Okamoto, Shinobu; Kohara, Mitsuyo; Fujishiro, Tsunakazu; Fujisawa, Takatomo; Sato, Shusei; Tabata, Satoshi; Kaneko, Takakazu; Nakamura, Yasukazu
2010-01-01
CyanoBase (http://genome.kazusa.or.jp/cyanobase) is the genome database for cyanobacteria, which are model organisms for photosynthesis. The database houses cyanobacteria species information, complete genome sequences, genome-scale experiment data, gene information, gene annotations and mutant information. In this version, we updated these datasets and improved the navigation and the visual display of the data views. In addition, a web service API now enables users to retrieve the data in various formats with other tools, seamlessly. PMID:19880388
CottonDB: A resource for cotton genome research
USDA-ARS's Scientific Manuscript database
CottonDB (http://cottondb.org/) is a database and web resource for cotton genomic and genetic research. Created in 1995, CottonDB was among the first plant genome databases established by the USDA-ARS. Accessed through a website interface, the database aims to be a convenient, inclusive medium of ...
The Giardia genome project database.
McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L
2000-08-15
The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.
Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem
2008-11-27
The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.
Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu
2015-01-01
The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently in their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. © The Author(s) 2015. Published by Oxford University Press.
Agrobacterium tumefaciens-mediated transformation of Mucor circinelloides.
Nyilasi, I; Acs, K; Papp, T; Nagy, E; Vágvölgyi, C
2005-01-01
The Agrobacterium tumefaciens-mediated transformation of the zygomycetous fungus Mucor circinelloides is described. A method was also developed for the hygromycin B-based selection of Mucor transformants. Transformation with the hygromycin B phosphotransferase gene of Escherichia coli controlled by the heterologous Aspergillus nidulans trpC promoter resulted in hygromycin B-resistant clones. The presence of the hygromycin resistance gene in the genome of the transformants was verified by polymerase chain reaction and Southern hybridization: the latter analyses revealed integrations in the host genome at different sites in different transformants. The stability of transformants remained questionable during the latter analyses.
The COG database: a tool for genome-scale analysis of protein functions and evolution
Tatusov, Roman L.; Galperin, Michael Y.; Natale, Darren A.; Koonin, Eugene V.
2000-01-01
Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56–83% of the gene products from each of the complete bacterial and archaeal genomes and ~35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program that is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes. PMID:10592175
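The "consistency of genome-specific best hits" criterion can be illustrated with a toy example. The sketch below is a simplification and not the COG pipeline itself: it looks for triangles of three genes from three different genomes in which every pair is a symmetric best hit, whereas the published procedure also merges triangles and handles paralogs. The data structure and all gene names are hypothetical.

```python
from itertools import combinations


def best_hit_triangles(best_hits):
    """Find triangles of mutually consistent genome-specific best hits.

    `best_hits` maps (gene, target_genome) to the best-scoring gene in
    that genome; a gene is a (genome, name) tuple. Simplifying
    assumption: only symmetric best hits count as consistent.
    """
    def symmetric(a, b):
        return (best_hits.get((a, b[0])) == b
                and best_hits.get((b, a[0])) == a)

    genes = sorted({gene for gene, _ in best_hits})
    triangles = []
    for a, b, c in combinations(genes, 3):
        if len({a[0], b[0], c[0]}) != 3:
            continue  # the three genes must come from three genomes
        if symmetric(a, b) and symmetric(b, c) and symmetric(a, c):
            triangles.append((a, b, c))
    return triangles


# Hypothetical toy data: genome C has a paralog c2 whose best hits
# point to a1 and b1, but a1 and b1 both prefer c1.
a1, b1, c1, c2 = ("A", "a1"), ("B", "b1"), ("C", "c1"), ("C", "c2")
toy_hits = {
    (a1, "B"): b1, (a1, "C"): c1,
    (b1, "A"): a1, (b1, "C"): c1,
    (c1, "A"): a1, (c1, "B"): b1,
    (c2, "A"): a1, (c2, "B"): b1,
}

if __name__ == "__main__":
    print(best_hit_triangles(toy_hits))
```

In the toy data only a1, b1 and c1 form a triangle; c2 is left out because its best hits are not reciprocated. The cubic enumeration is fine for illustration but would not scale to whole genomes.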
Uday, Uma Shankar Prasad; Majumdar, Ria; Tiwari, Onkar Nath; Mishra, Umesh; Mondal, Abhijit; Bandyopadhyay, Tarun Kanti; Bhunia, Biswanath
2017-12-01
In the present work, a potent xylanase-producing fungal strain, Aspergillus niger (KP874102.1), was isolated through cultural and morphological observations from a soil sample of Baramura forest, Tripura west, India. The 28S rDNA technique was applied for genomic identification of this fungal strain. The isolated strain was found to be phylogenetically closely related to Aspergillus niger. Kinetic constants such as Km and Vmax for extracellular xylanase were determined using various substrates, such as beech wood xylan, oat spelt xylan and CM cellulose, through Lineweaver-Burk plots. Km, Vmax and Kcat for beech wood xylan were found to be 2.89 mg/ml, 2442 U and 426178 U ml mg⁻¹, respectively. The crude enzyme also showed no CM cellulose activity. The relative efficiency of oat spelt xylan was found to be 0.819 with respect to beech wood xylan. After acid hydrolysis, the enzyme was able to produce reducing sugar at 17.7, 35.5, 50.8 and 65% (w/w) from orange peel after 15, 30, 45 and 60 min of incubation with cellulase-free xylanase, and the maximum reducing sugar formation rate was found to be 55.96 μg/ml/min. Therefore, Aspergillus niger (KP874102.1) is considered a potential candidate for enzymatic hydrolysis of orange peel. Copyright © 2017 Elsevier B.V. All rights reserved.
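For readers unfamiliar with the Lineweaver-Burk method, the sketch below shows how Km and Vmax can be recovered from the double-reciprocal line 1/v = (Km/Vmax)(1/[S]) + 1/Vmax. The substrate concentrations are hypothetical and the velocities are synthetic values generated from the Km and Vmax reported above; they are not the study's measurements.

```python
def michaelis_menten(s, km, vmax):
    """Reaction velocity at substrate concentration s."""
    return vmax * s / (km + s)

def lineweaver_burk_fit(substrate, velocity):
    """Least-squares line through (1/[S], 1/v); returns (Km, Vmax)."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept      # y-intercept is 1/Vmax
    km = slope * vmax           # slope is Km/Vmax
    return km, vmax

# Synthetic, noise-free data (assumed concentrations in mg/ml).
KM_REPORTED, VMAX_REPORTED = 2.89, 2442.0
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
velocity = [michaelis_menten(s, KM_REPORTED, VMAX_REPORTED) for s in substrate]

km, vmax = lineweaver_burk_fit(substrate, velocity)
print(f"Km = {km:.2f} mg/ml, Vmax = {vmax:.0f} U")
```

With real, noisy measurements the reciprocal transform weights low-concentration points heavily, so a direct nonlinear fit is often preferred; the linear form is shown because it is the method named in the abstract.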
Mycoviruses in Aspergilli: A Comprehensive Review
Kotta-Loizou, Ioly; Coutts, Robert H. A.
2017-01-01
Fungi, similar to all species, are susceptible to viral infection. Aspergillus is arguably the most well studied fungal genus because of its medical, ecological and economical significance. Mycoviruses were initially detected in Aspergillus species almost 50 years ago and the field continues to be active today with ground-breaking discoveries. The aim of the present review is to cover the scientific progress in all aspects of mycovirology as exemplified by Aspergillus-focused research. Initially, an overview of the population studies illustrating the presence of mycoviruses in numerous important Aspergillus species, such as A. niger, A. flavus, and A. fumigatus, will be presented. Moreover, the intricacies of mycovirus transmission, both inter- and intra-species, will be discussed together with the methodologies used to investigate viral dispersion in a laboratory setting. Subsequently, the genomic features of all molecularly characterized mycoviruses to date will be analyzed in depth. These include members of established viral families, such as Partitiviridae, Chrysoviridae and Totiviridae, but also more recent, novel discoveries that led to the proposal of new viral families, such as Polymycoviridae, Alternaviridae and, in the context of the present review, Exartaviridae. Finally, the major issue of phenotypic effects of mycoviral infection on the host is addressed, including aflatoxin production in A. flavus, together with growth and virulence in A. fumigatus. Although the molecular mechanisms behind these phenomena are yet to be elucidated, recent studies suggest that by implication, RNA silencing may be involved. PMID:28932216
Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.
Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin
2018-04-12
The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database (http://ginsengdb.snu.ac.kr/), the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. Ginseng genome database can be accessed at http://ginsengdb.snu.ac.kr/.
MIPS PlantsDB: a database framework for comparative plant genome research
Nussbaumer, Thomas; Martis, Mihaela M.; Roessner, Stephan K.; Pfeifer, Matthias; Bader, Kai C.; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel
2013-01-01
The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB–plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834–D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building up on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB. PMID:23203886
Alazi, Ebru; Niu, Jing; Kowalczyk, Joanna E.; ...
2016-05-13
We identified the d-galacturonic acid (GA)-responsive transcriptional activator GaaR of the saprotrophic fungus, Aspergillus niger, which was found to be essential for growth on GA and polygalacturonic acid (PGA). Growth of the ΔgaaR strain was reduced on complex pectins. Genome-wide expression analysis showed that GaaR is required for the expression of genes necessary to release GA from PGA and more complex pectins, to transport GA into the cell, and to induce the GA catabolic pathway. Residual growth of ΔgaaR on complex pectins is likely due to the expression of pectinases acting on rhamnogalacturonan and subsequent metabolism of the monosaccharides other than GA.
Ehrlich, Kenneth C.; Mack, Brian M.
2014-01-01
Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity. PMID:24960201
Cloning and characterization of chsD, a chitin synthase-like gene of Aspergillus fumigatus.
Mellado, E; Specht, C A; Robbins, P W; Holden, D W
1996-09-15
A chitin synthase-like gene (chsD) was isolated from an Aspergillus fumigatus genomic DNA library. Comparison with the predicted amino acid sequence from chsD reveals low but significant similarity to chitin synthases and to other N-acetylglucosaminyltransferases (NodC from Rhizobium spp., HasA from Streptococcus spp. and DG42 from vertebrates). A chsD- mutant strain constructed by gene disruption has a 20% reduction in total mycelial chitin content; however, no differences between the wild-type strain and the chsD- strain were found with respect to morphology, chitin synthase activity or virulence in a neutropenic murine model of aspergillosis. The results show that the chsD product has an important but inessential role in the synthesis of chitin in A. fumigatus.
Functional expression of amine oxidase from Aspergillus niger (AO-I) in Saccharomyces cerevisiae.
Kolaríková, Katerina; Galuszka, Petr; Sedlárová, Iva; Sebela, Marek; Frébort, Ivo
2009-01-01
The aim of this work was to prepare recombinant amine oxidase from Aspergillus niger after overexpressing in yeast. The yeast expression vector pDR197 that includes a constitutive PMA1 promoter was used for the expression in Saccharomyces cerevisiae. Recombinant amine oxidase was extracted from the growth medium of the yeast, purified to homogeneity and identified by activity assay and MALDI-TOF peptide mass fingerprinting. Similarity search in the newly published A. niger genome identified six genes coding for copper amine oxidase, two of them corresponding to the previously described enzymes AO-I a methylamine oxidase and three other genes coding for FAD amine oxidases. Thus, A. niger possesses an enormous metabolic gear to grow on amine compounds and thus support its saprophytic lifestyle.
GenColors-based comparative genome databases for small eukaryotic genomes.
Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot
2013-01-01
Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.
MIPS plant genome information resources.
Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X
2007-01-01
The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.
Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing
Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li
2010-01-01
Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome. PMID:20392818
Sequencing the Black Aspergilli species complex
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuo, Alan; Salamov, Asaf; Zhou, Kemin
2011-03-11
The ~15 members of the Aspergillus section Nigri species complex (the "Black Aspergilli") are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as food processing and spoilage agents and agricultural toxigens. Despite their utility and ubiquity, the morphological and metabolic distinctiveness of the complex's members, and thus their taxonomy, is poorly defined. We are using short read pyrosequencing technology (Roche/454 and Illumina/Solexa) to rapidly scale up genomic and transcriptomic analysis of this species complex. To date we predict 11197 genes in Aspergillus niger, 11624 genes in A. carbonarius, and 10845 genes in A. aculeatus. A. aculeatus is our most recent genome, and was assembled primarily from 454-sequenced reads and annotated with the aid of >2 million 454 ESTs and >300 million Solexa ESTs. To most effectively deploy these very large numbers of ESTs we developed 2 novel methods for clustering the ESTs into assemblies. We have also developed a pipeline to propose orthologies and paralogies among genes in the species complex. In the near future we will apply these methods to additional species of Black Aspergilli that are currently in our sequencing pipeline.
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine
Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.
2016-01-01
We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564
The Importance of Biological Databases in Biological Discovery.
Baxevanis, Andreas D; Bateman, Alex
2015-06-19
Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.
CyanoClust: comparative genome resources of cyanobacteria and plastids.
Sasaki, Naobumi V; Sato, Naoki
2010-01-01
Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Proteins (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.
Wiley, Laura K.; Sivley, R. Michael; Bush, William S.
2013-01-01
Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185
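The idea behind the NCList (nested containment list) structure can be sketched in memory as follows. This is not the MyNCList MySQL implementation; it is a minimal Python illustration that assumes half-open intervals [start, end), and the class, function names and toy annotations are hypothetical. Intervals contained in another interval are stored as that interval's children, so every sibling list is sorted by both start and end and can be binary-searched.

```python
class Node:
    def __init__(self, start, end, label):
        self.start, self.end, self.label = start, end, label
        self.children = []  # intervals fully contained in this one

def build_nclist(intervals):
    """Build nested lists from (start, end, label) tuples."""
    top, stack = [], []
    # Sort by start ascending, end descending so containers come first.
    for start, end, label in sorted(intervals, key=lambda iv: (iv[0], -iv[1])):
        node = Node(start, end, label)
        # Pop until the top of the stack contains the new interval.
        while stack and not (stack[-1].start <= start and end <= stack[-1].end):
            stack.pop()
        (stack[-1].children if stack else top).append(node)
        stack.append(node)
    return top

def _first_candidate(nodes, qstart):
    """Binary search for the first node whose end lies beyond qstart."""
    lo, hi = 0, len(nodes)
    while lo < hi:
        mid = (lo + hi) // 2
        if nodes[mid].end <= qstart:
            lo = mid + 1
        else:
            hi = mid
    return lo

def query(nodes, qstart, qend):
    """Return labels of all intervals overlapping [qstart, qend)."""
    hits = []
    i = _first_candidate(nodes, qstart)
    while i < len(nodes) and nodes[i].start < qend:
        hits.append(nodes[i].label)
        hits.extend(query(nodes[i].children, qstart, qend))
        i += 1
    return hits

# Toy annotations (coordinates are made up).
annotations = [
    (100, 500, "geneA"), (100, 200, "exon1"), (300, 500, "exon2"),
    (400, 900, "geneB"), (450, 451, "snp"),
]
nclist = build_nclist(annotations)
print(sorted(query(nclist, 450, 451)))
print(query(nclist, 250, 290))
```

Because children of a non-overlapping interval can never overlap the query, whole subtrees are skipped, which is what makes range retrieval fast compared with scanning every annotation.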
Tripathi, Himanshu; Luqman, Suaib; Meena, Abha; Khan, Feroz
2014-01-01
Despite modern antifungal therapy, the mortality rates of invasive infection with the human fungal pathogen Candida albicans are up to 40%. Studies suggest that drug resistance in the three most common species of human fungal pathogens, viz. C. albicans, Aspergillus fumigatus (causing mortality rates up to 90%) and Cryptococcus neoformans (causing mortality rates up to 70%), is due to mutations in the target enzymes or high expression of drug transporter genes. Drug resistance in human fungal pathogens has led to an imperative need for the identification of new targets unique to fungal pathogens. In the present study, we have used a comparative genomics approach to find potential target proteins unique to C. albicans, an opportunistic fungus responsible for severe infection in immune-compromised humans. Interestingly, many target proteins of existing antifungal agents showed orthologs in human cells. To identify unique proteins, we compared the proteome of C. albicans [SC5314], i.e., 14,633 total proteins retrieved from the RefSeq database of NCBI, USA, with the proteomes of human and the non-pathogenic yeast Saccharomyces cerevisiae. Results showed that 4,568 proteins were unique to C. albicans as compared to those of human; when these unique proteins were then compared with the S. cerevisiae proteome, 2,161 proteins remained unique, and after removing repeats a total of 1,618 unique proteins (42 functionally known, 1,566 hypothetical and 10 unknown) were selected as potential antifungal drug targets unique to C. albicans.
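The filtering steps described above (remove proteins with human counterparts, then those with yeast counterparts, then repeats) amount to successive set subtraction. The sketch below illustrates only that logic on toy data; the protein identifiers, sequences and hit sets are hypothetical, and how a "hit" is decided (for example, by a similarity search threshold) is outside the sketch and not specified here.

```python
def subtract_proteome(candidates, hits):
    """Keep candidate protein IDs that have no counterpart in `hits`."""
    return [pid for pid in candidates if pid not in hits]

def drop_repeats(candidates, sequences):
    """Keep the first protein ID for each distinct sequence."""
    seen, kept = set(), []
    for pid in candidates:
        seq = sequences[pid]
        if seq not in seen:
            seen.add(seq)
            kept.append(pid)
    return kept

# Toy pathogen proteome: ID -> sequence (made-up values).
sequences = {"P1": "MKT", "P2": "MAA", "P3": "MGG", "P4": "MGG", "P5": "MLL"}
human_hits = {"P1"}   # assumed to have a human counterpart
yeast_hits = {"P2"}   # assumed to have an S. cerevisiae counterpart

after_human = subtract_proteome(list(sequences), human_hits)
after_yeast = subtract_proteome(after_human, yeast_hits)
targets = drop_repeats(after_yeast, sequences)
print(len(sequences), len(after_human), len(after_yeast), targets)
```

In the study the corresponding counts shrink from 14,633 to 4,568 to 2,161 to 1,618 proteins across these stages.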
The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide
Liolios, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Kyrpides, Nikos C.
2006-01-01
The Genomes On Line Database (GOLD) is a web resource for comprehensive access to information regarding complete and ongoing genome sequencing projects worldwide. The database currently incorporates information on over 1500 sequencing projects, of which 294 have been completed and the data deposited in the public databases. GOLD v.2 has been expanded to provide information related to organism properties such as phenotype, ecotype and disease. Furthermore, project relevance and availability information is now included. GOLD is available at . It is also mirrored at the Institute of Molecular Biology and Biotechnology, Crete, Greece at PMID:16381880
Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.
2015-01-01
The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402
Moroz, Olga V.; Maranta, Michelle; Shaghasi, Tarana; Harris, Paul V.; Wilson, Keith S.; Davies, Gideon J.
2015-01-01
The enzymatic degradation of plant cell-wall cellulose is central to many industrial processes, including second-generation biofuel production. Key players in this deconstruction are the fungal cellobiohydrolases (CBHs), notably those from family GH7 of the carbohydrate-active enzymes (CAZY) database, which are generally known as CBHI enzymes. Here, three-dimensional structures are reported of the Aspergillus fumigatus CBHI Cel7A solved in uncomplexed and disaccharide-bound forms at resolutions of 1.8 and 1.5 Å, respectively. The product complex with a disaccharide in the +1 and +2 subsites adds to the growing three-dimensional insight into this family of industrially relevant biocatalysts. PMID:25615982
Winsor, Geoffrey L; Van Rossum, Thea; Lo, Raymond; Khaira, Bhavjinder; Whiteside, Matthew D; Hancock, Robert E W; Brinkman, Fiona S L
2009-01-01
Pseudomonas aeruginosa is a well-studied opportunistic pathogen that is particularly known for its intrinsic antimicrobial resistance, diverse metabolic capacity, and its ability to cause life threatening infections in cystic fibrosis patients. The Pseudomonas Genome Database (http://www.pseudomonas.com) was originally developed as a resource for peer-reviewed, continually updated annotation for the Pseudomonas aeruginosa PAO1 reference strain genome. In order to facilitate cross-strain and cross-species genome comparisons with other Pseudomonas species of importance, we have now expanded the database capabilities to include all Pseudomonas species, and have developed or incorporated methods to facilitate high quality comparative genomics. The database contains robust assessment of orthologs, a novel ortholog clustering method, and incorporates five views of the data at the sequence and annotation levels (Gbrowse, Mauve and custom views) to facilitate genome comparisons. A choice of simple and more flexible user-friendly Boolean search features allows researchers to search and compare annotations or sequences within or between genomes. Other features include more accurate protein subcellular localization predictions and a user-friendly, Boolean searchable log file of updates for the reference strain PAO1. This database aims to continue to provide a high quality, annotated genome resource for the research community and is available under an open source license.
Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin
2011-01-01
The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Martin, Tiphaine; Sherman, David J; Durrens, Pascal
2011-01-01
The Génolevures online database (URL: http://www.genolevures.org) stores and provides the data and results obtained by the Génolevures Consortium through several campaigns of genome annotation of the yeasts in the Saccharomycotina subphylum (hemiascomycetes). This database is dedicated to large-scale comparison of these genomes, storing not only the different chromosomal elements detected in the sequences, but also the logical relations between them. The database is divided into a public part, accessible to anyone through the Internet, and a private part where the Consortium members make genome annotations with our Magus annotation system; this system is used to annotate several related genomes in parallel. The public database is widely consulted and offers structured data, organized using a REST web site architecture that allows for automated requests. The implementation of the database, as well as its associated tools and methods, is evolving to cope with the influx of genome sequences produced by Next Generation Sequencing (NGS). Copyright © 2011 Académie des sciences. Published by Elsevier SAS. All rights reserved.
Han, Guomin; Shao, Qian; Li, Cuiping; Zhao, Kai; Jiang, Li; Fan, Jun; Jiang, Haiyang; Tao, Fang
2018-05-01
Aspergillus flavus often invades many important crops and produces harmful aflatoxins both before harvest and during storage. The regulation mechanism of aflatoxin biosynthesis in this fungus has not been well explored, mainly due to the lack of an efficient transformation method for constructing a genome-wide gene mutant library. This challenge was resolved in this study, where a reliable and efficient Agrobacterium tumefaciens-mediated transformation (ATMT) protocol for A. flavus NRRL 3357 was established. The results showed that removal of multinucleate conidia, to collect a homogenous sample of uninucleate conidia for use as the transformation material, is the key step in this procedure. A. tumefaciens strain AGL-1 harboring the ble gene for zeocin resistance under the control of the gpdA promoter from A. nidulans is suitable for genetic transformation of this fungus. We successfully generated A. flavus transformants with an efficiency of ∼60 positive transformants per 10^6 conidia using our protocol. A small-scale insertional mutant library (∼1,000 mutants) was constructed using this method, and several of the resulting mutants lacked both production of conidia and aflatoxin biosynthesis capacity. Southern blotting analysis demonstrated that the majority of the transformants contained a single T-DNA insert in the genome. To the best of our knowledge, this is the first report of genetic transformation of A. flavus via ATMT, and our protocol provides an effective tool for construction of genome-wide gene mutant libraries for functional analysis of important genes in A. flavus.
Yin, Xian; Shin, Hyun-dong; Li, Jianghua; Du, Guocheng; Liu, Long; Chen, Jian
2017-01-01
Despite a long and successful history of citrate production in Aspergillus niger, the molecular mechanism of citrate accumulation is only partially understood. In this study, we used comparative genomics and transcriptome analysis of citrate-producing strains, namely A. niger H915-1 (citrate titer: 157 g L−1), A1 (117 g L−1), and L2 (76 g L−1), to gain a genome-wide view of the mechanism of citrate accumulation. Compared with A. niger A1 and L2, A. niger H915-1 contained 92 mutated genes, including a succinate-semialdehyde dehydrogenase in the γ-aminobutyric acid shunt pathway and an aconitase family protein involved in citrate synthesis. Furthermore, transcriptome analysis of A. niger H915-1 revealed that the transcription levels of 479 genes changed between the cell growth stage (6 h) and the citrate synthesis stage (12 h, 24 h, 36 h, and 48 h). In the glycolysis pathway, triosephosphate isomerase was up-regulated, whereas pyruvate kinase was down-regulated. Two cytosol ATP-citrate lyases, which take part in the cycle of citrate synthesis, were up-regulated, and may coordinate with the alternative oxidases in the alternative respiratory pathway for energy balance. Finally, deletion of the oxaloacetate acetylhydrolase gene in H915-1 eliminated oxalate formation, but neither an effect on the pH decrease nor a difference in citrate production was observed. PMID:28106122
Prieto, R; Yousibova, G L; Woloshuk, C P
1996-01-01
Aspergillus flavus mutant strain 649, which has a genomic DNA deletion of at least 120 kb covering the aflatoxin biosynthesis cluster, was transformed with a series of overlapping cosmids that contained DNA harboring the cluster of genes. The mutant phenotype of strain 649 was rescued by transformation with a combination of cosmid clones 5E6, 8B9, and 13B9, indicating that the cluster of genes involved in aflatoxin biosynthesis resides in the 90 kb of A. flavus genomic DNA carried by these clones. Transformants 5E6 and 20B11 and transformants 5E6 and 8B9 accumulated intermediate metabolites of the aflatoxin pathway, which were identified as averufanin and/or averufin, respectively. These data suggest that avf1, which is involved in the conversion of averufin to versiconal hemiacetal acetate, was present in the cosmid 13B9. Deletion analysis of 13B9 located the gene on a 7-kb DNA fragment of the cosmid. Transformants containing cosmid 8B9 converted exogenously supplied O-methylsterigmatocystin to aflatoxin, indicating that the oxidoreductase gene (ord1), which mediates the conversion of O-methylsterigmatocystin to aflatoxin, is carried by this cosmid. The analysis of transformants containing deletions of 8B9 led to the localization of ord1 on a 3.3-kb A. flavus genomic DNA fragment of the cosmid. PMID:8967772
Braaksma, Machtelt; Martens-Uzunova, Elena S; Punt, Peter J; Schaap, Peter J
2010-10-19
Background The ecological niche occupied by a fungal species, its pathogenicity and its usefulness as a microbial cell factory to a large degree depends on its secretome. Protein secretion usually requires the presence of a N-terminal signal peptide (SP) and by scanning for this feature using available highly accurate SP-prediction tools, the fraction of potentially secreted proteins can be directly predicted. However, prediction of a SP does not guarantee that the protein is actually secreted and current in silico prediction methods suffer from gene-model errors introduced during genome annotation. Results A majority rule based classifier that also evaluates signal peptide predictions from the best homologs of three neighbouring Aspergillus species was developed to create an improved list of potential signal peptide containing proteins encoded by the Aspergillus niger genome. As a complement to these in silico predictions, the secretome associated with growth and upon carbon source depletion was determined using a shotgun proteomics approach. Overall, some 200 proteins with a predicted signal peptide were identified to be secreted proteins. Concordant changes in the secretome state were observed as a response to changes in growth/culture conditions. Additionally, two proteins secreted via a non-classical route operating in A. niger were identified. Conclusions We were able to improve the in silico inventory of A. niger secretory proteins by combining different gene-model predictions from neighbouring Aspergilli and thereby avoiding prediction conflicts associated with inaccurate gene-models. The expected accuracy of signal peptide prediction for proteins that lack homologous sequences in the proteomes of related species is 85%. An experimental validation of the predicted proteome confirmed in silico predictions. PMID:20959013
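As an illustration of the majority-rule idea described in the abstract above, here is a minimal Python sketch. It is not the authors' classifier: it assumes that the A. niger gene model and each of its best homologs contribute one yes/no signal peptide prediction, and that a tie counts as "no signal peptide". The function name and the tie rule are hypothetical choices made only for this example.

```python
def majority_signal_peptide(predictions):
    """Combine boolean signal peptide predictions by majority rule.

    `predictions` holds one True/False vote per gene model (for example
    the A. niger protein plus its best homologs in neighbouring species).
    Assumption for this sketch: a tie is treated as a negative call.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    positive_votes = sum(1 for vote in predictions if vote)
    # Strict majority: more than half of all votes must be positive.
    return positive_votes * 2 > len(predictions)


# One target protein plus three homologs, three of four votes positive.
print(majority_signal_peptide([True, True, False, True]))
```

A vote-based combination of this kind can outvote a single faulty gene model, which is the motivation the abstract gives for consulting homologs.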
Purification and enzymatic characterization of a novel β-1,6-glucosidase from Aspergillus oryzae.
Watanabe, Akira; Suzuki, Moe; Ujiie, Seiryu; Gomi, Katsuya
2016-03-01
In this study, among the 10 genes that encode putative β-glucosidases in the glycoside hydrolase family 3 (GH3) with a signal peptide in the Aspergillus oryzae genome, we found a novel gene (AO090038000425) encoding β-1,6-glucosidase with a substrate specificity for gentiobiose. The transformant harboring AO090038000425, which we named bglH, was overexpressed under the control of the improved glaA gene promoter to form a small clear zone around the colony in a plate assay using 4-methylumbelliferyl β-d-glucopyranoside as the fluorogenic substrate for β-glucosidase. We purified BglH to homogeneity and characterized it enzymatically. The thermal and pH stabilities of BglH were higher than those of other previously studied A. oryzae β-glucosidases, and BglH was stable over a wide temperature range (4°C-60°C). BglH was inhibited by Hg(2+), Zn(2+), glucono-δ-lactone, glucose, dimethyl sulfoxide, and ethanol, but not by ethylenediaminetetraacetic acid. Interestingly, BglH preferentially hydrolyzed gentiobiose rather than other oligosaccharides and aryl β-glucosides, thereby demonstrating that this enzyme is a β-1,6-glucosidase. To the best of our knowledge, this is the first report of the purification and characterization of β-1,6-glucosidase from Aspergillus fungi or from other eukaryotes. This study suggests that it may be possible to find a more suitable β-glucosidase such as BglH for reducing the bitter taste of gentiobiose, and thus for controlling the sweetness of starch hydrolysates in the food industry via genome mining. Copyright © 2015 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
BGD: a database of bat genomes.
Fang, Jianfei; Wang, Xuan; Mu, Shuo; Zhang, Shuyi; Dong, Dong
2015-01-01
Bats account for ~20% of mammalian species, and are the only mammals with true powered flight. Because of their specialized phenotypic traits, much research has been devoted to examining the evolution of bats. Several whole genome sequences of bats have now been assembled and annotated; however, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with the bat genomes accessible to the general biological community, we established a Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data of bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using an efficient search engine, and the portal offers browsable tracks of bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool is provided to facilitate online phylogenetic study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of bat evolution and be advantageous for the analysis of bat sequences. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.
The UCSC Genome Browser database: extensions and updates 2013.
Meyer, Laurence R; Zweig, Ann S; Hinrichs, Angie S; Karolchik, Donna; Kuhn, Robert M; Wong, Matthew; Sloan, Cricket A; Rosenbloom, Kate R; Roe, Greg; Rhead, Brooke; Raney, Brian J; Pohl, Andy; Malladi, Venkat S; Li, Chin H; Lee, Brian T; Learned, Katrina; Kirkup, Vanessa; Hsu, Fan; Heitner, Steve; Harte, Rachel A; Haeussler, Maximilian; Guruvadoo, Luvina; Goldman, Mary; Giardine, Belinda M; Fujita, Pauline A; Dreszer, Timothy R; Diekhans, Mark; Cline, Melissa S; Clawson, Hiram; Barber, Galt P; Haussler, David; Kent, W James
2013-01-01
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation 'tracks' are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.
The Yak genome database: an integrative database for studying yak biology and high-altitude adaption
2012-01-01
Background The yak (Bos grunniens) is a long-haired bovine that lives at high altitudes and is an important source of milk, meat, fiber and fuel. The recent sequencing, assembly and annotation of its genome are expected to further our understanding of the means by which it has adapted to life at high altitudes and its ecologically important traits. Description The Yak Genome Database (YGD) is an internet-based resource that provides access to genomic sequence data and predicted functional information concerning the genes and proteins of Bos grunniens. The curated data stored in the YGD includes genome sequences, predicted genes and associated annotations, non-coding RNA sequences, transposable elements, single nucleotide variants, and three-way whole-genome alignments between human, cattle and yak. YGD offers useful searching and data mining tools, including the ability to search for genes by name or using function keywords as well as GBrowse genome browsers and/or BLAST servers, which can be used to visualize genome regions and identify similar sequences. Sequence data from the YGD can also be downloaded to perform local searches. Conclusions A new yak genome database (YGD) has been developed to facilitate studies on high-altitude adaption and bovine genomics. The database will be continuously updated to incorporate new information such as transcriptome data and population resequencing data. The YGD can be accessed at http://me.lzu.edu.cn/yak. PMID:23134687
Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce
2015-01-01
Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes relatively easily and quickly, at a substantially reduced cost per nucleotide and hence per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, it is challenging to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.
Assembly: a resource for assembled genomes at NCBI
Kitts, Paul A.; Church, Deanna M.; Thibaud-Nissen, Françoise; Choi, Jinna; Hem, Vichet; Sapojnikov, Victor; Smith, Robert G.; Tatusova, Tatiana; Xiang, Charlie; Zherikov, Andrey; DiCuccio, Michael; Murphy, Terence D.; Pruitt, Kim D.; Kimchi, Avi
2016-01-01
The NCBI Assembly database (www.ncbi.nlm.nih.gov/assembly/) provides stable accessioning and data tracking for genome assembly data. The model underlying the database can accommodate a range of assembly structures, including sets of unordered contig or scaffold sequences, bacterial genomes consisting of a single complete chromosome, or complex structures such as a human genome with modeled allelic variation. The database provides an assembly accession and version to unambiguously identify the set of sequences that make up a particular version of an assembly, and tracks changes to updated genome assemblies. The Assembly database reports metadata such as assembly names, simple statistical reports of the assembly (number of contigs and scaffolds, contiguity metrics such as contig N50, total sequence length and total gap length) as well as the assembly update history. The Assembly database also tracks the relationship between an assembly submitted to the International Nucleotide Sequence Database Consortium (INSDC) and the assembly represented in the NCBI RefSeq project. Users can find assemblies of interest by querying the Assembly Resource directly or by browsing available assemblies for a particular organism. Links in the Assembly Resource allow users to easily download sequence and annotations for current versions of genome assemblies from the NCBI genomes FTP site. PMID:26578580
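The contig N50 statistic named in the abstract above can be illustrated with a short Python sketch. It uses the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length); the function name is hypothetical, and the sketch makes no claim about how the Assembly database itself computes the value.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_total:
            return length


# Total length is 290, so half is 145; 80 + 70 = 150 reaches it.
print(contig_n50([80, 70, 50, 40, 30, 20]))  # prints 70
```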
WheatGenome.info: an integrated database and portal for wheat genome information.
Lai, Kaitao; Berkman, Paul J; Lorenc, Michal Tadeusz; Duran, Chris; Smits, Lars; Manoli, Sahana; Stiller, Jiri; Edwards, David
2012-02-01
Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Courteau, J.
1991-10-11
Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development, including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.
HOWDY: an integrated database system for human genome research
Hirakawa, Mika
2002-01-01
HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search. PMID:11752279
Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal
2013-01-01
We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface. Database URL: http://www.nencki-genomics.org. PMID:24089456
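To make the notion of "intersections between regulatory features" in the abstract above concrete, the following Python sketch intersects two lists of genomic intervals. It is an illustration only and does not reproduce the database's stored procedures: the `Feature` type, the function name, the example coordinates and the 1-based inclusive coordinate convention are all assumptions made for this example, and the quadratic loop favours clarity over the efficiency the abstract mentions.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A named genomic interval (hypothetical type for this sketch)."""
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed inclusive
    name: str


def intersect_features(left, right):
    """Return (left name, right name, chrom, start, end) for each overlap."""
    overlaps = []
    for a in left:
        for b in right:
            if a.chrom != b.chrom:
                continue
            start = max(a.start, b.start)
            end = min(a.end, b.end)
            if start <= end:  # the two intervals share at least one base
                overlaps.append((a.name, b.name, a.chrom, start, end))
    return overlaps


enhancers = [Feature("chr1", 100, 200, "enh1")]
motifs = [
    Feature("chr1", 150, 250, "motifA"),
    Feature("chr2", 150, 250, "motifB"),
    Feature("chr1", 201, 300, "motifC"),
]
print(intersect_features(enhancers, motifs))
```

Only `motifA` overlaps `enh1` here: `motifB` lies on another chromosome and `motifC` starts one base after `enh1` ends.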
Specialized microbial databases for inductive exploration of microbial genome sequences
Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine
2005-01-01
Background The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated with related organisms for comparison. PMID:15698474
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.
Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi
2018-01-01
We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.
Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E
2016-01-04
We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Shinohara, Yasutomo; Kawatani, Makoto; Futamura, Yushi; Osada, Hiroyuki; Koyama, Yasuji
2016-01-01
The filamentous fungus Aspergillus oryzae is an important industrial mold. Recent genomic analysis indicated that A. oryzae has a large number of biosynthetic genes for secondary metabolites (SMs), but many of the SMs they produce have not been identified. For better understanding of SMs production by A. oryzae, we screened a gene-disruption library of transcription factors including chromatin-remodeling factors and found two gene disruptions that show similarly altered SM production profiles. One is a homolog of Aspergillus nidulans cclA, a component of the histone 3 lysine 4 (H3K4) methyltransferase complex of proteins associated with Set1 complex, and the other, sppA, is an ortholog of Saccharomyces cerevisiae SPP1, another component of a complex of proteins associated with Set1 complex. The cclA and sppA disruptions in A. oryzae are deficient in trimethylation of H3K4. Furthermore, one of the SMs that increased in the cclA disruptant was identified as astellolide F (14-deacetyl astellolide B). These data indicate that both cclA and sppA affect production of SMs including astellolides by affecting the methylation status of H3K4 in A. oryzae.
Tong, Xunliang; Xu, Hongtao; Zou, Lihui; Cai, Meng; Xu, Xuefeng; Zhao, Zuotao; Xiao, Fei; Li, Yanming
2017-01-01
Invasive fungal infections acquired in the hospital have progressively emerged as an important cause of life-threatening infection. In particular, airborne fungi in hospitals are considered critical pathogens of hospital-associated infections. To identify the causative airborne microorganisms, high-volume air samplers were utilized for collection, and species identification was performed using a culture-based method and DNA sequencing analysis with the Illumina MiSeq and HiSeq 2000 sequencing systems. Few bacteria were grown after cultivation in blood agar. However, using microbiome sequencing, the relative abundance of fungi, Archaea species, bacteria and viruses was determined. The distribution characteristics of fungi were investigated using heat map analysis of four departments, including the Respiratory Intensive Care Unit, Intensive Care Unit, Emergency Room and Outpatient Department. The prevalence of Aspergillus among fungi was the highest at the species level, approximately 17% to 61%, and the prevalence of Aspergillus fumigatus among Aspergillus species was from 34% to 50% in the four departments. Draft genomes of microorganisms isolated from the hospital environment were obtained by sequence analysis, indicating that investigation into the diversity of airborne fungi may provide reliable results for hospital infection control and surveillance. PMID:28045065
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.
Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh
2016-01-01
Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from the National Center for Biotechnology Information (NCBI) and annotated using the RAST server, and the results were then stored in a MySQL database. The protein-coding genes were further analyzed to obtain information such as GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house bioinformatics tools implemented in NeisseriaBase were developed using the Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes.
NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my.
PMID:27017950
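The abstract above mentions computing GC content (%) for each protein-coding gene with in-house Perl scripts, which are not shown. As a rough illustration only, the calculation could look like the following Python sketch; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions, not details taken from NeisseriaBase.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of the unambiguous bases (A, C, G, T).

    Hypothetical helper for illustration; ambiguous symbols such as N are
    excluded from the denominator (an assumption, not a documented choice).
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no unambiguous bases, so report 0 rather than divide by zero
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Toy coding sequence, not a real Neisseria gene.
    print(gc_content("ATGGCGCCGTAA"))
```

Excluding ambiguous bases keeps the percentage comparable between complete and draft genomes, where runs of N are more common.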
Genome-scale analysis of the high-efficient protein secretion system of Aspergillus oryzae
2014-01-01
Background: The koji mold Aspergillus oryzae is widely used for the production of industrial enzymes due to its particularly high protein secretion capacity and ability to perform post-translational modifications. However, systemic analysis of its secretion system is lacking, generally due to the poorly annotated proteome. Results: Here we defined a functional protein secretory component list of A. oryzae using a previously reported secretory model of S. cerevisiae as a scaffold. Additional secretory components were obtained by BLAST search with the functional components reported in other closely related fungal species such as Aspergillus nidulans and Aspergillus niger. To evaluate the defined component list, we performed transcriptome analysis on three α-amylase over-producing strains with varying levels of secretion capacity. Specifically, secretory components involved in the ER-associated processes (including components involved in the regulation of transport between ER and Golgi) were significantly up-regulated, and many of them had not previously been identified in A. oryzae. Furthermore, we defined a complete list of the putative A. oryzae secretome and monitored how it was affected by overproducing amylase. Conclusion: In combination with the transcriptome data, the most complete secretory component list and the putative secretome, we improved the systemic understanding of the secretory machinery of A. oryzae in response to high levels of protein secretion. The roles of many newly predicted secretory components were experimentally validated and the enriched component list provides a better platform for driving more mechanistic studies of the protein secretory pathway in this industrially important fungus. PMID:24961398
Genome-scale analysis of the high-efficient protein secretion system of Aspergillus oryzae.
Liu, Lifang; Feizi, Amir; Österlund, Tobias; Hjort, Carsten; Nielsen, Jens
2014-06-24
The koji mold Aspergillus oryzae is widely used for the production of industrial enzymes due to its particularly high protein secretion capacity and ability to perform post-translational modifications. However, systemic analysis of its secretion system is lacking, generally due to the poorly annotated proteome. Here we defined a functional protein secretory component list of A. oryzae using a previously reported secretory model of S. cerevisiae as a scaffold. Additional secretory components were obtained by BLAST search with the functional components reported in other closely related fungal species such as Aspergillus nidulans and Aspergillus niger. To evaluate the defined component list, we performed transcriptome analysis on three α-amylase over-producing strains with varying levels of secretion capacity. Specifically, secretory components involved in the ER-associated processes (including components involved in the regulation of transport between ER and Golgi) were significantly up-regulated, and many of them had not previously been identified in A. oryzae. Furthermore, we defined a complete list of the putative A. oryzae secretome and monitored how it was affected by overproducing amylase. In combination with the transcriptome data, the most complete secretory component list and the putative secretome, we improved the systemic understanding of the secretory machinery of A. oryzae in response to high levels of protein secretion. The roles of many newly predicted secretory components were experimentally validated and the enriched component list provides a better platform for driving more mechanistic studies of the protein secretory pathway in this industrially important fungus.
CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data.
Hallin, Peter F; Ussery, David W
2004-12-12
Currently, new bacterial genomes are being published on a monthly basis. With the growing amount of genome sequence data, there is a demand for a flexible and easy-to-maintain structure for storing sequence data and results from bioinformatic analysis. More than 150 sequenced bacterial genomes are now available, and comparisons of properties for taxonomically similar organisms are not readily available to many biologists. In addition to the most basic information, such as AT content, chromosome length, tRNA count and rRNA count, a large number of more complex calculations are needed to perform detailed comparative genomics. DNA structural calculations like curvature and stacking energy, and DNA compositions like base skews, oligo skews and repeats at the local and global level, are just a few of the analyses that are presented on the CBS Genome Atlas Web page. Complex analyses, changing methods and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these results amount to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present dynamic web content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues. A web-based user interface which is dynamically linked to the Genome Atlas Database can be accessed via www.cbs.dtu.dk/services/GenomeAtlas/.
This paper has a supplemental information page which links to the examples presented: www.cbs.dtu.dk/services/GenomeAtlas/suppl/bioinfdatabase.
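Two of the simpler quantities named in the abstract above, AT content and base skew, can be illustrated with a short sketch. This is not the CBS Perl package; the function names, the non-overlapping window scheme and the skew definition (G - C) / (G + C) are assumptions chosen for illustration.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T among all bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def gc_skew(seq: str, window: int = 1000, step: int = 1000):
    """Return (start, skew) pairs, with skew = (G - C) / (G + C) per window.

    Windows with no G or C are assigned a skew of 0.0 (an assumption).
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        result.append((start, skew))
    return result


if __name__ == "__main__":
    toy = "GGGC" + "CCCG"  # toy sequence, two windows of four bases
    print(at_content("AATTGC"))
    print(gc_skew(toy, window=4, step=4))
```

Values like these are what a results table in such a database would store per genome or per window; the real atlas also includes structural measures (curvature, stacking energy) that need dedicated models and are not sketched here.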
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reddy, Tatiparthi B. K.; Thomas, Alex D.; Stamatis, Dimitri
The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface support several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.
Martin, Stanton L; Blackmon, Barbara P; Rajagopalan, Ravi; Houfek, Thomas D; Sceeles, Robert G; Denn, Sheila O; Mitchell, Thomas K; Brown, Douglas E; Wing, Rod A; Dean, Ralph A
2002-01-01
We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. The database also gives researchers mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.
Shao, Wei; Shan, Jigui; Kearney, Mary F; Wu, Xiaolin; Maldarelli, Frank; Mellors, John W; Luke, Brian; Coffin, John M; Hughes, Stephen H
2016-07-04
The NCI Retrovirus Integration Database is a MySQL-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to the UCSC genome browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. The database is available at https://rid.ncifcrf.gov or http://home.ncifcrf.gov/hivdrp/resources.htm.
Phylogenomic and Domain Analysis of Iterative Polyketide Synthases in Aspergillus Species
Lin, Shu-Hsi; Yoshimoto, Miwa; Lyu, Ping-Chiang; Tang, Chuan-Yi; Arita, Masanori
2012-01-01
Aspergillus species are industrially and agriculturally important as fermentors and as producers of various secondary metabolites. Among them, fungal polyketides such as lovastatin and melanin are considered a gold mine for bioactive compounds. We used a phylogenomic approach to investigate the distribution of iterative polyketide synthases (PKS) in eight sequenced Aspergilli and classified over 250 fungal genes. Their genealogy by the conserved ketosynthase (KS) domain revealed three large groups of nonreducing PKS, one group inside bacterial PKS, and more than 9 small groups of reducing PKS. Polyphyly of nonribosomal peptide synthase (NRPS)-PKS genes raised questions regarding the recruitment of the elegant conjugation machinery. High rates of gene duplication and divergence were frequent. All data are accessible through our web database at http://metabolomics.jp/wiki/Category:PK. PMID:22844193
Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L
2016-01-04
The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu
2015-01-01
The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
USDA-ARS's Scientific Manuscript database
The Cool Season Food Legume Genome database (CSFL, www.coolseasonfoodlegume.org) is an online resource for genomics, genetics, and breeding research for chickpea, lentil, pea, and faba bean. The user-friendly and curated website allows for all publicly available map, marker, trait, gene, transcript, ger...
Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard
2012-12-01
Simple sequence repeat (SSR) markers were developed from the Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. A genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using the unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75 % of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, neither the latter PCA nor the UPGMA comparison showed any direct association with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation with visible morphological features such as sclerotial types. The isolates from the Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.
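The abstract above builds dendrograms from a genetic distance matrix with UPGMA. The study's software and distance measure are not stated, so the following is only a minimal pure-Python sketch of the textbook UPGMA procedure; the function name `upgma`, the input layout and the toy distances are assumptions for illustration.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Cluster items by UPGMA; return merges as (left, right, height) tuples.

    labels: list of names; matrix: symmetric list-of-lists of distances.
    Height is half the distance between the two clusters being joined.
    """
    clusters = {i: (labels[i], 1) for i in range(len(labels))}  # id -> (tree, size)
    dist = {frozenset(p): matrix[p[0]][p[1]]
            for p in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    merges = []
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of current clusters
        a, b = sorted(pair)
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merges.append((tree_a, tree_b, dist[pair] / 2))
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted mean of the two old distances (the UPGMA update rule).
        updated = {
            c: (n_a * dist[frozenset((a, c))] + n_b * dist[frozenset((b, c))])
               / (n_a + n_b)
            for c in clusters
        }
        dist = {p: d for p, d in dist.items() if a not in p and b not in p}
        for c, d in updated.items():
            dist[frozenset((next_id, c))] = d
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    return merges


if __name__ == "__main__":
    # Toy distances between three hypothetical isolates.
    names = ["iso1", "iso2", "iso3"]
    d = [[0, 2, 6],
         [2, 0, 6],
         [6, 6, 0]]
    for left, right, height in upgma(names, d):
        print(left, right, height)
```

In practice a library routine such as average-linkage hierarchical clustering would normally be used; the explicit loop is shown here only to make the update rule visible.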
Ogishima, Soichi; Takai, Takako; Shimokawa, Kazuro; Nagaie, Satoshi; Tanaka, Hiroshi; Nakaya, Jun
2015-01-01
The Tohoku Medical Megabank project is a national project to revitalize the area of the Tohoku region affected by the Great East Japan Earthquake, and it has conducted a large-scale prospective genome-cohort study. Along with the prospective genome-cohort study, we have developed an integrated database and knowledge base that will be a key database for realizing personalized prevention and medicine.
Swetha, Rayapadi G; Kala Sekar, Dinesh Kumar; Ramaiah, Sudha; Anbarasu, Anand; Sekar, Kanagaraj
2014-12-01
Haemophilus influenzae (H. influenzae) is the causative agent of pneumonia, bacteraemia and meningitis. The organism is responsible for a large number of deaths in both developed and developing countries. Even though the first bacterial genome to be sequenced was that of H. influenzae, there is no exclusive database dedicated to H. influenzae. This prompted us to develop the Haemophilus influenzae Genome Database (HIGDB). All data of HIGDB are stored and managed in a MySQL database. The HIGDB is hosted on a Solaris server and developed using PERL modules. Ajax and JavaScript are used for the interface development. The HIGDB contains detailed information on 42,741 proteins, 18,077 genes including 10 whole genome sequences and also 284 three-dimensional structures of proteins of H. influenzae. In addition, the database provides "Motif search" and "GBrowse". The HIGDB is freely accessible through the URL: http://bioserver1.physics.iisc.ernet.in/HIGDB/. The HIGDB will be a single point of access for bacteriological, clinical, genomic and proteomic information of H. influenzae. The database can also be used to identify DNA motifs within H. influenzae genomes and to compare gene or protein sequences of a particular strain with other strains of H. influenzae. Copyright © 2014 Elsevier Ltd. All rights reserved.
MaizeGDB, the maize model organism database
USDA-ARS's Scientific Manuscript database
MaizeGDB is the maize research community's database for maize genetic and genomic information. In this seminar I will outline our current endeavors including a full website redesign, the status of maize genome assembly and annotation projects, and work toward genome functional annotation. Mechanis...
Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome
Kim, Woonsu; Park, Hyesun; Seo, Seongwon
2016-01-01
The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline for amalgamating an organism's genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle.
PMID:26992093
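The gene counts reported in the abstract above are internally consistent (19,123 + 8,410 + 5,493 + 266 = 33,292). The sketch below checks that sum and illustrates, under simplifying assumptions, how two annotation sources could be split into consensus and source-specific genes with set operations. The function `amalgamate` and the toy identifiers are hypothetical; the published pipeline presumably reconciles genes through cross-references rather than identical IDs.

```python
def amalgamate(ncbi_ids: set, ensembl_ids: set) -> dict:
    """Partition gene IDs into consensus and source-specific groups.

    Assumes both sources already share a common identifier space, which is a
    simplification made only for this illustration.
    """
    return {
        "consensus": ncbi_ids & ensembl_ids,
        "ncbi_only": ncbi_ids - ensembl_ids,
        "ensembl_only": ensembl_ids - ncbi_ids,
    }


# Counts as reported in the abstract.
REPORTED = {
    "consensus": 19_123,
    "ncbi_only": 8_410,
    "ensembl_only": 5_493,
    "ncbi_scaffold": 266,
}
REPORTED_TOTAL = 33_292

if __name__ == "__main__":
    # The four categories should add up to the reported total.
    print(sum(REPORTED.values()) == REPORTED_TOTAL)

    # Toy identifiers, not real cattle genes.
    groups = amalgamate({"g1", "g2", "g3"}, {"g2", "g3", "g4"})
    print({name: sorted(ids) for name, ids in groups.items()})
```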
Random Mutagenesis of the Aspergillus oryzae Genome Results in Fungal Antibacterial Activity
Leonard, Cory A.; Brown, Stacy D.; Hayman, J. Russell
2013-01-01
Multidrug-resistant bacteria cause severe infections in hospitals and communities. Development of new drugs to combat resistant microorganisms is needed. Natural products of microbial origin are the source of most currently available antibiotics. We hypothesized that random mutagenesis of Aspergillus oryzae would result in secretion of antibacterial compounds. To address this hypothesis, we developed a screen to identify individual A. oryzae mutants that inhibit the growth of Methicillin-resistant Staphylococcus aureus (MRSA) in vitro. To randomly generate A. oryzae mutant strains, spores were treated with ethyl methanesulfonate (EMS). Over 3000 EMS-treated A. oryzae cultures were tested in the screen, and one isolate, CAL220, exhibited altered morphology and antibacterial activity. Culture supernatant from this isolate showed antibacterial activity against Methicillin-sensitive Staphylococcus aureus, MRSA, and Pseudomonas aeruginosa, but not Klebsiella pneumoniae or Proteus vulgaris. The results of this study support our hypothesis and suggest that the screen used is sufficient and appropriate to detect secreted antibacterial fungal compounds resulting from mutagenesis of A. oryzae. Because the genome of A. oryzae has been sequenced and systems are available for genetic transformation of this organism, targeted as well as random mutations may be introduced to facilitate the discovery of novel antibacterial compounds using this system. PMID:23983696
Lightly, Tasia Joy; Phung, Ryan R; Sorensen, John L; Cardona, Silvia T
2017-05-01
Phenylacetic acid (PAA), an intermediate of phenylalanine degradation, is emerging as a signal molecule in microbial interactions with the host. In this work, we explore the presence of phenylalanine and PAA catabolism in 3 microbial pathogens of the cystic fibrosis (CF) lung microbiome: Pseudomonas aeruginosa, Burkholderia cenocepacia, and Aspergillus fumigatus. While in silico analysis of B. cenocepacia J2315 and A. fumigatus Af293 genome sequences showed complete pathways from phenylalanine to PAA, the P. aeruginosa PAO1 genome lacked several coding genes for phenylalanine and PAA catabolic enzymes. High-performance liquid chromatography analysis of supernatants from B. cenocepacia K56-2 detected PAA when grown in Luria-Bertani medium but not in synthetic cystic fibrosis sputum medium (SCFM). However, we were unable to identify PAA production by A. fumigatus or P. aeruginosa in any of the conditions tested. The inhibitory effect of B. cenocepacia on A. fumigatus growth was evaluated using agar plate interaction assays. Inhibition of fungal growth by B. cenocepacia was lessened in SCFM but this effect was not dependent on bacterial production of PAA. In summary, while we demonstrated PAA production by B. cenocepacia, we were not able to link this metabolite with the B. cenocepacia - A. fumigatus microbial interaction in CF nutritional conditions.
Comparative and functional characterization of intragenic tandem repeats in 10 Aspergillus genomes.
Gibbons, John G; Rokas, Antonis
2009-03-01
Intragenic tandem repeats (ITRs) are consecutive repeats of three or more nucleotides found in coding regions. ITRs are the underlying cause of several human genetic diseases and have been associated with phenotypic variation, including pathogenesis, in several clades of the tree of life. We have examined the evolution and functional role of ITRs in 10 genomes spanning the fungal genus Aspergillus, a clade of relevance to medicine, agriculture, and industry. We identified several hundred ITRs in each of the species examined. ITR content varied extensively between species, with an average 79% of ITRs unique to a given species. For the fraction of conserved ITR regions, sequence comparisons within species and between close relatives revealed that they were highly variable. ITR-containing proteins were evolutionarily less conserved, compositionally distinct, and overrepresented for domains associated with cell-surface localization and function relative to the rest of the proteome. Furthermore, ITRs were preferentially found in proteins involved in transcription, cellular communication, and cell-type differentiation but were underrepresented in proteins involved in metabolism and energy. Importantly, although ITRs were evolutionarily labile, their functional associations appeared to be remarkably conserved across eukaryotes. Fungal ITRs likely participate in a variety of developmental processes and cell-surface-associated functions, suggesting that their contribution to fungal lifestyle and evolution may be more general than previously assumed.
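The abstract above defines ITRs as consecutive repeats of a unit of three or more nucleotides within coding regions. The detection method used in the study is not described here, so the following is only a naive regular-expression sketch of that definition; the function name `find_tandem_repeats`, the `min_copies` parameter and the toy sequence are assumptions, and no score or purity thresholds are applied.

```python
import re

# A unit of at least three bases (shortest unit preferred), followed by one
# or more exact copies of itself.
ITR_PATTERN = re.compile(r"([ACGT]{3,}?)\1+")


def find_tandem_repeats(cds: str, min_copies: int = 2):
    """Return (start, unit, copies) for each perfect tandem repeat found.

    Only exact repeats are reported; imperfect repeats, which dedicated
    tandem-repeat finders can detect, are missed by this simple pattern.
    """
    hits = []
    for match in ITR_PATTERN.finditer(cds.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        if copies >= min_copies:
            hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # Toy coding fragment with four consecutive CAG units.
    toy = "AAC" + "CAG" * 4 + "TTA"
    print(find_tandem_repeats(toy))
```

Note that homopolymer runs of six or more bases also match this pattern (for example as two copies of "AAA"), so a real analysis would likely filter or treat them separately.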
Metabolic pathway reconstruction of eugenol to vanillin bioconversion in Aspergillus niger
Srivastava, Suchita; Luqman, Suaib; Khan, Feroz; Chanotiya, Chandan S; Darokar, Mahendra P
2010-01-01
Identification of missing genes or proteins participating in metabolic pathways as enzymes is of great interest. One such class of pathway is involved in the eugenol to vanillin bioconversion. Our goal is to develop an integral approach for identifying the topology of a reference or known pathway in another organism. We successfully identify the missing enzymes and then reconstruct the vanillin biosynthetic pathway in Aspergillus niger. The procedure combines enzyme sequence similarity searches through BLAST with ortholog detection through the COG and KEGG databases. Conservation of protein domains and motifs was searched through the CDD, PFAM and PROSITE databases. Predictions regarding how proteins act in the pathway were validated experimentally and also compared with reported data. The bioconversion of vanillin was screened on UV-TLC plates and later confirmed through GC and GC-MS techniques. We applied a procedure for identifying missing enzymes on the basis of conserved functional motifs and then reconstructed the metabolic pathway in the target organism. Using the vanillin biosynthetic pathway of Pseudomonas fluorescens as a case study, we indicate how this approach can be used to reconstruct the reference pathway in A. niger; the results were later validated experimentally through chromatography and spectroscopy techniques. PMID:20978605
Human Mitochondrial Protein Database
National Institute of Standards and Technology Data Gateway
SRD 131 Human Mitochondrial Protein Database (Web, free access) The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool not only to aid in studying the mitochondrion but in studying the associated diseases.
Design and implementation of the cacao genome database
USDA-ARS's Scientific Manuscript database
The Cacao Genome Database (CGD, www.cacaogenomedb.org) is being developed to provide a comprehensive data mining resource of genomic, genetic and breeding data for Theobroma cacao. Designed using Chado and a collection of Drupal modules, known as Tripal, CGD currently contains the genetically anchor...
Uniform standards for genome databases in forest and fruit trees
USDA-ARS's Scientific Manuscript database
TreeGenes and tfGDR serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for the submission and recovery of this information in order to enable comparative genomics research. Large-scale genotype...
SoyBase, The USDA-ARS Soybean Genetics and Genomics Database
USDA-ARS's Scientific Manuscript database
SoyBase, the USDA-ARS soybean genetic database, is a comprehensive repository for professionally curated genetics, genomics and related data resources for soybean. SoyBase contains the most current genetic, physical and genomic sequence maps integrated with qualitative and quantitative traits. The...
Genome-wide association as a means to understanding the mammary gland
USDA-ARS's Scientific Manuscript database
Next-generation sequencing and related technologies have facilitated the creation of enormous public databases that catalogue genomic variation. These databases have facilitated a variety of approaches to discover new genes that regulate normal biology as well as disease. Genome wide association (...
Meyer, Vera; Wanka, Franziska; van Gent, Janneke; Arentshorst, Mark; van den Hondel, Cees A. M. J. J.; Ram, Arthur F. J.
2011-01-01
Filamentous fungi are the cause of serious human and plant diseases but are also exploited in biotechnology as production platforms. Comparative genomics has documented their genetic diversity, and functional genomics and systems biology approaches are under way to understand the functions and interaction of fungal genes and proteins. In these approaches, gene functions are usually inferred from deletion or overexpression mutants. However, studies at these extreme points give only limited information. Moreover, many overexpression studies use metabolism-dependent promoters, often causing pleiotropic effects and thus limitations in their significance. We therefore established and systematically evaluated a tunable expression system for Aspergillus niger that is independent of carbon and nitrogen metabolism and silent under noninduced conditions. The system consists of two expression modules jointly targeted to a defined genomic locus. One module ensures constitutive expression of the tetracycline-dependent transactivator rtTA2S-M2, and one module harbors the rtTA2S-M2-dependent promoter that controls expression of the gene of interest (the Tet-on system). We show here that the system is tight, responds within minutes after inducer addition, and allows fine-tuning based on the inducer concentration or gene copy number up to expression levels higher than the expression levels of the gpdA promoter. We also validate the Tet-on system for the generation of conditional overexpression mutants and demonstrate its power when combined with a gene deletion approach. Finally, we show that the system is especially suitable when the functions of essential genes must be examined. PMID:21378046
Gibbons, John G.; Beauvais, Anne; Beau, Remi; McGary, Kriston L.
2012-01-01
Aspergillus fumigatus is the most common and deadly pulmonary fungal infection worldwide. In the lung, the fungus usually forms a dense colony of filaments embedded in a polymeric extracellular matrix. To identify candidate genes involved in this biofilm (BF) growth, we used RNA-Seq to compare the transcriptomes of BF and liquid plankton (PL) growth. Sequencing and mapping of tens of millions of sequence reads against the A. fumigatus transcriptome identified 3,728 differentially regulated genes in the two conditions. Although many of these genes, including the ones coding for transcription factors, stress response, the ribosome, and the translation machinery, likely reflect the different growth demands in the two conditions, our experiment also identified hundreds of candidate genes for the observed differences in morphology and pathobiology between BF and PL. We found an overrepresentation of upregulated genes in transport, secondary metabolism, and cell wall and surface functions. Furthermore, upregulated genes showed significant spatial structure across the A. fumigatus genome; they were more likely to occur in subtelomeric regions and colocalized in 27 genomic neighborhoods, many of which overlapped with known or candidate secondary metabolism gene clusters. We also identified 1,164 genes that were downregulated. This gene set was not spatially structured across the genome and was overrepresented in genes participating in primary metabolic functions, including carbon and amino acid metabolism. These results add valuable insight into the genetics of biofilm formation in A. fumigatus and other filamentous fungi and identify many candidate genes, relevant in the context of biofilm biology, for downstream functional experiments. PMID:21724936
An ergot alkaloid biosynthesis gene and clustered hypothetical genes from Aspergillus fumigatus.
Coyle, Christine M; Panaccione, Daniel G
2005-06-01
The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin.
An Ergot Alkaloid Biosynthesis Gene and Clustered Hypothetical Genes from Aspergillus fumigatus†
Coyle, Christine M.; Panaccione, Daniel G.
2005-01-01
The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin. PMID:15933009
MPD: a pathogen genome and metagenome database
Zhang, Tingting; Miao, Jiaojiao; Han, Na; Qiang, Yujun; Zhang, Wen
2018-01-01
Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, which has been accompanied by a challenge for the storage and management of such huge datasets. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which provides access to users for searching, downloading, storing and sharing bacterial genomics data. The MPD represents the first pathogenic database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among different organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of big sequencing data. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacteria genomes and metagenomes. Database URL: http://data.mypathogen.org PMID:29917040
MIPS: a database for protein sequences, homology data and yeast genome information.
Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F
1997-01-01
The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
MIPS: analysis and annotation of proteins from whole genomes in 2005
Mewes, H. W.; Frishman, D.; Mayer, K. F. X.; Münsterkötter, M.; Noubibou, O.; Pagel, P.; Rattei, T.; Oesterheld, M.; Ruepp, A.; Stümpflen, V.
2006-01-01
The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system are maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information enabling the representation of complex information such as functional modules or networks Genome Research Environment System, (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix and the CABiNet network analysis framework and (iii) the compilation and manual annotation of information related to interactions such as protein–protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server. PMID:16381839
MIPS: analysis and annotation of proteins from whole genomes in 2005.
Mewes, H W; Frishman, D; Mayer, K F X; Münsterkötter, M; Noubibou, O; Pagel, P; Rattei, T; Oesterheld, M; Ruepp, A; Stümpflen, V
2006-01-01
The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system are maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information enabling the representation of complex information such as functional modules or networks Genome Research Environment System, (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix and the CABiNet network analysis framework and (iii) the compilation and manual annotation of information related to interactions such as protein-protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de).
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.
Hiscock, D; Upton, C
2000-05-01
The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a mySQL database with a user-friendly JAVA GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
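Several of the per-gene parameters listed above (A + T%, dinucleotide frequency and codon use) are simple to derive from a DNA sequence. The following Python sketch shows one plausible way to compute them; it is not the VGDB implementation, and the function names at_percent, dinucleotide_frequency and codon_usage are hypothetical.

```python
from collections import Counter


def at_percent(dna):
    """Percentage of bases that are A or T."""
    dna = dna.upper()
    return 100.0 * (dna.count("A") + dna.count("T")) / len(dna)


def dinucleotide_frequency(dna):
    """Relative frequency of each overlapping dinucleotide."""
    dna = dna.upper()
    pairs = Counter(dna[i:i + 2] for i in range(len(dna) - 1))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


def codon_usage(cds):
    """Count codons in a coding sequence, reading in frame from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


if __name__ == "__main__":
    gene = "ATGAAAAAATAA"  # toy sequence, not taken from any real genome
    print(at_percent(gene))
    print(dinucleotide_frequency(gene))
    print(codon_usage(gene))
```

Storing such derived values as columns next to the sequence is what makes it possible to sort query results by any individual parameter, as the abstract describes.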
Choosing a genome browser for a Model Organism Database: surveying the Maize community
Sen, Taner Z.; Harper, Lisa C.; Schaeffer, Mary L.; Andorf, Carson M.; Seigfried, Trent E.; Campbell, Darwin A.; Lawrence, Carolyn J.
2010-01-01
As the B73 maize genome sequencing project neared completion, MaizeGDB began to integrate a graphical genome browser with its existing web interface and database. To ensure that maize researchers would optimally benefit from the potential addition of a genome browser to the existing MaizeGDB resource, personnel at MaizeGDB surveyed researchers’ needs. Collected data indicate that existing genome browsers for maize were inadequate and suggest implementation of a browser with quick interface and intuitive tools would meet most researchers’ needs. Here, we document the survey’s outcomes, review functionalities of available genome browser software platforms and offer our rationale for choosing the GBrowse software suite for MaizeGDB. Because the genome as represented within the MaizeGDB Genome Browser is tied to detailed phenotypic data, molecular marker information, available stocks, etc., the MaizeGDB Genome Browser represents a novel mechanism by which the researchers can leverage maize sequence information toward crop improvement directly. Database URL: http://gbrowse.maizegdb.org/ PMID:20627860
Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal
2013-01-01
We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
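The two operations highlighted above, intersecting regulatory features and mapping them to genes, both reduce to interval overlap tests on genomic coordinates. The Python sketch below illustrates the idea under stated assumptions: the Interval type, the flank parameter and the function names are hypothetical, and the database's own stored procedures are not described in the abstract and may work quite differently.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


def overlaps(a, b):
    """True if two intervals lie on the same chromosome and share a base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def map_features_to_genes(features, genes, flank=0):
    """Map each feature name to the genes it overlaps.

    Each gene is extended by `flank` bases on both sides (a hypothetical
    parameter) so that nearby upstream or downstream features also map to it.
    """
    mapping = {}
    for feature in features:
        hits = []
        for gene in genes:
            window = Interval(
                gene.chrom, max(1, gene.start - flank), gene.end + flank, gene.name
            )
            if overlaps(feature, window):
                hits.append(gene.name)
        mapping[feature.name] = hits
    return mapping


if __name__ == "__main__":
    genes = [Interval("chr1", 1000, 2000, "geneA"),
             Interval("chr1", 5000, 6000, "geneB")]
    motifs = [Interval("chr1", 900, 1050, "motif1"),
              Interval("chr1", 4500, 4600, "motif2")]
    print(map_features_to_genes(motifs, genes, flank=500))
```

This nested loop is quadratic and only suitable for small inputs; a genome-scale version would normally use sorted intervals or an index, which is presumably one reason the abstract stresses pre-computing the results.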
Mackey, Aaron J; Pearson, William R
2004-10-01
Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
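The first protocol mentioned above, generating a sequence library subset from a relational database, can be sketched in a few lines of Python. The example uses the standard-library sqlite3 module with an in-memory database; the protein table, its columns, the accessions and the sequences are invented for illustration and are not the actual seqdb_demo schema or data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory protein table (hypothetical schema and rows)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE protein ("
        "id INTEGER PRIMARY KEY, acc TEXT, taxon TEXT, seq TEXT)"
    )
    con.executemany(
        "INSERT INTO protein (acc, taxon, seq) VALUES (?, ?, ?)",
        [
            ("DEMO1", "Escherichia coli", "MKT"),
            ("DEMO2", "Homo sapiens", "MAL"),
            ("DEMO3", "Escherichia coli", "MSE"),
        ],
    )
    return con


def library_subset_fasta(con, taxon):
    """Return a FASTA-formatted library restricted to one taxon."""
    rows = con.execute(
        "SELECT acc, seq FROM protein WHERE taxon = ? ORDER BY acc", (taxon,)
    )
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)


if __name__ == "__main__":
    con = build_demo_db()
    print(library_subset_fasta(con, "Escherichia coli"), end="")
```

Searching against such a subset instead of the full library reduces the number of comparisons, which is how a focused library can improve the statistical significance of true homologs.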
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model.
Saccone, Scott F; Quan, Jiaxi; Jones, Peter L
2012-04-15
Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. BioQ is freely available to the public at http://bioq.saclab.net.
TabSQL: a MySQL tool to facilitate mapping user data to public databases.
Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng
2010-06-23
With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
TabSQL: a MySQL tool to facilitate mapping user data to public databases
2010-01-01
Background: With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. Results: We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. Conclusions: TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data. PMID:20573251
Orthology for comparative genomics in the mouse genome database.
Dolan, Mary E; Baldarelli, Richard M; Bello, Susan M; Ni, Li; McAndrews, Monica S; Bult, Carol J; Kadin, James A; Richardson, Joel E; Ringwald, Martin; Eppig, Janan T; Blake, Judith A
2015-08-01
The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
Analytical and computational approaches to define the Aspergillus niger secretome.
Tsang, Adrian; Butler, Gregory; Powlowski, Justin; Panisko, Ellen A; Baker, Scott E
2009-03-01
We used computational and mass spectrometric approaches to characterize the Aspergillus niger secretome. The 11,200 gene models predicted in the genome of A. niger strain ATCC 1015 were the data source for the analysis. Depending on the computational methods used, 691 to 881 proteins were predicted to be secreted proteins. We cultured A. niger in six different media and analyzed the extracellular proteins produced using mass spectrometry. A total of 222 proteins were identified, with 39 proteins expressed under all six conditions and 74 proteins expressed under only one condition. The secreted proteins identified by mass spectrometry were used to guide the correction of about 20 gene models. Additional analysis focused on extracellular enzymes of interest for biomass processing. Of the 63 glycoside hydrolases predicted to be capable of hydrolyzing cellulose, hemicellulose or pectin, 94% of the exo-acting enzymes and only 18% of the endo-acting enzymes were experimentally detected.
Analytical and computational approaches to define the Aspergillus niger secretome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsang, Adrian; Butler, Gregory D.; Powlowski, Justin
2009-03-01
We used computational and mass spectrometric approaches to characterize the Aspergillus niger secretome. The 11,200 gene models predicted in the genome of A. niger strain ATCC 1015 were the data source for the analysis. Depending on the computational methods used, 691 to 881 proteins were predicted to be secreted proteins. We cultured A. niger in six different media and analyzed the extracellular proteins produced using mass spectrometry. A total of 222 proteins were identified, with 39 proteins expressed under all six conditions and 74 proteins expressed under only one condition. The secreted proteins identified by mass spectrometry were used to guide the correction of about 20 gene models. Additional analysis focused on extracellular enzymes of interest for biomass processing. Of the 63 glycoside hydrolases predicted to be capable of hydrolyzing cellulose, hemicellulose or pectin, 94% of the exo-acting enzymes and only 18% of the endo-acting enzymes were experimentally detected.
The MaizeGDB Genome Browser tutorial: one example of database outreach to biologists via video.
Harper, Lisa C; Schaeffer, Mary L; Thistle, Jordan; Gardiner, Jack M; Andorf, Carson M; Campbell, Darwin A; Cannon, Ethalinda K S; Braun, Bremen L; Birkett, Scott M; Lawrence, Carolyn J; Sen, Taner Z
2011-01-01
Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At MaizeGDB, we have developed a number of video tutorials that demonstrate how to use various tools and explicitly outline the caveats researchers should know to interpret the information available to them. One such popular video currently available is 'Using the MaizeGDB Genome Browser', which describes how the maize genome was sequenced and assembled as well as how the sequence can be visualized and interacted with via the MaizeGDB Genome Browser. Database
A Ruby API to query the Ensembl database for genomic features.
Strozzi, Francesco; Aerts, Jan
2011-04-01
The Ensembl database makes genomic features available via its Genome Browser. It is also possible to access the underlying data through a Perl API for advanced querying. We have developed a full-featured Ruby API to the Ensembl databases, providing the same functionality as the Perl interface with additional features. A single Ruby API is used to access different releases of the Ensembl databases and is also able to query multi-species databases. Most functionality of the API is provided using the ActiveRecord pattern. The library depends on introspection to make it release independent. The API is available through the Rubygem system and can be installed with the command gem install ruby-ensembl-api.
Damaging Effect of Low Energy N+ Implantation on Aspergillus niger Spores
NASA Astrophysics Data System (ADS)
Wang, Lisheng; Cai, Kezhou; Cheng, Maoji; Chen, Lijuan; Liu, Xuelan; Zhang, Shuqing; Yu, Zengliang
2007-06-01
The mutagenic effects of a keV-range nitrogen ion (N+) beam on enzyme-producing probiotics were studied, particularly with regard to the induction in the genome. The electron spin resonance (ESR) results showed that an ESR signal was present in both implanted and non-implanted spores, and the yields of free radicals increased in a dose-dependent manner. Ionic etching and dilapidation of the cell wall could be observed distinctly through the scanning electron microscope (SEM). The mutagenic effect on the genome indicated that N+ implantation could induce base mutations. This study provided an insight into the roles low-energy ions might play in inducing mutagenesis of micro-organisms.
Mycobacteriophage genome database.
Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja
2011-01-01
The Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of gene parameters captured from several databases and pooled together to support mycobacteriophage researchers. The MGDB (Version No.1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.
CycADS: an annotation database system to ease the development and update of BioCyc databases
Vellozo, Augusto F.; Véron, Amélie S.; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E.; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano
2011-01-01
In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway Tools software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including for example: KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum) whose genome was recently sequenced. The AcypiCyc database webpage also includes, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Linked to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms.
Database URL: http://www.cycadsys.org PMID:21474551
Maeda, Hiroshi; Sakai, Daisuke; Kobayashi, Takuji; Morita, Hiroto; Okamoto, Ayako; Takeuchi, Michio; Kusumoto, Ken-Ichi; Amano, Hitoshi; Ishida, Hiroki; Yamagata, Youhei
2016-06-01
Three extracellular dipeptidyl peptidase genes, dppB, dppE, and dppF, were identified by sequence analysis of the Aspergillus oryzae genome. We investigated their differential enzymatic profiles in order to gain an understanding of the diversity of these genes. The three dipeptidyl peptidases were expressed using Aspergillus nidulans as the host. Each recombinant enzyme was purified and subsequently characterized. The enzymes displayed similar optimum pH values, but optimum temperatures, pH stabilities, and substrate specificities varied. DppB was identified as a Xaa-Prolyl dipeptidyl peptidase, while DppE scissile substrates were similar to the substrates for Aspergillus fumigatus DPPV (AfDPPV). DppF was found to be a novel enzyme that could digest both substrates for A. fumigatus DPPIV and AfDPPV. Semi-quantitative PCR revealed that the transcription of dppB in A. oryzae was induced by protein substrates and repressed by the addition of an inorganic nitrogen source, despite the presence of protein substrates. The transcription of dppE depended on the growth time, while the transcription of dppF was not affected by the type of nitrogen source in the medium and started during the early stage of fungal growth. Based on these results, we conclude that these enzymes may represent nutrition acquisition enzymes. Additionally, DppF may be one of the sensor peptidases responsible for the detection of protein substrates in the A. oryzae environment. DppB may be involved in nitrogen assimilation control, since the transcription of dppB was repressed by NaNO3, despite the presence of protein substrates.
DroSpeGe: rapid access database for new Drosophila species genomes.
Gilbert, Donald G
2007-01-01
The Drosophila species comparative genome database DroSpeGe (http://insects.eugenes.org/DroSpeGe/) has provided genome researchers with rapid, usable access to 12 new and old Drosophila genomes since its inception in 2004. Scientists can use, with minimal computing expertise, the wealth of new genome information for developing new insights into insect evolution. New genome assemblies provided by several sequencing centers have been annotated with known model organism gene homologies and gene predictions to provide basic comparative data. TeraGrid supplies the shared cyberinfrastructure for the primary computations. This genome database includes homologies to Drosophila melanogaster and eight other eukaryote model genomes, and gene predictions from several groups. BLAST searches of the newest assemblies are integrated with genome maps. GBrowse maps provide detailed views of cross-species aligned genomes. BioMart provides for data mining of annotations and sequences. Common chromosome maps identify major synteny among species. Potential gain and loss of genes is suggested by Gene Ontology groupings for genes of the new species. Summaries of essential genome statistics include sizes, genes found and predicted, homology among genomes, phylogenetic trees of species and comparisons of several gene predictions for sensitivity and specificity in finding new and known genes.
Bio Warfare and Terrorism: Toxins and Other Mid-Spectrum Agents
2005-01-01
biotechnology, toxicogenomics, toxin, tetrodotoxin, and others. Once an agent has and proteomics may also help to open the door to the 276 Bio Warfare...also interferon gamma, interleukin-6, and tumor necrosis factor...by the mold Aspergillus flavus and commonly conta-...as bullets. No the new sciences of genomics and proteomics to alter toxoid or antitoxin is available, genetic code and to affect the expression of
CottonGen: a genomics, genetics and breeding database for cotton research
USDA-ARS's Scientific Manuscript database
CottonGen (http://www.cottongen.org) is a curated and integrated web-based relational database providing access to publicly available genomic, genetic and breeding data for cotton. CottonGen supersedes CottonDB and the Cotton Marker Database, with enhanced tools for easier data sharing, mining, vis...
Use of Genomic Databases for Inquiry-Based Learning about Influenza
ERIC Educational Resources Information Center
Ledley, Fred; Ndung'u, Eric
2011-01-01
The genome projects of the past decades have created extensive databases of biological information with applications in both research and education. We describe an inquiry-based exercise that uses one such database, the National Center for Biotechnology Information Influenza Virus Resource, to advance learning about influenza. This database…
USDA-ARS's Scientific Manuscript database
The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation to study pig models that are...
Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui
2016-01-01
Recent advances in next-generation sequencing technology have enabled the fast and low-cost detection of genetic variants across the entire human genome, making whole-genome sequencing an increasingly common approach in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of the functionally damaging effects of human genetic variants is still lacking, though it is well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.
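The figure of nearly 8.58 billion possible variants can be sanity-checked with simple arithmetic. The sketch below assumes (the abstract does not say so) that every reference position contributes three possible single-nucleotide substitutions, one for each base other than the reference base, and it treats the quoted figure as a round number.

```python
# Back-of-the-envelope check of the "nearly 8.58 billion" SNV count.
# Assumption (not stated in the abstract): each reference position can be
# substituted by any of the three bases that differ from the reference base.

REPORTED_SNVS = 8_580_000_000      # rounded figure quoted in the abstract
ALTERNATIVE_BASES_PER_SITE = 3     # A, C, G, T minus the reference base


def implied_reference_positions(total_snvs, alternatives_per_site=ALTERNATIVE_BASES_PER_SITE):
    """Return the number of reference positions implied by an SNV count."""
    return total_snvs / alternatives_per_site


positions = implied_reference_positions(REPORTED_SNVS)
print(f"{positions / 1e9:.2f} billion positions")
```

Under that assumption the count corresponds to roughly 2.86 billion reference positions, which is of the same order as the sequenced (non-gap) portion of a human reference assembly.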
Exploration of the Chemical Space of Public Genomic Databases
The current project aims to chemically index the content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information.
Genomics Community Resources | Informatics Technology for Cancer Research (ITCR)
To facilitate genomic research and the dissemination of its products, National Human Genome Research Institute (NHGRI) supports genomic resources that are crucial for basic research, disease studies, model organism studies, and other biomedical research. Awards under this FOA will support the development and distribution of genomic resources that will be valuable for the broad research community, using cost-effective approaches. Such resources include (but are not limited to) databases and informatics resources (such as human and model organism databases, ontologies, and analysi
Damak, Naourez; Abdeljalil, Salma; Taeib, Noomen Hadj; Gargouri, Ali
2015-08-01
The rhg gene encoding a rhamnogalacturonase was isolated from the novel strain A1 of Aspergillus niger. It consists of an ORF of 1.505 kb encoding a putative protein of 446 amino acids with a predicted molecular mass of 47 kDa, belonging to family 28 of the glycosyl hydrolases. The nature and position of the amino acids comprising the active site, as well as the three-dimensional structure, were well conserved between the A. niger CTM10548 and fungal rhamnogalacturonases. The coding region of the rhg gene is interrupted by three short introns of 56 (introns 1 and 3) and 52 (intron 2) bp in length. Comparison of the peptide sequence with A. niger rhg sequences revealed that the A1 rhg should be an endo-rhamnogalacturonase, more homologous to the known A. niger enzyme rhg A than to rhg B. Comparison of the rhg nucleotide sequence from A. niger A1 with rhg A from A. niger shows several base changes. Most of these changes (59 %) are located at the third base of codons, suggesting that the enzyme function is maintained. We used the rhamnogalacturonase A from Aspergillus aculeatus as a template to build a structural model of rhg A1 that adopted a right-handed parallel β-helix.
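The lengths quoted in this abstract can be cross-checked against each other. The sketch below is an illustration only and assumes (the abstract does not state this) that the 1.505 kb figure runs from the start codon through the stop codon and includes the three introns.

```python
# Consistency check of the rhg gene figures quoted in the abstract.
# Assumption (not stated in the source): the 1.505 kb figure spans the
# start codon through the stop codon and includes the three introns.

GENOMIC_LENGTH_BP = 1505
INTRON_LENGTHS_BP = [56, 52, 56]   # introns 1, 2 and 3
REPORTED_PROTEIN_AA = 446


def coding_length(genomic_bp, intron_lengths):
    """Length of the spliced coding sequence, stop codon included."""
    return genomic_bp - sum(intron_lengths)


def protein_length(cds_bp):
    """Number of amino acids encoded by a CDS that ends with a stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds_bp // 3 - 1   # the stop codon encodes no amino acid


cds_bp = coding_length(GENOMIC_LENGTH_BP, INTRON_LENGTHS_BP)
amino_acids = protein_length(cds_bp)
print(cds_bp, amino_acids, amino_acids == REPORTED_PROTEIN_AA)
```

Under that assumption the spliced sequence is 1341 bp, or 447 codons, which matches the reported 446 amino acids plus a stop codon.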
Weirick, Tyler; John, David; Uchida, Shizuka
2017-03-01
Maintaining the consistency of genomic annotations is an increasingly complex task because of the iterative and dynamic nature of assembly and annotation, growing numbers of biological databases and insufficient integration of annotations across databases. As information exchange among databases is poor, a 'novel' sequence from one reference annotation could be annotated in another. Furthermore, relationships to nearby or overlapping annotated transcripts are even more complicated when using different genome assemblies. To better understand these problems, we surveyed current and previous versions of genomic assemblies and annotations across a number of public databases containing long noncoding RNA. We identified numerous discrepancies of transcripts regarding their genomic locations, transcript lengths and identifiers. Further investigation showed that the positional differences between reference annotations of essentially the same transcript could lead to differences in its measured expression at the RNA level. To aid in resolving these problems, we present the algorithm 'Universal Genomic Accession Hash (UGAHash)' and created an open source web tool to encourage the usage of the UGAHash algorithm. The UGAHash web tool (http://ugahash.uni-frankfurt.de) can be accessed freely without registration. The web tool allows researchers to generate Universal Genomic Accessions for genomic features or to explore annotations deposited in the public databases of the past and present versions. We anticipate that the UGAHash web tool will be a valuable tool to check for the existence of transcripts before judging the newly discovered transcripts as novel. © The Author 2016. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
WGE: a CRISPR database for genome engineering.
Hodgkins, Alex; Farne, Anna; Perera, Sajith; Grego, Tiago; Parry-Smith, David J; Skarnes, William C; Iyer, Vivek
2015-09-15
The rapid development of CRISPR-Cas9 mediated genome editing techniques has given rise to a number of online and stand-alone tools to find and score CRISPR sites for whole genomes. Here we describe the Wellcome Trust Sanger Institute Genome Editing database (WGE), which uses novel methods to compute, visualize and select optimal CRISPR sites in a genome browser environment. The WGE database currently stores single and paired CRISPR sites and pre-calculated off-target information for CRISPRs located in the mouse and human exomes. Scoring and display of off-target sites is simple and intuitive, and filters can be applied to identify high-quality CRISPR sites rapidly. WGE also provides a tool for the design and display of gene targeting vectors in the same genome browser, along with gene models, protein translation and variation tracks. WGE is open, extensible and can be set up to compute and present CRISPR sites for any genome. The WGE database is freely available at www.sanger.ac.uk/htgt/wge. Contact: vvi@sanger.ac.uk or skarnes@sanger.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
VCGDB: a dynamic genome database of the Chinese population
2014-01-01
Background: The data released by the 1000 Genomes Project contain an increasing number of genome sequences from different nations and populations with a large number of genetic variations. As a result, the focus of human genome studies is changing from single and static to complex and dynamic. The currently available human reference genome (GRCh37) is based on sequencing data from 13 anonymous Caucasian volunteers, which might limit the scope of genomics, transcriptomics, epigenetics, and genome wide association studies. Description: We used the massive amount of sequencing data published by the 1000 Genomes Project Consortium to construct the Virtual Chinese Genome Database (VCGDB), a dynamic genome database of the Chinese population based on the whole genome sequencing data of 194 individuals. VCGDB provides dynamic genomic information, which contains 35 million single nucleotide variations (SNVs), 0.5 million insertions/deletions (indels), and 29 million rare variations, together with genomic annotation information. VCGDB also provides a highly interactive user-friendly virtual Chinese genome browser (VCGBrowser) with functions like seamless zooming and real-time searching. In addition, we have established three population-specific consensus Chinese reference genomes that are compatible with mainstream alignment software. Conclusions: VCGDB offers a feasible strategy for processing big data to keep pace with the biological data explosion by providing a robust resource for genomics studies; in particular, studies aimed at finding regions of the genome associated with diseases. PMID:24708222
Public variant databases: liability?
Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria
2017-07-01
Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016.
Lee, Taein; Cheng, Chun-Huai; Ficklin, Stephen; Yu, Jing; Humann, Jodi; Main, Dorrie
2017-01-01
Tripal is an open-source database platform primarily used for development of genomic, genetic and breeding databases. We report here on the release of the Chado Loader, Chado Data Display and Chado Search modules to extend the functionality of the core Tripal modules. These new extension modules provide additional tools for (1) data loading, (2) customized visualization and (3) advanced search functions for supported data types such as organism, marker, QTL/Mendelian Trait Loci, germplasm, map, project, phenotype, genotype and their respective metadata. The Chado Loader module provides data collection templates in Excel with defined metadata and data loaders with front end forms. The Chado Data Display module contains tools to visualize each data type and the metadata, which can be used as is or customized as desired. The Chado Search module provides search and download functionality for the supported data types. Also included are tools to visualize map and species summaries. The use of materialized views in the Chado Search module enables better performance as well as flexibility of data modeling in Chado, allowing existing Tripal databases with different metadata types to utilize the module. These Tripal Extension modules are implemented in the Genome Database for Rosaceae (rosaceae.org), CottonGen (cottongen.org), Citrus Genome Database (citrusgenomedb.org), Genome Database for Vaccinium (vaccinium.org) and the Cool Season Food Legume Database (coolseasonfoodlegume.org). Database URL: https://www.citrusgenomedb.org/, https://www.coolseasonfoodlegume.org/, https://www.cottongen.org/, https://www.rosaceae.org/, https://www.vaccinium.org/
PlantRGDB: A Database of Plant Retrocopied Genes.
Wang, Yi
2017-01-01
RNA-based gene duplication, known as retrocopy, plays important roles in gene origination and genome evolution. The genomes of many plants have been sequenced, offering an opportunity to annotate and mine the retrocopies in plant genomes. However, comprehensive and unified annotation of retrocopies in these plants is still lacking. In this study I constructed the PlantRGDB (Plant Retrocopied Gene DataBase), the first database of plant retrocopies, to provide a putatively complete centralized list of retrocopies in plant genomes. The database is freely accessible at http://probes.pw.usda.gov/plantrgdb or http://aegilops.wheat.ucdavis.edu/plantrgdb. It currently integrates 49 plant species and 38,997 retrocopies along with characterization information. PlantRGDB provides a user-friendly web interface for searching, browsing and downloading the retrocopies in the database. PlantRGDB also offers graphical viewer-integrated sequence information for displaying the structure of each retrocopy. The attributes of the retrocopies of each species are reported using a browse function. In addition, useful tools, such as an advanced search and BLAST, are available to search the database more conveniently. In conclusion, the database will provide a web platform for obtaining valuable insight into the generation of retrocopies and will supplement research on gene duplication and genome evolution in plants. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Sakai, Hiroaki; Lee, Sung Shin; Tanaka, Tsuyoshi; Numa, Hisataka; Kim, Jungsok; Kawahara, Yoshihiro; Wakimoto, Hironobu; Yang, Ching-chia; Iwamoto, Masao; Abe, Takashi; Yamada, Yuko; Muto, Akira; Inokuchi, Hachiro; Ikemura, Toshimichi; Matsumoto, Takashi; Sasaki, Takuji; Itoh, Takeshi
2013-02-01
The Rice Annotation Project Database (RAP-DB, http://rapdb.dna.affrc.go.jp/) has been providing a comprehensive set of gene annotations for the genome sequence of rice, Oryza sativa (japonica group) cv. Nipponbare. Since the first release in 2005, RAP-DB has been updated several times along with the genome assembly updates. Here, we present our newest RAP-DB based on the latest genome assembly, Os-Nipponbare-Reference-IRGSP-1.0 (IRGSP-1.0), which was released in 2011. We detected 37,869 loci by mapping transcript and protein sequences of 150 monocot species. To provide plant researchers with highly reliable and up-to-date rice gene annotations, we have been incorporating literature-based manually curated data, and 1,626 loci currently incorporate literature-based annotation data, including commonly used gene names or gene symbols. Transcriptional activities are shown at the nucleotide level by mapping RNA-Seq reads derived from 27 samples. We also mapped the Illumina reads of a leading Japanese japonica cultivar, Koshihikari, and a Chinese indica cultivar, Guangluai-4, to the genome and show alignments together with the single nucleotide polymorphisms (SNPs) and gene functional annotations through a newly developed browser, Short-Read Assembly Browser (S-RAB). We have developed two satellite databases, Plant Gene Family Database (PGFD) and Integrative Database of Cereal Gene Phylogeny (IDCGP), which display gene family and homologous gene relationships among diverse plant species. RAP-DB and the satellite databases offer simple and user-friendly web interfaces, enabling plant and genome researchers to access the data easily and facilitating a broad range of plant research topics.
The MaizeGDB Genome Browser tutorial: one example of database outreach to biologists via video
Harper, Lisa C.; Schaeffer, Mary L.; Thistle, Jordan; Gardiner, Jack M.; Andorf, Carson M.; Campbell, Darwin A.; Cannon, Ethalinda K.S.; Braun, Bremen L.; Birkett, Scott M.; Lawrence, Carolyn J.; Sen, Taner Z.
2011-01-01
Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At MaizeGDB, we have developed a number of video tutorials that demonstrate how to use various tools and explicitly outline the caveats researchers should know to interpret the information available to them. One such popular video currently available is ‘Using the MaizeGDB Genome Browser’, which describes how the maize genome was sequenced and assembled as well as how the sequence can be visualized and interacted with via the MaizeGDB Genome Browser. Database URL: http://www.maizegdb.org/ PMID:21565781
Engel, Stacia R.; Cherry, J. Michael
2013-01-01
The first completed eukaryotic genome sequence was that of the yeast Saccharomyces cerevisiae, and the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the original model organism database. SGD remains the authoritative community resource for the S. cerevisiae reference genome sequence and its annotation, and continues to provide comprehensive biological information correlated with S. cerevisiae genes and their products. A diverse set of yeast strains have been sequenced to explore commercial and laboratory applications, and a brief history of those strains is provided. The publication of these new genomes has motivated the creation of new tools, and SGD will annotate and provide comparative analyses of these sequences, correlating changes with variations in strain phenotypes and protein function. We are entering a new era at SGD, as we incorporate these new sequences and make them accessible to the scientific community, all in an effort to continue in our mission of educating researchers and facilitating discovery. Database URL: http://www.yeastgenome.org/ PMID:23487186
Dhanasekaran, A Ranjitha; Pearson, Jon L; Ganesan, Balasubramanian; Weimer, Bart C
2015-02-25
Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism's genome as a database restricts metabolite identification to only those compounds that the organism can produce. To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment's MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis. Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. 
This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
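The query described above (a mass, an adduct and a mass deviation matched against a genome-restricted compound list) can be illustrated with a minimal sketch. This is not the Metabolome Searcher implementation: the compound list, the function name `candidates` and the adduct table are hypothetical stand-ins, and singly charged ions are assumed.

```python
from dataclasses import dataclass

# Mass shift (Da) added to the neutral monoisotopic mass for a few common
# singly charged adducts. The choice of adduct also fixes the detection mode.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M-H]-": -1.007276,
    "[M+Na]+": 22.989218,
}


@dataclass(frozen=True)
class Compound:
    name: str
    formula: str
    monoisotopic_mass: float  # neutral molecule, in Da


# Hypothetical stand-in for a compound list derived from one organism's
# genome-encoded metabolism; a real list would come from a pathway database.
GENOME_COMPOUNDS = [
    Compound("pyruvic acid", "C3H4O3", 88.01604),
    Compound("lactic acid", "C3H6O3", 90.03169),
    Compound("D-glucose", "C6H12O6", 180.06339),
]


def candidates(observed_mz, adduct, tolerance_ppm, compounds=GENOME_COMPOUNDS):
    """Return (compound, error_ppm) pairs within the mass tolerance, best first."""
    neutral_mass = observed_mz - ADDUCT_SHIFTS[adduct]
    hits = []
    for compound in compounds:
        error_ppm = (abs(neutral_mass - compound.monoisotopic_mass)
                     / compound.monoisotopic_mass * 1e6)
        if error_ppm <= tolerance_ppm:
            hits.append((compound, error_ppm))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for compound, error_ppm in candidates(181.0707, "[M+H]+", tolerance_ppm=5.0):
        print(f"{compound.name} ({compound.formula}): {error_ppm:.2f} ppm")
```

Because the search space is limited to compounds the organism can plausibly produce, even a fairly wide tolerance returns a short candidate list, which is the design idea the abstract emphasizes.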
Exploring Genetic, Genomic, and Phenotypic Data at the Rat Genome Database
Laulederkind, Stanley J. F.; Hayman, G. Thomas; Wang, Shur-Jen; Lowry, Timothy F.; Nigam, Rajni; Petri, Victoria; Smith, Jennifer R.; Dwinell, Melinda R.; Jacob, Howard J.; Shimoyama, Mary
2013-01-01
The laboratory rat, Rattus norvegicus, is an important model of human health and disease, and experimental findings in the rat have relevance to human physiology and disease. The Rat Genome Database (RGD, http://rgd.mcw.edu) is a model organism database that provides access to a wide variety of curated rat data including disease associations, phenotypes, pathways, molecular functions, biological processes and cellular components for genes, quantitative trait loci, and strains. We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat. PMID:23255149
Choque, Elodie; Klopp, Christophe; Valiere, Sophie; Raynal, José; Mathieu, Florence
2018-03-15
Black Aspergilli represent one of the most important fungal resources of primary and secondary metabolites for the biotechnology industry. Having several sequenced black Aspergilli genomes should allow targeting the production of certain metabolites with bioactive properties. In this study, we report the draft genome of a black Aspergillus, A. tubingensis G131, isolated from a French Mediterranean vineyard. This 35 Mb genome includes 10,994 predicted genes. A genomic-based discovery identifies 80 secondary metabolite biosynthetic gene clusters. Genomic sequences of these clusters were compared by BLAST against 3 chosen black Aspergilli genomes: A. tubingensis CBS 134.48, A. niger CBS 513.88 and A. kawachii IFO 4308. This comparison highlights different levels of cluster conservation between the four strains. It also allows identifying seven unique clusters in A. tubingensis G131. Moreover, the putative secondary metabolite clusters for asperazine and naphtho-gamma-pyrone production were proposed based on this genomic analysis. Key biosynthetic genes required for the production of 2 mycotoxins, ochratoxin A and fumonisin, are absent from this draft genome. Even if intergenic sequences of these mycotoxin biosynthetic pathways are present, this could not lead to the production of those mycotoxins by A. tubingensis G131. Functional and bioinformatics analyses of the A. tubingensis G131 genome highlight its potential for metabolite production, in particular for TAN-1612, asperazine and naphtho-gamma-pyrones, which present antioxidant, anticancer or antibiotic properties.
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency.
Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio
2015-01-01
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
Lu, Hongzhong; Cao, Weiqiang; Ouyang, Liming; Xia, Jianye; Huang, Mingzhi; Chu, Ju; Zhuang, Yingping; Zhang, Siliang; Noorman, Henk
2017-03-01
Aspergillus niger is one of the most important cell factories for industrial enzymes and organic acids production. A comprehensive genome-scale metabolic network model (GSMM) with high quality is crucial for efficient strain improvement and process optimization. The lack of accurate reaction equations and gene-protein-reaction associations (GPRs) in the current best model of A. niger named GSMM iMA871, however, limits its application scope. To overcome these limitations, we updated the A. niger GSMM by combining the latest genome annotation and literature mining technology. Compared with iMA871, the number of reactions in iHL1210 was increased from 1,380 to 1,764, and the number of unique ORFs from 871 to 1,210. With the aid of our transcriptomics analysis, the existence of 63% ORFs and 68% reactions in iHL1210 can be verified when glucose was used as the only carbon source. Physiological data from chemostat cultivations, 13C-labeled and molecular experiments from the published literature were further used to check the performance of iHL1210. The average correlation coefficients between the predicted fluxes and estimated fluxes from 13C-labeling data were sufficiently high (above 0.89) and the prediction of cell growth on most of the reported carbon and nitrogen sources was consistent. Using the updated genome-scale model, we evaluated gene essentiality on synthetic and yeast extract medium, as well as the effects of NADPH supply on glucoamylase production in A. niger. In summary, the new A. niger GSMM iHL1210 contains significant improvements with respect to the metabolic coverage and prediction performance, which paves the way for systematic metabolic engineering of A. niger. Biotechnol. Bioeng. 2017;114: 685-695. © 2016 Wiley Periodicals, Inc.
PoMaMo—a comprehensive database for potato genome data
Meyer, Svenja; Nagel, Axel; Gebhardt, Christiane
2005-01-01
A database for potato genome data (PoMaMo, Potato Maps and More) was established. The database contains molecular maps of all twelve potato chromosomes with about 1000 mapped elements, sequence data, putative gene functions, results from BLAST analysis, SNP and InDel information from different diploid and tetraploid potato genotypes, publication references, links to other public databases like GenBank (http://www.ncbi.nlm.nih.gov/) or SGN (Solanaceae Genomics Network, http://www.sgn.cornell.edu/), etc. Flexible search and data visualization interfaces enable easy access to the data via internet (https://gabi.rzpd.de/PoMaMo.html). The Java servlet tool YAMB (Yet Another Map Browser) was designed to interactively display chromosomal maps. Maps can be zoomed in and out, and detailed information about mapped elements can be obtained by clicking on an element of interest. The GreenCards interface allows a text-based data search by marker-, sequence- or genotype name, by sequence accession number, gene function, BLAST Hit or publication reference. The PoMaMo database is a comprehensive database for different potato genome data, and to date the only database containing SNP and InDel data from diploid and tetraploid potato genotypes. PMID:15608284
A searchable database for the genome of Phomopsis longicolla (isolate MSPL 10-6).
Darwish, Omar; Li, Shuxian; May, Zane; Matthews, Benjamin; Alkharouf, Nadim W
2016-01-01
Phomopsis longicolla (syn. Diaporthe longicolla) is an important seed-borne fungal pathogen that primarily causes Phomopsis seed decay (PSD) in most soybean production areas worldwide. This disease severely decreases soybean seed quality by reducing seed viability and oil quality, altering seed composition, and increasing frequencies of moldy and/or split beans. To facilitate investigation of the genetic basis of fungal virulence factors and understand the mechanism of disease development, we designed and developed a database for P. longicolla isolate MSPL 10-6 that contains information about the genome assemblies (contigs), gene models, gene descriptions and GO functional ontologies. A web-based front end to the database was built using ASP.NET, which allows researchers to search and mine the genome of this important fungus. This database represents the first reported genome database for a seed-borne fungal pathogen in the Diaporthe-Phomopsis complex. The database will also be a valuable resource for research and agricultural communities. It will aid in the development of new control strategies for this pathogen. Availability: http://bioinformatics.towson.edu/Phomopsis_longicolla/HomePage.aspx PMID:28197060
THGS: a web-based database of Transmembrane Helices in Genome Sequences
Fernando, S. A.; Selvarani, P.; Das, Soma; Kumar, Ch. Kiran; Mondal, Sukanta; Ramakumar, S.; Sekar, K.
2004-01-01
Transmembrane Helices in Genome Sequences (THGS) is an interactive web-based database, developed to search the transmembrane helices in the user-interested gene sequences available in the Genome Database (GDB). The proposed database has provision to search sequence motifs in transmembrane and globular proteins. In addition, the motif can be searched in the other sequence databases (Swiss-Prot and PIR) or in the macromolecular structure database, Protein Data Bank (PDB). Further, the 3D structure of the corresponding queried motif, if it is available in the solved protein structures deposited in the Protein Data Bank, can also be visualized using the widely used graphics package RASMOL. All the sequence databases used in the present work are updated frequently and hence the results produced are up to date. The database THGS is freely available via the world wide web and can be accessed at http://pranag.physics.iisc.ernet.in/thgs/ or http://144.16.71.10/thgs/. PMID:14681375
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.
Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki
2014-09-01
In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
Sarika; Arora, Vasu; Iquebal, Mir Asif; Rai, Anil; Kumar, Dinesh
2013-01-19
Background Though India has sequenced the water buffalo genome, its draft assembly is based on the cattle genome BTau 4.0; thus de novo chromosome-wise assembly is a major pending issue for the global community. The existing radiation hybrid of buffalo and these reported STR can be used further in final gap plugging and “finishing” expected in de novo genome assembly. QTL and gene mapping needs mining of putative STR from the buffalo genome at equal intervals on each and every chromosome. Such markers have a potential role in the improvement of desirable characteristics, such as high milk yields, resistance to diseases and high growth rate. The STR mining from the whole genome and development of a user-friendly database is yet to be done to reap the benefit of the whole genome sequence. Description By in silico microsatellite mining of the whole genome, we have developed the first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database; http://cabindb.iasri.res.in/buffsatdb/), which is a web-based relational database of 910529 microsatellite markers, developed using PHP and MySQL. Microsatellite markers have been generated using the MIcroSAtellite tool. The database offers a simple and systematic web-based search for customised retrieval of chromosome-wise and genome-wide microsatellites. Search has been enabled based on chromosomes, motif type (mono-hexa), repeat motif and repeat kind (simple and composite). The search may be customised by limiting the location of STR on the chromosome as well as the number of markers in that range. This is a novel approach and has not been implemented in any of the existing marker databases. This database has been further appended with Primer3 for primer designing of the selected markers, enabling researchers to select markers of choice at desired intervals over the chromosome. The unique add-on of degenerate bases further helps in resolving the presence of degenerate bases in the current buffalo assembly. Conclusion Being the first buffalo STR database in the world, this would not only pave the way in resolving the current assembly problem but shall be of immense use for the global community in QTL/gene mapping, critically required to increase knowledge in the endeavour to increase buffalo productivity, especially for third world countries where the rural economy is significantly dependent on buffalo productivity. PMID:23336431
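The in silico microsatellite mining described above can be illustrated with a small regular-expression scan. This is only a minimal sketch and not the MIcroSAtellite tool or the BuffSatDb pipeline: the function name find_simple_ssrs and the minimum repeat counts are assumptions chosen for illustration, and only perfect "simple" repeats (mono- to hexa-nucleotide) are reported, not composite ones.

```python
import re

# Minimum number of repeat units per motif length (1 = mono ... 6 = hexa).
# These thresholds are illustrative assumptions, not BuffSatDb's settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_simple_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect repeats.

    Coordinates are 0-based and end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n-1,} demands n or more copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" is really the mono-nucleotide motif "A".
            if any(motif == motif[:d] * (motif_len // d)
                   for d in range(1, motif_len) if motif_len % d == 0):
                continue
            count = (m.end() - m.start()) // motif_len
            hits.append((m.start(), m.end(), motif, count))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T"
    for hit in find_simple_ssrs(demo):
        print(hit)
```

Restricting such a scan to a chromosome interval, as the database search allows, would amount to filtering the returned coordinates by start and end position.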
The Innate Immune Database (IIDB)
Korb, Martin; Rust, Alistair G; Thorsson, Vesteinn; Battail, Christophe; Li, Bin; Hwang, Daehee; Kennedy, Kathleen A; Roach, Jared C; Rosenberger, Carrie M; Gilchrist, Mark; Zak, Daniel; Johnson, Carrie; Marzolf, Bruz; Aderem, Alan; Shmulevich, Ilya; Bolouri, Hamid
2008-01-01
Background As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site. Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens. Description We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems biology oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data are available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results. 
Our database can be interrogated via a web interface. Genomic annotations and binding site predictions can be automatically viewed with a customized version of the Argo genome browser. Conclusion We present the Innate Immune Database (IIDB) as a community resource for immunologists interested in gene regulatory systems underlying innate responses to pathogens. The database website can be freely accessed at . PMID:18321385
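Binding-site prediction from a position weight matrix, as mentioned in the abstract above, can be sketched generically as a log-odds scan over a sequence. This is a minimal illustration and not one of the six algorithms used for IIDB: the count matrix TOY_COUNTS is made up (it is not a TRANSFAC matrix), and the uniform background, pseudocount and threshold are assumptions.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
# The values are invented for illustration and are not taken from TRANSFAC.
TOY_COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (position, window, score) for every window scoring at or above threshold."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, window, round(score, 2)))
    return hits


if __name__ == "__main__":
    pwm = log_odds_matrix(TOY_COUNTS)
    print(scan("TTACGTGG", pwm, threshold=4.0))
```

A real analysis would also scan the reverse strand and calibrate the threshold per matrix; both are omitted here for brevity.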
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model
Saccone, Scott F.; Quan, Jiaxi; Jones, Peter L.
2012-01-01
Motivation: Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. Results: We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. Availability and implementation: BioQ is freely available to the public at http://bioq.saclab.net Contact: ssaccone@wustl.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22426342
Park, Jeongbin; Bae, Sangsu
2018-03-15
Following the type II CRISPR-Cas9 system, type V CRISPR-Cpf1 endonucleases have been found to be applicable for genome editing in various organisms in vivo. However, there are as yet no web-based tools capable of optimally selecting guide RNAs (gRNAs) among all possible genome-wide target sites. Here, we present Cpf1-Database, a genome-wide gRNA library design tool for LbCpf1 and AsCpf1, which have DNA recognition sequences of 5'-TTTN-3' at the 5' ends of target sites. Cpf1-Database provides a sophisticated but simple way to design gRNAs for AsCpf1 nucleases on the genome scale. One can easily access the data using a straightforward web interface, and, using the powerful collections feature, one can easily design gRNAs for thousands of genes in a short time. Availability: http://www.rgenome.net/cpf1-database/. Contact: sangsubae@hanyang.ac.kr.
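The target-site rule stated above (a 5'-TTTN-3' recognition sequence at the 5' end of the target) can be sketched as a simple scan of both strands. This is a hedged illustration, not the Cpf1-Database implementation: the function names are hypothetical, the 23-nt protospacer length is an assumption not given in the abstract, and no off-target or efficiency scoring is attempted.

```python
import re


def revcomp(seq):
    """Reverse complement of an upper-case DNA string (N is kept as N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_cpf1_targets(seq, protospacer_len=23):
    """List candidate sites as (strand, pam_start, pam, protospacer).

    protospacer_len=23 is an assumed default. For the "-" strand, pam_start
    is the 0-based offset within the reverse-complemented sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=(TTT[ACGT])([ACGT]{%d}))" % protospacer_len)
    targets = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for m in pattern.finditer(strand_seq):
            targets.append((strand, m.start(), m.group(1), m.group(2)))
    return targets


if __name__ == "__main__":
    demo = "GG" + "TTTA" + "ACGT" * 5 + "ACG" + "CC"
    for site in find_cpf1_targets(demo):
        print(site)
```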
Watson, Douglas S.; Feng, Xizhi; Askew, David S.; Jambunathan, Kalyani; Kodukula, Krishna; Galande, Amit K.
2011-01-01
Background The filamentous fungus Aspergillus fumigatus (AF) can cause devastating infections in immunocompromised individuals. Early diagnosis improves patient outcomes but remains challenging because of the limitations of current methods. To augment the clinician's toolkit for rapid diagnosis of AF infections, we are investigating AF secreted proteases as novel diagnostic targets. The AF genome encodes up to 100 secreted proteases, but fewer than 15 of these enzymes have been characterized thus far. Given the large number of proteases in the genome, studies focused on individual enzymes may overlook potential diagnostic biomarkers. Methodology and Principal Findings As an alternative, we employed a combinatorial library of internally quenched fluorogenic probes (IQFPs) to profile the global proteolytic secretome of an AF clinical isolate in vitro. Comparative protease activity profiling revealed 212 substrate sequences that were cleaved by AF secreted proteases but not by normal human serum. A central finding was that isoleucine, leucine, phenylalanine, and tyrosine predominated at each of the three variable positions of the library (44.1%, 59.1%, and 57.0%, respectively) among substrate sequences cleaved by AF secreted proteases. In contrast, fewer than 10% of the residues at each position of cleaved sequences were cationic or anionic. Consensus substrate motifs were cleaved by thermostable serine proteases that retained activity up to 50°C. Precise proteolytic cleavage sites were reliably determined by a simple, rapid mass spectrometry-based method, revealing predominantly non-prime side specificity. A comparison of the secreted protease activities of three AF clinical isolates revealed consistent protease substrate specificity fingerprints. However, secreted proteases of A. flavus, A. nidulans, and A. terreus strains exhibited striking differences in their proteolytic signatures. 
Conclusions This report provides proof-of-principle for the use of protease substrate specificity profiling to define the proteolytic secretome of Aspergillus fumigatus. Expansion of this technique to protease secretion during infection could lead to development of novel approaches to fungal diagnosis. PMID:21695046
Public variant databases: liability?
Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria
2017-01-01
Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016 PMID:27977006
Reconstruction of metabolic pathways for the cattle genome
Seo, Seongwon; Lewin, Harris A
2009-01-01
Background Metabolic reconstruction of microbial, plant and animal genomes is a necessary step toward understanding the evolutionary origins of metabolism and species-specific adaptive traits. The aims of this study were to reconstruct conserved metabolic pathways in the cattle genome and to identify metabolic pathways with missing genes and proteins. The MetaCyc database and PathwayTools software suite were chosen for this work because they are widely used and easy to implement. Results An amalgamated cattle genome database was created using the NCBI and Ensembl cattle genome databases (based on build 3.1) as data sources. PathwayTools was used to create a cattle-specific pathway genome database, which was followed by comprehensive manual curation for the reconstruction of metabolic pathways. The curated database, CattleCyc 1.0, consists of 217 metabolic pathways. A total of 64 mammalian-specific metabolic pathways were modified from the reference pathways in MetaCyc, and two pathways previously identified but missing from MetaCyc were added. Comparative analysis of metabolic pathways revealed the absence of mammalian genes for 22 metabolic enzymes whose activity was reported in the literature. We also identified six human metabolic protein-coding genes for which the cattle ortholog is missing from the sequence assembly. Conclusion CattleCyc is a powerful tool for understanding the biology of ruminants and other cetartiodactyl species. In addition, the approach used to develop CattleCyc provides a framework for the metabolic reconstruction of other newly sequenced mammalian genomes. It is clear that metabolic pathway analysis strongly reflects the quality of the underlying genome annotations. Thus, having well-annotated genomes from many mammalian species hosted in BioCyc will facilitate the comparative analysis of metabolic pathways among different species and a systems approach to comparative physiology. PMID:19284618
Guhlin, Joseph; Silverstein, Kevin A T; Zhou, Peng; Tiffin, Peter; Young, Nevin D
2017-08-10
Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data, which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and can rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user interface for configuring the data import and for querying the database. Queries can also be run from the command line, and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated value format for user-provided annotations. ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database. 
ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or understudied species. For species for which more data are available, ODG can be used to conduct complex multi-omics, pattern-matching queries.
Gramene database in 2010: updates and extensions.
Youens-Clark, Ken; Buckler, Ed; Casstevens, Terry; Chen, Charles; Declerck, Genevieve; Derwent, Paul; Dharmawardhana, Palitha; Jaiswal, Pankaj; Kersey, Paul; Karthikeyan, A S; Lu, Jerry; McCouch, Susan R; Ren, Liya; Spooner, William; Stein, Joshua C; Thomason, Jim; Wei, Sharon; Ware, Doreen
2011-01-01
Now in its 10th year, the Gramene database (http://www.gramene.org) has grown from its primary focus on rice, the first fully-sequenced grass genome, to become a resource for major model and crop plants including Arabidopsis, Brachypodium, maize, sorghum, poplar and grape in addition to several species of rice. Gramene began with the addition of an Ensembl genome browser and has expanded in the last decade to become a robust resource for plant genomics hosting a wide array of data sets including quantitative trait loci (QTL), metabolic pathways, genetic diversity, genes, proteins, germplasm, literature, ontologies and a fully-structured markers and sequences database integrated with genome browsers and maps from various published studies (genetic, physical, bin, etc.). In addition, Gramene now hosts a variety of web services including a Distributed Annotation Server (DAS), BLAST and a public MySQL database. Twice a year, Gramene releases a major build of the database and makes interim releases to correct errors or to make important updates to software and/or data.
Benchmarking database performance for genomic data.
Khushi, Matloob
2015-06-01
Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations is a common research need. The data can be saved in a database system for easy management; however, there is at present no comprehensive built-in database algorithm to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although the general searching capability of both databases was almost equivalent. In addition, using the algorithm, pair-wise overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported, and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1). © 2015 Wiley Periodicals, Inc.
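The kind of overlap query discussed above can be sketched with the standard interval-overlap predicate in SQL. This is not the RegMap algorithm, whose details the abstract does not give; it is a plain join shown only to make the operation concrete. SQLite (via Python's built-in sqlite3 module) is used so the example is self-contained, whereas the benchmark itself concerned PostgreSQL and MySQL. Table names, column names and the 0-based half-open coordinate convention are all assumptions.

```python
import sqlite3


def overlapping_pairs(regions_a, regions_b):
    """Return (name_a, name_b) for every pair of overlapping regions.

    Each region is (name, chrom, start_pos, end_pos), 0-based and half-open,
    so regions that merely touch (end of one == start of other) do not overlap.
    """
    con = sqlite3.connect(":memory:")
    con.executescript(
        "CREATE TABLE a (name TEXT, chrom TEXT, start_pos INTEGER, end_pos INTEGER);"
        "CREATE TABLE b (name TEXT, chrom TEXT, start_pos INTEGER, end_pos INTEGER);"
        "CREATE INDEX b_idx ON b (chrom, start_pos, end_pos);"
    )
    con.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", regions_a)
    con.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", regions_b)
    # Two intervals on the same chromosome overlap when each starts
    # before the other one ends.
    rows = con.execute(
        "SELECT a.name, b.name FROM a JOIN b "
        "ON a.chrom = b.chrom "
        "AND a.start_pos < b.end_pos AND b.start_pos < a.end_pos "
        "ORDER BY a.name, b.name"
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    peaks = [("p1", "chr1", 100, 200), ("p2", "chr1", 300, 400), ("p3", "chr2", 100, 200)]
    genes = [("g1", "chr1", 150, 250), ("g2", "chr1", 400, 500), ("g3", "chr2", 50, 101)]
    print(overlapping_pairs(peaks, genes))
```

On large tables this naive join can be slow, which is presumably the motivation for a dedicated algorithm and for comparing database engines.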
Accessing the SEED genome databases via Web services API: tools for programmers.
Disz, Terry; Akhter, Sajia; Cuevas, Daniel; Olson, Robert; Overbeek, Ross; Vonstein, Veronika; Stevens, Rick; Edwards, Robert A
2010-06-14
The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent, service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large volume downloads all at once. The API ensures timely access to the most current datasets available, including new genomes as soon as they come online.
Analysis of the Genome and Chromium Metabolism-Related Genes of Serratia sp. S2.
Dong, Lanlan; Zhou, Simin; He, Yuan; Jia, Yan; Bai, Qunhua; Deng, Peng; Gao, Jieying; Li, Yingli; Xiao, Hong
2018-05-01
This study investigated the genome sequence of Serratia sp. S2. The genomic DNA of Serratia sp. S2 was extracted and the sequencing library was constructed. The sequencing was carried out by Illumina 2000 and complete genomic sequences were obtained. Gene function annotation and bioinformatics analysis were performed by comparison with known databases. The genome size of Serratia sp. S2 was 5,604,115 bp and the G+C content was 57.61%. There were 5373 protein-coding genes, of which 3732, 3614, and 3942 were annotated in the GO, KEGG, and COG databases, respectively. There were 12 genes related to chromium metabolism in the Serratia sp. S2 genome. The whole genome sequence of Serratia sp. S2 has been submitted to the GenBank database under accession number LNRP00000000. Our findings may provide a theoretical basis for the subsequent development of new biotechnology to remediate environmental chromium pollution.
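For readers unfamiliar with the G+C statistic quoted above, it is simply the percentage of G and C bases among the unambiguous bases of a sequence. A minimal sketch follows; the function name is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption about the convention, which the abstract does not specify.

```python
def gc_content(seq):
    """Percentage of G and C among the A, C, G and T bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / acgt


if __name__ == "__main__":
    print(round(gc_content("GGCCAT"), 2))
```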
Heterologous Production of a Novel Cyclic Peptide Compound, KK-1, in Aspergillus oryzae
Yoshimi, Akira; Yamaguchi, Sigenari; Fujioka, Tomonori; Kawai, Kiyoshi; Gomi, Katsuya; Machida, Masayuki; Abe, Keietsu
2018-01-01
A novel cyclic peptide compound, KK-1, was originally isolated from the plant-pathogenic fungus Curvularia clavata. It consists of 10 amino acid residues, including five N-methylated amino acid residues, and has potent antifungal activity. Recently, the genome-sequencing analysis of C. clavata was completed, and the biosynthetic genes involved in KK-1 production were predicted by using a novel gene cluster mining tool, MIDDAS-M. These genes form an approximately 75-kb cluster, which includes nine open reading frames, containing a non-ribosomal peptide synthetase (NRPS) gene. To determine whether the predicted genes were responsible for the biosynthesis of KK-1, we performed heterologous production of KK-1 in Aspergillus oryzae by introduction of the cluster genes into the genome of A. oryzae. The NRPS gene was split in two fragments and then reconstructed in the A. oryzae genome, because the gene was quite large (approximately 40 kb). The remaining seven genes in the cluster, excluding the regulatory gene kkR, were simultaneously introduced into the strain of A. oryzae in which NRPS had already been incorporated. To evaluate the heterologous production of KK-1 in A. oryzae, gene expression was analyzed by RT-PCR and KK-1 productivity was quantified by HPLC. KK-1 was produced in variable quantities by a number of transformed strains, along with expression of the cluster genes. The amount of KK-1 produced by the strain with the greatest expression of all genes was lower than that produced by the original producer, C. clavata. Therefore, expression of the cluster genes is necessary and sufficient for the heterologous production of KK-1 in A. oryzae, although there may be unknown factors limiting productivity in this species. PMID:29686660
Nakamura, Hidetoshi; Katayama, Takuya; Okabe, Tomoya; Iwashita, Kazuhiro; Fujii, Wataru; Kitamoto, Katsuhiko; Maruyama, Jun-Ichi
2017-07-11
Numerous strains of Aspergillus oryzae are industrially used for Japanese traditional fermentation and for the production of enzymes and heterologous proteins. In A. oryzae, deletion of the ku70 or ligD genes involved in non-homologous end joining (NHEJ) has allowed high gene targeting efficiency. However, this strategy has been mainly applied under the genetic background of the A. oryzae wild strain RIB40, and it would be laborious to delete the NHEJ genes in many A. oryzae industrial strains, probably due to their low gene targeting efficiency. In the present study, we generated ligD mutants from the A. oryzae industrial strains by employing the CRISPR/Cas9 system, which we previously developed as a genome editing method. Uridine/uracil auxotrophic strains were generated by deletion of the pyrG gene, which was subsequently used as a selective marker. We examined the gene targeting efficiency with the ecdR gene, of which deletion was reported to induce sclerotia formation under the genetic background of the strain RIB40. As expected, the deletion efficiencies were high, around 60~80%, in the ligD mutants of industrial strains. Intriguingly, the effects of the ecdR deletion on sclerotia formation varied depending on the strains, and we found sclerotia-like structures under the background of the industrial strains, which have never been reported to form sclerotia. The present study demonstrates that introducing ligD mutation by genome editing is an effective method allowing high gene targeting efficiency in A. oryzae industrial strains.
Bayram, Özgür; Biesemann, Christoph; Krappmann, Sven; Galland, Paul
2008-01-01
Cryptochromes are blue-light receptors that have presumably evolved from the DNA photolyase protein family, and the genomes of many organisms contain genes for both types of molecules. Both protein structures resemble each other, which suggests that light control and light protection share a common ancient origin. In the genome of the filamentous fungus Aspergillus nidulans, however, only one cryptochrome/photolyase-encoding gene, termed cryA, was identified. Deletion of the cryA gene triggers sexual differentiation under inappropriate culture conditions and results in up-regulation of transcripts encoding regulators of fruiting body formation. CryA is a protein whose N- and C-terminal synthetic green fluorescent protein fusions localize to the nucleus. CryA represses sexual development under UVA (350-370 nm) light both on plates and in submerged culture. Strikingly, CryA exhibits photorepair activity as demonstrated by heterologous complementation of a DNA repair-deficient Escherichia coli strain as well as overexpression in an A. nidulans uvsBΔ genetic background. This is in contrast to the single deletion cryAΔ strain, which does not show increased sensitivity toward UV-induced damage. In A. nidulans, cryA encodes a novel type of cryptochrome/photolyase that exhibits a regulatory function during light-dependent development and DNA repair activity. This represents a paradigm for the evolutionary transition between photolyases and cryptochromes. PMID:18495868
Tang, Cun-Duo; Shi, Hong-Ling; Tang, Qing-Hai; Zhou, Jun-Shi; Yao, Lun-Guang; Jiao, Zhu-Jin; Kan, Yun-Chao
2016-11-01
Two novel glycosyl hydrolase family 5 (GH5) β-mannanases (AoMan5A and AoMan5B) were identified from Aspergillus oryzae RIB40 by genome mining. AoMan5A contains a predicted family 1 carbohydrate binding module (CBM-1), located at its N-terminus. AoMan5A, AoMan5B and the truncated mutant AoMan5AΔCL (lacking the N-terminal CBM and linker of AoMan5A) were expressed, retaining the N-terminus of the native protein, in Pichia pastoris GS115 by pPIC9K M. The specific enzyme activities of the purified reAoMan5A, reAoMan5B and reAoMan5AΔCL towards locust bean gum at pH 3.6 and 40°C for 10 min were 8.3, 104.2 and 15.8 U/mg, respectively. The temperature properties of reAoMan5AΔCL were improved by truncating the CBM. The enzymes can degrade pretreated konjac flour and produce prebiotics. In addition, they had excellent stability under simulated gastric fluid and a simulated prilling process. All these properties make these recombinant β-mannanases potential additives for use in the food and feed industries.
Characterization of AFLAV, a Tf1/Sushi retrotransposon from Aspergillus flavus.
Hua, Sui-Sheng T; Tarun, Alice S; Pandey, Sonal N; Chang, Leo; Chang, Perng-Kuang
2007-02-01
The plasmid, pAF28, a genomic clone from Aspergillus flavus NRRL 6541, has been used as a hybridization probe to fingerprint A. flavus strains isolated in corn and peanut fields. The insert of pAF28 contains a 4.5 kb region which encodes a truncated retrotransposon (AfRTL-1). In search for a full-length and intact copy of retrotransposon, we exploited a novel PCR cloning strategy by amplifying a 3.4 kb region from the genomic DNA of A. flavus NRRL 6541. The fragment was cloned into pCR 4-TOPO. Sequence analysis confirmed that this region encoded putative domains of partial reverse transcriptase, RNase H, and integrase of the predicted retrotransposon. The two flanking long terminal repeats (LTRs) and the sequence between them comprise a putative full-length LTR retrotransposon of 7799 bp in length. This intact retrotransposon sequence is named AFLAV (A. flavus Retrotransposon). The order of the predicted catalytic domains in the polyprotein (Pol) placed AFLAV in the Tf1/sushi subgroup of the Ty3/gypsy retrotransposon family. Primers derived from AFLAV sequence were used to screen this retrotransposon in other strains of A. flavus. More than fifty strains of A. flavus isolated from different geological origins were surveyed and the results show that many strains have extensive deletions in the regions encoding the capsid (Gag) and Pol.
Pemberton, T J; Jakobsson, M; Conrad, D F; Coop, G; Wall, J D; Pritchard, J K; Patel, P I; Rosenberg, N A
2008-07-01
When performing association studies in populations that have not been the focus of large-scale investigations of haplotype variation, it is often helpful to rely on genomic databases in other populations for study design and analysis - such as in the selection of tag SNPs and in the imputation of missing genotypes. One way of improving the use of these databases is to rely on a mixture of database samples that is similar to the population of interest, rather than using the single most similar database sample. We demonstrate the effectiveness of the mixture approach in the application of African, European, and East Asian HapMap samples for tag SNP selection in populations from India, a genetically intermediate region underrepresented in genomic studies of haplotype variation.
Bhawna; Bonthala, V.S.; Gajula, MNV Prasad
2016-01-01
The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) are a key component of the genome and among the most significant targets for producing stress-tolerant crops; hence, functional genomic studies of these TFs are important. We have therefore constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. The database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The database aims to support functional genomics studies of common bean TFs and the identification of regulatory mechanisms underlying various stress responses, easing breeding strategies for variety production; it offers search interfaces (by gene ID and functional annotation) and browsing interfaces (by family and by chromosome). The database will also serve as a promising central repository for researchers and breeders working towards the improvement of legume crops. In addition, the database provides unrestricted public access, and users can freely download all of the data it contains. Database URL: http://www.multiomics.in/PvTFDB/ PMID:27465131
The MaizeGDB Genome Browser Tutorial: One example of database outreach to biologists via video
USDA-ARS's Scientific Manuscript database
Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At the Maize Genetics and Genomics Database (MaizeGDB), we have developed a number of video tutorials that aim to demonstrate how to use various tools as well as to explici...
Toward the automated generation of genome-scale metabolic networks in the SEED.
DeJongh, Matthew; Formsma, Kevin; Boillot, Paul; Gould, John; Rycenga, Matthew; Best, Aaron
2007-04-26
Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. 
Our method sets the stage for the automated generation of substantially complete metabolic networks for over 400 complete genome sequences currently in the SEED. With each genome that is processed using our tools, the database of common components grows to cover more of the diversity of metabolic pathways. This increases the likelihood that components of reaction networks for subsequently processed genomes can be retrieved from the database, rather than assembled and verified manually.
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency
Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio
2015-01-01
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. PMID:26558254
SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss
Di Génova, Alex; Aravena, Andrés; Zapata, Luis; González, Mauricio; Maass, Alejandro; Iturra, Patricia
2011-01-01
SalmonDB is a new multiorganism database containing EST sequences from Salmo salar and Oncorhynchus mykiss and the whole genome sequences of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from the GMOD project, the GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, ortholog prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences to any of the databases and an advanced query tool (BioMart) that allows easy browsing of EST databases with user-defined criteria. These tools make the SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near future, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/ PMID:22120661
WheatGenome.info: A Resource for Wheat Genomics Resource.
Lai, Kaitao
2016-01-01
An integrated database with a variety of Web-based systems named WheatGenome.info hosting wheat genome and genomic data has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications, which are implemented as a variety of Web-based systems. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
Kristensen, David M.; Wolf, Yuri I.; Koonin, Eugene V.
2017-01-01
The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of ‘index’ orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. PMID:28053163
Investigation of mutations in the HBB gene using the 1,000 genomes database.
Carlice-Dos-Reis, Tânia; Viana, Jaime; Moreira, Fabiano Cordeiro; Cardoso, Greice de Lemos; Guerreiro, João; Santos, Sidney; Ribeiro-Dos-Santos, Ândrea
2017-01-01
Mutations in the HBB gene are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Sickle cell anemia is one of the most common monogenic diseases worldwide. Due to its prevalence, diverse strategies have been developed for a better understanding of its molecular mechanisms. In silico analysis has been increasingly used to investigate the genotype-phenotype relationship of many diseases, and the sequences of healthy individuals deposited in the 1,000 Genomes database appear to be an excellent tool for such analysis. The objective of this study is to analyze the variations in the HBB gene in the 1,000 Genomes database, to describe the mutation frequencies in the different population groups, and to investigate the pattern of pathogenicity. The computational tool SNPEFF was used to align the data from 2,504 samples of the 1,000 Genomes database with the HG19 genome reference. The pathogenicity of each amino acid change was investigated using the databases CLINVAR, dbSNP and HbVar and five different predictors. Twenty different mutations were found in 209 healthy individuals. The African group had the highest number of individuals with mutations, and the European group had the lowest number. Thus, it is concluded that approximately 8.3% of phenotypically healthy individuals from the 1,000 Genomes database have some mutation in the HBB gene. The frequency of mutated genes was estimated at 0.042, so that the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002. In total, 193 subjects had a non-synonymous mutation, of whom 186 (7.4%) had a deleterious mutation. Considering that the 1,000 Genomes database is representative of the world's population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation.
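The frequencies quoted in this abstract can be reproduced with a short calculation. The sketch below is not the authors' code: it assumes that every carrier holds exactly one variant allele and that genotype frequencies follow Hardy-Weinberg proportions (random mating), and all names are hypothetical. Under those assumptions the figures of 8.3%, 7.4%, 0.042, about 0.002, and fourteen per 10,000 all follow from the sample counts.

```python
# Counts quoted in the abstract; all names here are illustrative only.
SAMPLES = 2504       # individuals analysed
CARRIERS = 209       # individuals with any HBB mutation
DELETERIOUS = 186    # individuals with a deleterious mutation


def allele_frequency(carriers, samples):
    """Variant allele frequency, ASSUMING each carrier has one variant allele
    out of the two HBB copies carried by a diploid individual."""
    return carriers / (2 * samples)


carrier_pct = 100 * CARRIERS / SAMPLES
deleterious_pct = 100 * DELETERIOUS / SAMPLES

q_all = allele_frequency(CARRIERS, SAMPLES)
q_del = allele_frequency(DELETERIOUS, SAMPLES)

# Hardy-Weinberg assumption: the chance of inheriting two variant alleles
# (homozygous or compound heterozygous) is the allele frequency squared.
two_variant_alleles = q_all ** 2
affected_per_10000 = 10_000 * q_del ** 2

print(f"carriers: {carrier_pct:.1f}%, deleterious: {deleterious_pct:.1f}%")
print(f"variant allele frequency: {q_all:.3f}")
print(f"two variant alleles: {two_variant_alleles:.3f}")
print(f"expected affected per 10,000: {affected_per_10000:.0f}")
```

The last figure uses only the deleterious carriers, which is the reading under which the calculation matches the abstract's estimate of fourteen per 10,000.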
Warburton, Marilyn L; Williams, William Paul; Hawkins, Leigh; Bridges, Susan; Gresham, Cathy; Harper, Jonathan; Ozkan, Seval; Mylroie, J Erik; Shan, Xueyan
2011-07-01
A public candidate gene testing pipeline for resistance to aflatoxin accumulation or Aspergillus flavus infection in maize is presented here. The pipeline consists of steps for identifying, testing, and verifying the association of selected maize gene sequences with resistance under field conditions. Resources include a database of genetic and protein sequences associated with the reduction in aflatoxin contamination from previous studies; eight diverse inbred maize lines for polymorphism identification within any maize gene sequence; four Quantitative Trait Loci (QTL) mapping populations and one association mapping panel, all phenotyped for aflatoxin accumulation resistance and associated phenotypes; and capacity for Insertion/Deletion (InDel) and SNP genotyping in the population(s) for mapping. To date, ten genes have been identified as possible candidate genes and put through the candidate gene testing pipeline, and results are presented here to demonstrate the utility of the pipeline.
Gutiérrez-Sánchez, Gerardo; Atwood, James; Kolli, V S Kumar; Roussos, Sévastianos; Augur, Christopher
2012-04-01
Caffeine is toxic to most microorganisms. However, some filamentous fungi, such as Aspergillus tamarii, are able to metabolize this alkaloid when fed caffeine as the sole nitrogen source. The aim of the present work was to identify intracellular A. tamarii proteins, regulated by caffeine, using fluorescence difference two-dimensional gel electrophoresis. Specific proteins from two culture media of A. tamarii grown either on ammonium sulfate or caffeine as the sole nitrogen source were analysed by mass spectrometry. Thirteen out of a total of 85 differentially expressed spots were identified after database search. Identified up-regulated proteins include phosphoglycerate kinase, malate dehydrogenase, dyp-type peroxidase family protein, heat shock protein, Cu, Zn superoxidase dismutase and xanthine dehydrogenase. Some of the proteins identified in this study are involved in the caffeine degradation pathway as well as in stress response, suggesting that stress proteins could be involved in caffeine metabolism in filamentous fungi.
Taj-Aldeen, Saad J.; Rammaert, Blandine; Gamaletsou, Maria; Sipsas, Nikolaos V.; Zeller, Valerie; Roilides, Emmanuel; Kontoyiannis, Dimitrios P.; Miller, Andy O.; Petraitis, Vidmantas; Walsh, Thomas J.; Lortholary, Olivier
2015-01-01
Osteoarticular mycoses due to non-Aspergillus moulds are uncommon and challenging infections. A systematic literature review of non-Aspergillus osteoarticular mycoses was performed using PUBMED and EMBASE databases from 1970 to 2013. Among 145 patients were 111 adults (median age 48.5 [16–92 y]) and 34 pediatric patients (median age 7.5 [3–15 y]); 114 (79.7%) were male and 88 (61.9%) were immunocompromised. Osteomyelitis was due to direct inoculation in 54.5%. Trauma and puncture wounds were more frequent in children (73.5% vs 43.5%; P = 0.001). Prior surgery was more frequent in adults (27.7% vs 5.9%; P = 0.025). Vertebral (23.2%) and craniofacial osteomyelitis (13.1%) with neurological deficits predominated in adults. Lower limb osteomyelitis (47.7%) and knee arthritis (67.8%) were predominantly seen in children. Hyalohyphomycosis represented 64.8% of documented infections with Scedosporium apiospermum (33.1%) and Lomentospora prolificans (15.8%) as the most common causes. Combined antifungal therapy and surgery was used in 69% of cases with overall response in 85.8%. Median duration of therapy was 115 days (range 5–730). When voriconazole was used as single agent for treatment of hyalohyphomycosis and phaeohyphomycosis, an overall response rate was achieved in 94.1% of cases. Non-Aspergillus osteoarticular mycoses occur most frequently in children after injury and in adults after surgery. Accurate early diagnosis and long-course therapy (median 6 mo) with a combined medical-surgical approach may result in favorable outcome. PMID:26683917
Fungal Infections of the Spine.
Ganesh, Devin; Gottlieb, Jonathan; Chan, Sherilynn; Martinez, Octavio; Eismont, Frank
2015-06-15
Review of the literature. To retrospectively examine the frequency of published fungal infections by species and the treatment algorithms used to eradicate the disease. Fungal infections of the spine present unique challenges to the modern multispecialty treatment team. Although rare in comparison with bacterial infections, fungal infections have been increasing in incidence over the past several decades. Evidence-based practice is limited to referencing smaller case series. MEDLINE, Scopus, and EMBASE searches were carried out by one of the authors as well as by the research desk at the University of Miami/Calder Memorial Library. We included peer-reviewed articles published between 1948 and September 2010; case reports, series, and reviews were all examined and compiled into a database. A total of 130 articles, representing 157 cases, were included in the review. Aspergillus (60 cases, 38.2% of the total) and Candida species (36 cases, 22.9% of the total) were the 2 most common organisms. Surgery was associated with a greater survival rate than medical management alone in patients with Aspergillus (26.9% mortality in surgical patients; 60% in medically treated patients) and Candida (0% vs. 28.6%). Overall mortality was 19.3%. The overall recurrence rate was 7.4%. Amphotericin use was associated with a higher mortality rate than azoles. Aspergillus is the most common published pathogen in fungal infections of the spine. Recent publications depicting the use of newer antifungal medications such as azoles report higher survival rates. Patients treated surgically in combination with antifungal therapy showed the highest survival rates in Aspergillus and Candida infections. Level of Evidence: 3.
Klee, Kathrin; Ernst, Rebecca; Spannagl, Manuel; Mayer, Klaus F X
2007-08-30
Apollo, a genome annotation viewer and editor, has become a widely used genome annotation and visualization tool for distributed genome annotation projects. When using Apollo for annotation, database updates are carried out by uploading intermediate annotation files into the respective database. This non-direct database upload is laborious and evokes problems of data synchronicity. To overcome these limitations we extended the Apollo data adapter with a generic, configurable web service client that is able to retrieve annotation data in a GAME-XML-formatted string and pass it on to Apollo's internal input routine. This Apollo web service adapter, Apollo2Go, simplifies the data exchange in distributed projects and aims to render the annotation process more comfortable. The Apollo2Go software is freely available from ftp://ftpmips.gsf.de/plants/apollo_webservice. PMID:17760972
Liolios, Konstantinos; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Kyrpides, Nikos C.
2008-01-01
The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the 'Minimum Information about a Genome Sequence' (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/ PMID:17981842
Zhao, Zuotao; Li, Lili; Wan, Zhe; Chen, Wei; Liu, Honggang; Li, Ruoyu
2011-01-01
Rapid detection and differentiation of Aspergillus and Mucorales species in fungal rhinosinusitis diagnosis are desirable, since the clinical management and prognosis associated with the two taxa are fundamentally different. We describe an assay based on a combination of broad-range PCR amplification and reverse line blot hybridization (PCR/RLB) to detect and differentiate the pathogens causing fungal rhinosinusitis, which include five Aspergillus species (A. fumigatus, A. flavus, A. niger, A. terreus, and A. nidulans) and seven Mucorales species (Mucor hiemalis, Mucor racemosus, Mucor circinelloides, Rhizopus arrhizus, Rhizopus microsporus, Rhizomucor pusillus, and Absidia corymbifera). The assay was validated with 98 well-characterized clinical isolates and 41 clinical tissue specimens. PCR/RLB showed high sensitivity and specificity, with 100% correct identifications of 98 clinical isolates and no cross-hybridization between the species-specific probes. Results for five control isolates, Candida albicans, Fusarium solani, Scedosporium apiospermum, Penicillium marneffei, and Exophiala verrucosa, were negative as judged by PCR/RLB. The analytical sensitivity of PCR/RLB was found to be 1.8 × 10⁻³ ng/μl by 10-fold serial dilution of Aspergillus genomic DNA. The assay identified 35 of 41 (85.4%) clinical specimens, exhibiting a higher sensitivity than fungal culture (22 of 41; 53.7%) and direct sequencing (18 of 41; 43.9%). PCR/RLB similarly showed high specificity, with correct identification of 16 of 18 specimens detected by internal transcribed spacer (ITS) sequencing and 16 of 22 detected by fungal culture, but it also has the additional advantage of being able to detect mixed infection in a single clinical specimen. The PCR/RLB assay thus provides a rapid and reliable option for laboratory diagnosis of fungal rhinosinusitis. PMID:21325541
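The detection rates quoted in this abstract follow directly from the specimen counts, as the sketch below shows. The names are hypothetical, and the two agreement percentages at the end are derived here from the reported counts (16 of 18 and 16 of 22) rather than stated by the authors.

```python
# Counts quoted in the abstract; all names here are illustrative only.
SPECIMENS = 41
DETECTED = {"PCR/RLB": 35, "fungal culture": 22, "direct sequencing": 18}


def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


detection_rate = {method: percent(n, SPECIMENS) for method, n in DETECTED.items()}

# Share of specimens positive by a comparison method that PCR/RLB
# also identified correctly (derived, not reported as percentages).
agreement_with_its = percent(16, 18)
agreement_with_culture = percent(16, 22)

for method, rate in detection_rate.items():
    print(f"{method}: {rate}% of {SPECIMENS} specimens")
print(f"agreement with ITS sequencing: {agreement_with_its}%")
print(f"agreement with culture: {agreement_with_culture}%")
```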
Pfannenstiel, Brandon T.; Zhao, Xixi; Wortman, Jennifer; Throckmorton, Kurt; Spraker, Joseph E.; Luo, Xingyu; Lindner, Daniel L.; Lim, Fang Yun; Knox, Benjamin P.; Haas, Brian; Fischer, Gregory J.; Choera, Tsokyi; Butchko, Robert A. E.; Bok, Jin-Woo; Affeldt, Katharyn J.
2017-01-01
The study of aflatoxin in Aspergillus spp. has garnered the attention of many researchers due to aflatoxin’s carcinogenic properties and frequency as a food and feed contaminant. Significant progress has been made by utilizing the model organism Aspergillus nidulans to characterize the regulation of sterigmatocystin (ST), the penultimate precursor of aflatoxin. A previous forward genetic screen identified 23 A. nidulans mutants involved in regulating ST production. Six mutants were characterized from this screen using classical mapping (five mutations in mcsA) and complementation with a cosmid library (one mutation in laeA). The remaining mutants were backcrossed and sequenced using Illumina and Ion Torrent sequencing platforms. All but one mutant contained one or more sequence variants in predicted open reading frames. Deletion of these genes resulted in identification of mutant alleles responsible for the loss of ST production in 12 of the 17 remaining mutants. Eight of these mutations were in genes already known to affect ST synthesis (laeA, mcsA, fluG, and stcA), while the remaining four mutations (in laeB, sntB, and hamI) were in previously uncharacterized genes not known to be involved in ST production. Deletion of laeB, sntB, and hamI in A. flavus results in loss of aflatoxin production, confirming that these regulators are conserved in the aflatoxigenic aspergilli. This report highlights the multifaceted regulatory mechanisms governing secondary metabolism in Aspergillus. Additionally, these data contribute to the increasing number of studies showing that forward genetic screens of fungi coupled with whole-genome resequencing are a robust and cost-effective technique. PMID:28874473
Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.
Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M
2011-01-01
Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
Osypov, Alexander A; Krutinin, Gleb G; Krutinina, Eugenia A; Kamzolova, Svetlana G
2012-04-01
Electrostatic properties of genome DNA are important to its interactions with different proteins, in particular, related to transcription. DEPPDB - DNA Electrostatic Potential (and other Physical) Properties Database - provides information on the electrostatic and other physical properties of genome DNA combined with its sequence and annotation of biological and structural properties of genomes and their elements. Genomes are organized on taxonomical basis, supporting comparative and evolutionary studies. Currently, DEPPDB contains all completely sequenced bacterial, viral, mitochondrial, and plastids genomes according to the NCBI RefSeq, and some model eukaryotic genomes. Data for promoters, regulation sites, binding proteins, etc., are incorporated from established DBs and literature. The database is complemented by analytical tools. User sequences calculations are available. Case studies discovered electrostatics complementing DNA bending in E.coli plasmid BNT2 promoter functioning, possibly affecting host-environment metabolic switch. Transcription factors binding sites gravitate to high potential regions, confirming the electrostatics universal importance in protein-DNA interactions beyond the classical promoter-RNA polymerase recognition and regulation. Other genome elements, such as terminators, also show electrostatic peculiarities. Most intriguing are gene starts, exhibiting taxonomic correlations. The necessity of the genome electrostatic properties studies is discussed.
Entamoeba histolytica: construction and applications of subgenomic databases.
Hofer, Margit; Duchêne, Michael
2005-07-01
Knowledge about the influence of environmental stress such as the action of chemotherapeutic agents on gene expression in Entamoeba histolytica is limited. We plan to use oligonucleotide microarray hybridization to approach these questions. As the basis for our array, sequence data from the genome project carried out by the Institute for Genomic Research (TIGR) and the Sanger Institute were used to annotate parts of the parasite genome. Three subgenomic databases containing enzymes, cytoskeleton genes, and stress genes were compiled with the help of the ExPASy proteomics website and the BLAST servers at the two genome project sites. The known sequences from reference species, mostly human and Escherichia coli, were searched against TIGR and Sanger E. histolytica sequence contigs and the homologs were copied into a Microsoft Access database. In a similar way, two additional databases of cytoskeletal genes and stress genes were generated. Metabolic pathways could be assembled from our enzyme database, but sometimes they were incomplete as is the case for the sterol biosynthesis pathway. The raw databases contained a significant number of duplicate entries which were merged to obtain curated non-redundant databases. This procedure revealed that some E. histolytica genes may have several putative functions. Representative examples such as the case of the delta-aminolevulinate synthase/serine palmitoyltransferase are discussed.
RICD: a rice indica cDNA database resource for rice functional genomics.
Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin
2008-11-26
The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
A Novel Extracellular Multicopper Oxidase from Phanerochaete chrysosporium with Ferroxidase Activity
Larrondo, Luis F.; Salas, Loreto; Melo, Francisco; Vicuña, Rafael; Cullen, Daniel
2003-01-01
Lignin degradation by the white rot basidiomycete Phanerochaete chrysosporium involves various extracellular oxidative enzymes, including lignin peroxidase, manganese peroxidase, and a peroxide-generating enzyme, glyoxal oxidase. Recent studies have suggested that laccases also may be produced by this fungus, but these conclusions have been controversial. We identified four sequences related to laccases and ferroxidases (Fet3) in a search of the publicly available P. chrysosporium database. One gene, designated mco1, has a typical eukaryotic secretion signal and is transcribed in defined media and in colonized wood. Structural analysis and multiple alignments identified residues common to laccase and Fet3 sequences. A recombinant MCO1 (rMCO1) protein expressed in Aspergillus nidulans had a molecular mass of 78 kDa, as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, and the copper I-type center was confirmed by the UV-visible spectrum. rMCO1 oxidized various compounds, including 2,2′-azino(bis-3-ethylbenzthiazoline-6-sulfonate) (ABTS) and aromatic amines, although phenolic compounds were poor substrates. The best substrate was Fe2+, with a Km close to 2 μM. Collectively, these results suggest that the P. chrysosporium genome does not encode a typical laccase but rather encodes a unique extracellular multicopper oxidase with strong ferroxidase activity. PMID:14532088
DOE Office of Scientific and Technical Information (OSTI.GOV)
Poliakov, Alexander; Couronne, Olivier
2002-11-04
Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates, which are then globally aligned using the AVID global alignment program. In the last step, conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A
2011-01-01
PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our on-going studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
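To make the SSR definition above concrete (tandem repeats of 1-6 bp motifs whose length varies between strains), the following is a minimal Python sketch. It is not the PSSRdb pipeline: the function name `find_ssrs`, the `min_repeats` threshold, and the two toy strain sequences are assumptions made purely for illustration.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return sorted (start, motif, repeat_count) tuples for tandem repeats.

    Hypothetical helper: motif sizes of 1-6 bp follow the SSR definition in
    the abstract; the min_repeats threshold is an illustrative assumption.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-bp motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "ATAT" is reported once, as repeats of "AT").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Two toy "strains" that differ only in the number of AT repeats.
strain_a = "GGATATATATCC"
strain_b = "GGATATATATATATCC"
print(find_ssrs(strain_a))
print(find_ssrs(strain_b))
```

An SSR found at the same locus with different repeat counts in two strains, as in the toy pair here, is the kind of length variation the abstract calls polymorphic.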
GANESH: software for customized annotation of genome regions.
Huntley, Derek; Hummerich, Holger; Smedley, Damian; Kittivoravitkul, Sasivimol; McCarthy, Mark; Little, Peter; Sergot, Marek
2003-09-01
GANESH is a software package designed to support the genetic analysis of regions of human and other genomes. It provides a set of components that may be assembled to construct a self-updating database of DNA sequence, mapping data, and annotations of possible genome features. Once one or more remote sources of data for the target region have been identified, all sequences for that region are downloaded, assimilated, and subjected to a (configurable) set of standard database-searching and genome-analysis packages. The results are stored in compressed form in a relational database, and are updated automatically on a regular schedule so that they are always immediately available in their most up-to-date versions. A Java front-end, executed as a stand alone application or web applet, provides a graphical interface for navigating the database and for viewing the annotations. There are facilities for importing and exporting data in the format of the Distributed Annotation System (DAS), enabling a GANESH database to be used as a component of a DAS configuration. The system has been used to construct databases for about a dozen regions of human chromosomes and for three regions of mouse chromosomes.
Ye, Chao; Xu, Nan; Dong, Chuan; Ye, Yuannong; Zou, Xuan; Chen, Xiulai; Guo, Fengbiao; Liu, Liming
2017-04-07
Genome-scale metabolic models (GSMMs) constitute a platform that combines genome sequences and detailed biochemical information to quantify microbial physiology at the system level. To improve the unity, integrity, correctness, and format of data in published GSMMs, a consensus IMGMD database was built in the LAMP (Linux + Apache + MySQL + PHP) system by integrating and standardizing 328 GSMMs constructed for 139 microorganisms. The IMGMD database can help microbial researchers download manually curated GSMMs, rapidly reconstruct standard GSMMs, design pathways, and identify metabolic targets for strategies on strain improvement. Moreover, the IMGMD database facilitates the integration of wet-lab and in silico data to gain an additional insight into microbial physiology. The IMGMD database is freely available, without any registration requirements, at http://imgmd.jiangnan.edu.cn/database.
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.
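The idea of loading similarity search results into a relational database and then summarizing homology across organisms can be sketched in a few lines. The schema below is hypothetical and is not the actual layout of seqdb_demo or search_demo; the table names (`protein`, `hit`), the accession strings, and the E-value cutoff are assumptions chosen for illustration, using Python's built-in `sqlite3` module rather than the database server used in the unit.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE protein (acc TEXT PRIMARY KEY, organism TEXT NOT NULL);
        CREATE TABLE hit (
            query_acc   TEXT NOT NULL REFERENCES protein(acc),
            subject_acc TEXT NOT NULL REFERENCES protein(acc),
            evalue      REAL NOT NULL
        );
    """)
    # Made-up accessions and E-values, for illustration only.
    con.executemany("INSERT INTO protein VALUES (?, ?)", [
        ("ecoli_1", "E. coli"), ("ecoli_2", "E. coli"),
        ("human_1", "H. sapiens"), ("yeast_1", "S. cerevisiae"),
    ])
    con.executemany("INSERT INTO hit VALUES (?, ?, ?)", [
        ("ecoli_1", "human_1", 1e-40),
        ("ecoli_1", "yeast_1", 1e-12),
        ("ecoli_2", "yeast_1", 0.5),  # too weak to count as a homolog
    ])
    return con

def homolog_counts(con, max_evalue=1e-6):
    """Count query proteins with a significant hit, per subject organism."""
    rows = con.execute("""
        SELECT p.organism, COUNT(DISTINCT h.query_acc)
        FROM hit AS h JOIN protein AS p ON p.acc = h.subject_acc
        WHERE h.evalue <= ?
        GROUP BY p.organism
        ORDER BY p.organism
    """, (max_evalue,)).fetchall()
    return dict(rows)

con = build_demo_db()
print(homolog_counts(con))
```

Keeping hits in a table means the significance cutoff becomes a query parameter, so the same stored results can be re-summarized at different thresholds without rerunning the search.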
Lenobel, R; Sebela, M; Frébort, I
2005-01-01
The amino acid sequence of methylamine oxidase (MeAO) from the fungus Aspergillus niger was analyzed using mass spectrometry (MS). First, MeAO was characterized by an accurate molar mass of 72.4 kDa of the monomer measured using MALDI-TOF-MS and by a pI value of 5.8 determined by isoelectric focusing. MALDI-TOF-MS revealed a clear peptide mass fingerprint after tryptic digestion, which did not provide any relevant hit when searched against a nonredundant protein database and was different from that of A. niger amine oxidase AO-I. Tandem mass spectrometry with electrospray ionization coupled to liquid chromatography allowed unambiguous reading of six peptide sequences (11-19 amino acids) and seven sequence tags (4-15 amino acids), which were used for MS BLAST homology searching. MeAO was found to be largely homologous to a hypothetical protein AN7641.2 (EMBL/GenBank protein-accession code EAA61827) from Aspergillus nidulans FGSC A4 with a theoretical molar mass of 76.46 kDa and pI 6.14, which belongs to the superfamily of copper amine oxidases. The protein AN7641.2 shows only limited homology to the amine oxidase AO-I (32% identity, 49% similarity).
Genomics Portals: integrative web-platform for mining genomics data.
Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario
2010-01-13
A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc.), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
Kim, Hyun Soo
2018-01-01
The aged population is increasing worldwide, and aging itself is inevitable. Accordingly, longevity and healthy aging have been spotlighted as ways to promote the social contribution of the aged population. Many studies in the past few decades have reported on the process of aging and longevity, emphasizing the importance of maintaining genomic stability in exceptionally long-lived populations. The underlying reasons for longevity remain unclear because of its complexity, which involves multiple factors. With advances in sequencing technology and human genome-associated approaches, population-based genomic studies are increasing. In this review, we summarize recent longevity and healthy aging studies of human populations, focusing on DNA repair as a major factor in maintaining genome integrity. To keep pace with recent growth in genomic research, aging- and longevity-associated genomic databases are also briefly introduced. To suggest novel approaches for investigating longevity-associated genetic variants related to DNA repair using genomic databases, gene set analysis was conducted, focusing on DNA repair- and longevity-associated genes. Their biological networks were additionally analyzed to identify major factors containing genetic variants of human longevity and healthy aging in DNA repair mechanisms. In summary, this review emphasizes DNA repair activity in human longevity and suggests an approach for conducting DNA repair-associated genomic studies of healthy human aging.
PGSB PlantsDB: updates to the database framework for comparative plant genome research.
Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X
2016-01-04
PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Bhawna; Bonthala, V S; Gajula, Mnv Prasad
2016-01-01
The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) represent a key component of the genome and are the most significant targets for producing stress-tolerant crops; hence, functional genomic studies of these TFs are important. Therefore, we have constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. This database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The objectives of this database are to support functional genomics studies of common bean TFs and to help identify the regulatory mechanisms underlying various stress responses, easing breeding strategies for variety production. It offers search interfaces (by gene ID and by functional annotation) and browsing interfaces (by family and by chromosome). This database will also serve as a promising central repository for researchers as well as breeders who are working towards crop improvement of legume crops. In addition, this database provides unrestricted public access, and users can freely download all data present in the database. Database URL: http://www.multiomics.in/PvTFDB/. © The Author(s) 2016. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Velazquez, Enrique Israel
Improvements in medical and genomic technologies have dramatically increased the production of electronic data over the last decade. As a result, data management is rapidly becoming a major determinant, and urgent challenge, for the development of Precision Medicine. Although successful data management is achievable using Relational Database Management Systems (RDBMS), exponential data growth is a significant contributor to failure scenarios. Growing amounts of data can also be observed in other sectors, such as economics and business, which, together with the previous facts, suggests that alternate database approaches (NoSQL) may soon be required for efficient storage and management of big databases. However, this hypothesis has been difficult to test in the Precision Medicine field since alternate database architectures are complex to assess and means to integrate heterogeneous electronic health records (EHR) with dynamic genomic data are not easily available. In this dissertation, we present a novel set of experiments for identifying NoSQL database approaches that enable effective data storage and management in Precision Medicine using patients' clinical and genomic information from The Cancer Genome Atlas (TCGA). The first experiment draws on performance and scalability from biologically meaningful queries with differing complexity and database sizes. The second experiment measures performance and scalability in database updates without schema changes. The third experiment assesses performance and scalability in database updates with schema modifications due to dynamic data. We have identified two NoSQL approaches, based on Cassandra and Redis, which seem to be the ideal database management systems for our precision medicine queries in terms of performance and scalability. We present NoSQL approaches and show how they can be used to manage clinical and genomic big data.
Our research is relevant to the public health since we are focusing on one of the main challenges to the development of Precision Medicine and, consequently, investigating a potential solution to the progressively increasing demands on health care.
Terabayashi, Yasunobu; Sano, Motoaki; Yamane, Noriko; Marui, Junichiro; Tamano, Koichi; Sagara, Junichi; Dohmoto, Mitsuko; Oda, Ken; Ohshima, Eiji; Tachibana, Kuniharu; Higa, Yoshitaka; Ohashi, Shinichi; Koike, Hideaki; Machida, Masayuki
2010-12-01
Kojic acid is produced in large amounts by Aspergillus oryzae as a secondary metabolite and is widely used in the cosmetic industry. Glucose can be converted to kojic acid, perhaps by only a few steps, but no genes for the conversion have thus far been revealed. Using a DNA microarray, gene expression profiles under three pairs of conditions significantly affecting kojic acid production were compared. All genes were ranked using an index parameter reflecting both high amounts of transcription and a high induction ratio under producing conditions. After disruption of nine candidate genes selected from the top of the list, two genes of unknown function were found to be responsible for kojic acid biosynthesis, one having an oxidoreductase motif and the other a transporter motif. These two genes are closely associated in the genome, showing typical characteristics of genes involved in secondary metabolism. Copyright © 2010 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guo, Chun-Jun; Yeh, Hsu-Hua; Chiang, Yi Ming
2013-04-15
Epipolythiodioxopiperazines (ETPs) are a class of fungal secondary metabolites derived from cyclic peptides. Acetylaranotin belongs to one structural subgroup of ETPs characterized by the presence of a seven-membered dihydrooxepine ring. Defining the genes involved in acetylaranotin biosynthesis should provide a means to increase production of these compounds and facilitate the engineering of second-generation molecules. The filamentous fungus Aspergillus terreus produces acetylaranotin and related natural products. Using targeted gene deletions, we have identified a cluster of 9 genes including one nonribosomal peptide synthase gene, ataP, that is required for acetylaranotin biosynthesis. Chemical analysis of the wild type and mutant strains enabled us to isolate seventeen natural products that are either intermediates in the normal biosynthetic pathway or shunt products that are produced when the pathway is interrupted through mutation. Nine of the compounds identified in this study are novel natural products. Our data allow us to propose a complete biosynthetic pathway for acetylaranotin and related natural products.
Aspergillus fumigatus and Aspergillosis
Latgé, Jean-Paul
1999-01-01
Aspergillus fumigatus is one of the most ubiquitous of the airborne saprophytic fungi. Humans and animals constantly inhale numerous conidia of this fungus. The conidia are normally eliminated in the immunocompetent host by innate immune mechanisms, and aspergilloma and allergic bronchopulmonary aspergillosis, uncommon clinical syndromes, are the only infections observed in such hosts. Thus, A. fumigatus was considered for years to be a weak pathogen. With increases in the number of immunosuppressed patients, however, there has been a dramatic increase in severe and usually fatal invasive aspergillosis, now the most common mold infection worldwide. In this review, the focus is on the biology of A. fumigatus and the diseases it causes. Included are discussions of (i) genomic and molecular characterization of the organism, (ii) clinical and laboratory methods available for the diagnosis of aspergillosis in immunocompetent and immunocompromised hosts, (iii) identification of host and fungal factors that play a role in the establishment of the fungus in vivo, and (iv) problems associated with antifungal therapy. PMID:10194462
Cell biology of the Koji mold Aspergillus oryzae.
Kitamoto, Katsuhiko
2015-01-01
Koji mold, Aspergillus oryzae, has been used for the production of sake, miso, and soy sauce for more than one thousand years in Japan. Due to its importance, A. oryzae has been designated as the national micro-organism of Japan (Koku-kin). A. oryzae has been intensively studied in the past century, with most investigations focusing on breeding techniques and developing methods for Koji making for sake brewing. However, the understanding of the fundamental biology of A. oryzae remains relatively limited compared with the yeast Saccharomyces cerevisiae. Therefore, we have focused on studying the cell biology, including live cell imaging of organelles, protein vesicular trafficking, autophagy, and Woronin body functions, using the available genomic information. In this review, I describe essential findings on the cell biology of A. oryzae obtained in our studies over a quarter of a century. Understanding of the basic biology will be critical not only for its biotechnological application, but also for an understanding of the fundamental biology of other filamentous fungi.
Kuo, Wen-Hua
2011-10-01
This paper compares the development of genomics as a form of state project in Japan and Taiwan. Broadening the concepts of genomic sovereignty and bionationalism, I argue that the establishment and use of genomic databases vary according to techno-political context. While both Japan and Taiwan hold population-based databases to be necessary for scientific advance and competitiveness, they differ in how they have attempted to transform the information produced by databases into regulatory schemes for drug approval. The effectiveness of Taiwan's biobank is severely limited by the IRB reviewing process. By contrast, while updating its regulations for drug approval, Japan is using pharmacogenomics to deal with matters relating to ethnic identity. By analysing genomic initiatives in the political context that nurtures them, this paper seeks to capture how global science and local societies interact and offers insight into the assessment of state-sponsored science in East Asia as it becomes transnational. Copyright © 2011 Elsevier Ltd. All rights reserved.
Significance of genome-wide association studies in molecular anthropology.
Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal
2009-12-01
The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving the genetic maze of complex traits, especially disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (like the Human Genome Project and, even more importantly, the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we have proposed a genome-wide association approach in molecular anthropological studies by providing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We have also highlighted the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grigoriev, Igor V.; Baker, Scott E.; Andersen, Mikael R.
2011-04-28
The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes relevant to glucoamylase A production, such as tRNA-synthases and protein transporters.
Our results and datasets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. [Supplemental materials (10 figures, three text documents and 16 tables) have been made available. The whole genome sequence for A. niger ATCC 1015 is available from NCBI under acc. no ACJE00000000. The updated sequence for A. niger CBS 513.88 is available from EMBL under acc. no AM269948-AM270415. The sequence data from the phylogeny study has been submitted to NCBI (GU296686-296739). Microarray data from this study is submitted to GEO as series GSE10983. Accession for reviewers is possible through: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi token GSE10983] The dsmM_ANIGERa_coll511030F library and platform information is deposited at GEO under number GPL6758.
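The SNPs/kb figures quoted in this abstract (an average and a maximum) are densities. The sketch below shows one way such numbers could be derived from a list of SNP positions; the fixed 1 kb window scheme, the function name, and the example coordinates are assumptions for illustration, not the procedure used in the study.

```python
def snps_per_kb(snp_positions, sequence_length, window=1000):
    """Return the SNP density (SNPs per kilobase) of each consecutive
    window along a sequence. Positions are 0-based coordinates."""
    n_windows = -(-sequence_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in snp_positions:
        counts[pos // window] += 1
    densities = []
    for i, count in enumerate(counts):
        start = i * window
        length = min(window, sequence_length - start)  # last window may be short
        densities.append(count * 1000.0 / length)
    return densities

# Made-up SNP coordinates on a 2.5 kb stretch of aligned sequence.
positions = [10, 250, 999, 1000, 1500, 2100]
densities = snps_per_kb(positions, 2500)
average = len(positions) * 1000.0 / 2500  # genome-wide style average
maximum = max(densities)                  # densest window
```

The average is computed over the whole sequence rather than as a mean of window densities, so a short final window does not distort it.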
Childs, Kevin L; Konganti, Kranti; Buell, C Robin
2012-01-01
Major feedstock sources for future biofuel production are likely to be high biomass producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genomic-based investigations in these species, we developed the Biofuel Feedstock Genomic Resource (BFGR), a database and web-portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome level. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. The integrated nature of the BFGR in terms of annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences allows comparative analyses for biofuel feedstock species with limited sequence resources. Database URL: http://bfgr.plantbiology.msu.edu.
AoS28D, a proline-Xaa carboxypeptidase secreted by Aspergillus oryzae.
Salamin, Karine; Eugster, Philippe J; Jousson, Olivier; Waridel, Patrice; Grouzmann, Eric; Monod, Michel
2017-05-01
Prolyl peptidases of the MEROPS S28 family are of particular interest because they are key enzymes in the digestion of proline-rich peptides. A BLAST analysis of the Aspergillus oryzae genome revealed sequences coding for four proteases of the S28 family. Three of these proteases, AoS28A, AoS28B, and AoS28C, were previously characterized as acidic prolyl endopeptidases. The fourth protease, AoS28D, showed high sequence divergence from other S28 proteases and belongs to a phylogenetically distinct cluster together with orthologous proteases from other Aspergillus species. The objective of the present paper was to characterize AoS28D protease in terms of substrate specificity and activity. AoS28D produced by gene overexpression in A. oryzae and in Pichia pastoris was a 70-kDa glycoprotein with a 10-kDa sugar moiety. In contrast with other S28 proteases, AoS28D did not hydrolyze internal Pro-Xaa bonds of several tested peptides. Similarly to human lysosomal Pro-Xaa carboxypeptidase, AoS28D demonstrated selectivity for cleaving C-terminal Pro-Xaa bonds, which are resistant to carboxypeptidases of the S10 family concomitantly secreted by A. oryzae. Therefore, AoS28D could act in synergy with these enzymes during sequential degradation of a peptide from its C-terminus.
Koseki, Takuya; Miwa, Yozo; Akao, Takeshi; Akita, Osamu; Hashizume, Katsumi
2006-02-10
We screened 20,000 clones of an expressed sequence tag (EST) library from Aspergillus oryzae (http://www.nrib.go.jp/ken/EST/db/index.html) and obtained one cDNA clone encoding a protein with similarity to fungal acetyl xylan esterase. We also cloned the corresponding gene, designated as Aoaxe, from the genomic DNA. The deduced amino acid sequence consisted of a putative signal peptide of 31 amino acids and a mature protein of 276 amino acids. We engineered Aoaxe for heterologous expression in P. pastoris. Recombinant AoAXE (rAoAXE) was secreted with the aid of a fused alpha-factor secretion signal peptide and accumulated as an active enzyme in the culture medium to a final level of 190 mg/l after 5 days. Purified rAoAXE before and after treatment with endoglycosidase H migrated by SDS-PAGE with a molecular mass of 31 and 30 kDa, respectively. Purified rAoAXE displayed the greatest hydrolytic activity toward alpha-naphthylacetate (C2), lower activity toward alpha-naphthylpropionate (C3) and no detectable activity toward acyl-chain substrates containing four or more carbon atoms. The recombinant enzyme catalyzed the release of acetic acid from birchwood xylan. No activity was detectable using methyl esters of ferulic, caffeic or sinapic acids. rAoAXE was thermolabile in comparison to other AXEs from Aspergillus.
Tamano, Koichi; Bruno, Kenneth S; Koike, Hideaki; Ishii, Tomoko; Miura, Ai; Umemura, Myco; Culley, David E; Baker, Scott E; Machida, Masayuki
2015-04-01
Fatty acids are attractive molecules as source materials for the production of biodiesel fuel. Previously, we attained a 2.4-fold increase in fatty acid production by increasing the expression of fatty acid synthesis-related genes in Aspergillus oryzae. In this study, we achieved an additional increase in the production of fatty acids by disrupting a predicted acyl-CoA synthetase gene in A. oryzae. The A. oryzae genome is predicted to encode six acyl-CoA synthetase genes and disruption of AO090011000642, one of the six genes, resulted in a 9.2-fold higher accumulation (corresponding to an increased production of 0.23 mmol/g dry cell weight) of intracellular fatty acid in comparison to the wild-type strain. Furthermore, by introducing a niaD marker from Aspergillus nidulans to the disruptant, as well as changing the concentration of nitrogen in the culture medium from 10 to 350 mM, fatty acid productivity reached 0.54 mmol/g dry cell weight. Analysis of the relative composition of the major intracellular free fatty acids caused by disruption of AO090011000642 in comparison to the wild-type strain showed an increase in stearic acid (7 to 26 %), decrease in linoleic acid (50 to 27 %), and no significant changes in palmitic or oleic acid (each around 20-25 %).
Shankar, Jata; Cerqueira, Gustavo C; Wortman, Jennifer R; Clemons, Karl V; Stevens, David A
2018-03-02
With the increasing numbers of immunocompromised hosts, Aspergillus fumigatus emerges as a lethal opportunistic fungal pathogen. Understanding innate and acquired immunity responses of the host is important for a better therapeutic strategy to deal with aspergillosis patients. To determine the transcriptome in the kidneys in aspergillosis, we employed RNA-Seq to obtain single 76-base reads of whole-genome transcripts of murine kidneys on a temporal basis (days 0; uninfected, 1, 2, 3 and 8) during invasive aspergillosis. A total of 6284 transcripts were downregulated, and 5602 were upregulated compared to baseline expression. Gene ontology enrichment analysis identified genes involved in innate and adaptive immune response, as well as iron binding and homeostasis, among others. Our results showed activation of pathogen recognition receptors, e.g., β-defensins, C-type lectins (e.g., dectin-1), Toll-like receptors (TLR-2, TLR-3, TLR-8, TLR-9 and TLR-13), as well as Ptx-3 and C-reactive protein among the soluble receptors. Upregulated transcripts encoding various differentiating cytokines and effector proinflammatory cytokines, as well as those encoding for chemokines and chemokine receptors, revealed Th-1 and Th-17-type immune responses. These studies form a basic dataset for experimental prioritization, including other target organs, to determine the global response of the host against Aspergillus infection.
Human Ageing Genomic Resources: new and updated databases
Tacutu, Robi; Thornton, Daniel; Johnson, Emily; Budovsky, Arie; Barardo, Diogo; Craig, Thomas; Diana, Eugene; Lehmann, Gilad; Toren, Dmitri; Wang, Jingwei; Fraifeld, Vadim E
2018-01-01
In spite of a growing body of research and data, human ageing remains a poorly understood process. Over 10 years ago we developed the Human Ageing Genomic Resources (HAGR), a collection of databases and tools for studying the biology and genetics of ageing. Here, we present HAGR’s main functionalities, highlighting new additions and improvements. HAGR consists of six core databases: (i) the GenAge database of ageing-related genes, in turn composed of a dataset of >300 human ageing-related genes and a dataset with >2000 genes associated with ageing or longevity in model organisms; (ii) the AnAge database of animal ageing and longevity, featuring >4000 species; (iii) the GenDR database with >200 genes associated with the life-extending effects of dietary restriction; (iv) the LongevityMap database of human genetic association studies of longevity with >500 entries; (v) the DrugAge database with >400 ageing or longevity-associated drugs or compounds; (vi) the CellAge database with >200 genes associated with cell senescence. All our databases are manually curated by experts and regularly updated to ensure high-quality data. Cross-links across our databases and to external resources help researchers locate and integrate relevant information. HAGR is freely available online (http://genomics.senescence.info/). PMID:29121237
MIPSPlantsDB—plant database resource for integrative and comparative plant genome research
Spannagl, Manuel; Noubibou, Octave; Haase, Dirk; Yang, Li; Gundlach, Heidrun; Hindemitt, Tobias; Klee, Kathrin; Haberer, Georg; Schoof, Heiko; Mayer, Klaus F. X.
2007-01-01
Genome-oriented plant research delivers a rapidly increasing amount of plant genome data. Comprehensive and structured information resources are required to structure and communicate genome and associated analytical data for model organisms as well as for crops. The increase in available plant genomic data enables powerful comparative analysis and integrative approaches. PlantsDB aims to provide data and information resources for individual plant species and in addition to build a platform for integrative and comparative plant genome research. PlantsDB is constituted from genome databases for Arabidopsis, Medicago, Lotus, rice, maize and tomato. Complementary data resources for cis elements, repetitive elements and extensive cross-species comparisons are implemented. The PlantsDB portal can be reached at . PMID:17202173
The COG database: new developments in phylogenetic classification of proteins from complete genomes
Tatusov, Roman L.; Natale, Darren A.; Garkavtsev, Igor V.; Tatusova, Tatiana A.; Shankavaram, Uma T.; Rao, Bachoti S.; Kiryutin, Boris; Galperin, Michael Y.; Fedorova, Natalie D.; Koonin, Eugene V.
2001-01-01
The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt at a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and shared with bacteria and/or archaea were included. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis. PMID:11125040
Genomic Target Database (GTD): A database of potential targets in human pathogenic bacteria
Barh, Debmalya; Kumar, Anil; Misra, Amarendra Narayana
2009-01-01
A Genomic Target Database (GTD) has been developed containing putative genomic drug targets for human bacterial pathogens. The selected pathogens are either drug resistant or ones against which vaccines are yet to be developed. The drug targets have been identified using subtractive genomics approaches and are subsequently classified into drug targets in pathogen-specific unique metabolic pathways, drug targets in host-pathogen common metabolic pathways, and membrane-localized drug targets. HTML code is used to link each target to its various properties and other available public resources. Essential resources and tools for subtractive genomic analysis, sub-cellular localization, and vaccine and drug design are also mentioned. To the best of the authors' knowledge, no such database (DB) is presently available that lists metabolic pathway and membrane-specific genomic drug targets based on subtractive genomics. The targets listed in GTD are a readily available resource for developing drugs and vaccines against the respective pathogen, its subtypes, and other family members. Currently GTD contains 58 drug targets for four pathogens. Shortly, drug targets for six more pathogens will be listed. Availability: GTD is available at the IIOAB website http://www.iioab.webs.com/GTD.htm. It can also be accessed at http://www.iioabdgd.webs.com. GTD is free for academic research and non-commercial use only. Commercial use is strictly prohibited without prior permission from IIOAB. PMID:20011153
The YeastGenome app: the Saccharomyces Genome Database at your fingertips.
Wong, Edith D; Karra, Kalpana; Hitz, Benjamin C; Hong, Eurie L; Cherry, J Michael
2013-01-01
The Saccharomyces Genome Database (SGD) is a scientific database that provides researchers with high-quality curated data about the genes and gene products of Saccharomyces cerevisiae. To provide instant and easy access to this information on mobile devices, we have developed YeastGenome, a native application for the Apple iPhone and iPad. YeastGenome can be used to quickly find basic information about S. cerevisiae genes and chromosomal features regardless of internet connectivity. With or without network access, you can view basic information and Gene Ontology annotations about a gene of interest by searching gene names and gene descriptions or by browsing the database within the app to find the gene of interest. With internet access, the app provides more detailed information about the gene, including mutant phenotypes, references and protein and genetic interactions, as well as provides hyperlinks to retrieve detailed information by showing SGD pages and views of the genome browser. SGD provides online help describing basic ways to navigate the mobile version of SGD, highlights key features and answers frequently asked questions related to the app. The app is available from iTunes (http://itunes.com/apps/yeastgenome). The YeastGenome app is provided freely as a service to our community, as part of SGD's mission to provide free and open access to all its data and annotations.
Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA.
Osypov, Alexander A; Krutinin, Gleb G; Kamzolova, Svetlana G
2010-06-01
The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.
MaizeGDB: The Maize Genetics and Genomics Database.
USDA-ARS's Scientific Manuscript database
MaizeGDB is the community database for biological information about the crop plant Zea mays. Genomic, genetic, sequence, gene product, functional characterization, literature reference, and person/organization contact information are among the datatypes stored at MaizeGDB. At the project’s website...
Virus Database and Online Inquiry System Based on Natural Vectors.
Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St
2017-01-01
We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from the National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and back-end processes for automatic and manual updating of database content to synchronize with GenBank. Genome data submitted in FASTA format will be processed, and the prediction results, with the 5 closest neighbors and their classifications, will be returned by email. Considering the one-to-one correspondence between sequence and natural vector, time efficiency, and high accuracy, the natural vector is a significant advance compared with alignment methods, which makes VirusDB a useful database for further research.
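The natural vector idea behind this record can be sketched briefly. The abstract does not define the vector, so the 12-dimensional form used here (per-nucleotide counts, mean positions, and normalized second central moments, compared by Euclidean distance) is an assumption based on the commonly described natural vector method, and all function names and sequences are hypothetical rather than part of VirusDB.

```python
import math

def natural_vector(seq):
    """12-dimensional vector: for each of A, C, G, T the count n_k,
    the mean position mu_k (1-based), and the normalized second
    central moment sum((p - mu_k)**2) / (n_k * n)."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        pos = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(pos)
        if n_k == 0:  # base absent: use zeros for all three entries
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(pos) / n_k
        counts.append(float(n_k))
        means.append(mu)
        moments.append(sum((p - mu) ** 2 for p in pos) / (n_k * n))
    return counts + means + moments

def distance(u, v):
    """Euclidean distance between two natural vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def closest(query, references, k=5):
    """Names of the k reference sequences nearest to the query."""
    q = natural_vector(query)
    ranked = sorted(references,
                    key=lambda name: distance(q, natural_vector(references[name])))
    return ranked[:k]

# Toy reference set (made-up sequences, not real viral genomes).
refs = {"ref1": "AACC", "ref2": "ACGT", "ref3": "TTTTGGGG"}
neighbours = closest("AACC", refs, k=2)
```

Because no alignment is computed, each comparison costs time linear in sequence length, which is consistent with the time-efficiency argument made in the abstract.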
Strategies to improve reference databases for soil microbiomes
Choi, Jinlyung; Yang, Fan; Stepanauskas, Ramunas; ...
2016-12-09
A database of curated genomes is needed to better assess soil microbial communities and their processes associated with differing land management and environmental impacts. Interpreting soil metagenomic datasets with existing sequence databases is challenging because these datasets are biased towards medical and biotechnology research and can result in misleading annotations. We have curated a database of 928 genomes of soil-associated organisms (888 bacteria, 34 archaea, and 6 fungi). Using this database as a representation of the current state of knowledge of soil microbes that are well-characterized, we evaluated its composition and compared it to broader microbial databases, specifically NCBI’s RefSeq, as well as 3,035 publicly available soil amplicon datasets. These comparisons identified phyla and functions that are enriched in soils as well as those that may be underrepresented in RefSoil. For example, RefSoil was observed to have increased representation of Firmicutes despite its low abundance in soil environments and also lacked representation of Acidobacteria and Verrucomicrobia, which are abundant in soils. Our comparison of RefSoil to soil amplicon datasets allowed us to identify targets that if cultured or sequenced would significantly increase the biodiversity represented within RefSoil. To demonstrate the opportunities to access these underrepresented targets, we employed single cell genomics in a pilot experiment to recover 14 genomes from the "most wanted" list, which improved RefSoil's representation of EMP sequences by 7% by abundance. This effort demonstrates the value of RefSoil in the guidance of future research efforts and the capability of single cell genomics as a practical means to fill the existing genomic data gaps.
Phylogenomics databases for facilitating functional genomics in rice.
Jung, Ki-Hong; Cao, Peijian; Sharma, Rita; Jain, Rashmi; Ronald, Pamela C
2015-12-01
The completion of the whole genome sequence of rice (Oryza sativa) has significantly accelerated functional genomics studies. Prior to the release of the sequence, only a few genes were assigned a function each year. Since sequencing was completed in 2005, the rate has exponentially increased. As of 2014, 1,021 genes have been described and added to the collection at The Overview of functionally characterized Genes in Rice online database (OGRO). Despite this progress, that number is still very low compared with the total number of genes estimated in the rice genome. One limitation to progress is the presence of functional redundancy among members of the same rice gene family, which covers 51.6 % of all non-transposable element-encoding genes. There remains a significant portion of rice genes that are not functionally redundant, as reflected in the recovery of loss-of-function mutants. To more accurately analyze functional redundancy in the rice genome, we have developed phylogenomics databases for six large gene families in rice, including those for glycosyltransferases, glycoside hydrolases, kinases, transcription factors, transporters, and cytochrome P450 monooxygenases. In this review, we introduce key features and applications of these databases. We expect that they will serve as a very useful guide in the post-genomics era of research.
The SUPERFAMILY database in 2004: additions and improvements.
Madera, Martin; Vogel, Christine; Kummerfeld, Sarah K; Chothia, Cyrus; Gough, Julian
2004-01-01
The SUPERFAMILY database provides structural assignments to protein sequences and a framework for analysis of the results. At the core of the database is a library of profile Hidden Markov Models that represent all proteins of known structure. The library is based on the SCOP classification of proteins: each model corresponds to a SCOP domain and aims to represent an entire superfamily. We have applied the library to predicted proteins from all completely sequenced genomes (currently 154), the Swiss-Prot and TrEMBL databases and other sequence collections. Close to 60% of all proteins have at least one match, and one half of all residues are covered by assignments. All models and full results are available for download and online browsing at http://supfam.org. Users can study the distribution of their superfamily of interest across all completely sequenced genomes, investigate with which other superfamilies it combines and retrieve proteins in which it occurs. Alternatively, concentrating on a particular genome as a whole, it is possible first, to find out its superfamily composition, and secondly, to compare it with that of other genomes to detect superfamilies that are over- or under-represented. In addition, the webserver provides the following standard services: sequence search; keyword search for genomes, superfamilies and sequence identifiers; and multiple alignment of genomic, PDB and custom sequences.
Tsang, Chi-Ching; Hui, Teresa W S; Lee, Kim-Chung; Chen, Jonathan H K; Ngan, Antonio H Y; Tam, Emily W T; Chan, Jasper F W; Wu, Andrea L; Cheung, Mei; Tse, Brian P H; Wu, Alan K L; Lai, Christopher K C; Tsang, Dominic N C; Que, Tak-Lun; Lam, Ching-Wan; Yuen, Kwok-Yung; Lau, Susanna K P; Woo, Patrick C Y
2016-02-01
Thirteen Aspergillus isolates recovered from nails of 13 patients (fingernails, n=2; toenails, n=11) with onychomycosis were characterized. Twelve strains were identified by multilocus sequencing as Aspergillus spp. (Aspergillus sydowii [n=4], Aspergillus welwitschiae [n=3], Aspergillus terreus [n=2], Aspergillus flavus [n=1], Aspergillus tubingensis [n=1], and Aspergillus unguis [n=1]). Isolates of A. terreus, A. flavus, and A. unguis were also identifiable by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. The 13th isolate (HKU49(T)) possessed unique morphological characteristics different from other Aspergillus spp. Molecular characterization also unambiguously showed that HKU49(T) was distinct from other Aspergillus spp. We propose the novel species Aspergillus hongkongensis to describe this previously unknown fungus. Antifungal susceptibility testing showed most Aspergillus isolates had low MICs against itraconazole and voriconazole, but all Aspergillus isolates had high MICs against fluconazole. A diverse spectrum of Aspergillus species is associated with onychomycosis. Itraconazole and voriconazole are probably better drug options for Aspergillus onychomycosis. Copyright © 2016 Elsevier Inc. All rights reserved.
The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)
Overbeek, Ross; Olson, Robert; Pusch, Gordon D.; Olsen, Gary J.; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Parrello, Bruce; Shukla, Maulik; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang; Stevens, Rick
2014-01-01
In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources. PMID:24293654
A knowledge base for tracking the impact of genomics on population health.
Yu, Wei; Gwinn, Marta; Dotson, W David; Green, Ridgely Fisk; Clyne, Mindy; Wulf, Anja; Bowen, Scott; Kolor, Katherine; Khoury, Muin J
2016-12-01
We created an online knowledge base (the Public Health Genomics Knowledge Base (PHGKB)) to provide systematically curated and updated information that bridges population-based research on genomics with clinical and public health applications. Weekly horizon scanning of a wide variety of online resources is used to retrieve relevant scientific publications, guidelines, and commentaries. After curation by domain experts, links are deposited into Web-based databases. PHGKB currently consists of nine component databases. Users can search the entire knowledge base or search one or more component databases directly and choose options for customizing the display of their search results. PHGKB offers researchers, policy makers, practitioners, and the general public a way to find information they need to understand the complicated landscape of genomics and population health. Genet Med 18(12), 1312-1314.
Outreach and online training services at the Saccharomyces Genome Database.
MacPherson, Kevin A; Starr, Barry; Wong, Edith D; Dalusag, Kyla S; Hellerstedt, Sage T; Lang, Olivia W; Nash, Robert S; Skrzypek, Marek S; Engel, Stacia R; Cherry, J Michael
2017-01-01
The Saccharomyces Genome Database (SGD; www.yeastgenome.org), the primary genetics and genomics resource for the budding yeast S. cerevisiae, provides free public access to expertly curated information about the yeast genome and its gene products. As the central hub for the yeast research community, SGD engages in a variety of social outreach efforts to inform our users about new developments, promote collaboration, increase public awareness of the importance of yeast to biomedical research, and facilitate scientific discovery. Here we describe these various outreach methods, from networking at scientific conferences to the use of online media such as blog posts and webinars, and include our perspectives on the benefits provided by outreach activities for model organism databases. http://www.yeastgenome.org. © The Author(s) 2017. Published by Oxford University Press.
The MAR databases: development and implementation of databases specific for marine metagenomics
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen
2018-01-01
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. PMID:29106641
USDA-ARS's Scientific Manuscript database
The Maize Database (MaizeDB) to the Maize Genetics and Genomics Database (MaizeGDB) turns 20 this year, and such a significant milestone must be celebrated! With the release of the B73 reference sequence and more sequenced genomes on the way, the maize community needs to address various opportunitie...
Blin, Kai; Medema, Marnix H; Kottmann, Renzo; Lee, Sang Yup; Weber, Tilmann
2017-01-04
Secondary metabolites produced by microorganisms are the main source of bioactive compounds that are in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far, antiSMASH is limited to de novo computing results for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource to browse antiSMASH-annotated BGCs in the currently 3907 bacterial genomes in the database and perform advanced search queries combining multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gallo, Antonia; Knox, Benjamin P.; Bruno, Kenneth S.
2014-06-02
Ochratoxin A (OTA) is a potent mycotoxin produced by Aspergillus and Penicillium species and is a common contaminant of a wide variety of food commodities, with Aspergillus carbonarius being the main producer of OTA contamination in grapes and wine. The molecular structure of OTA is composed of a dihydroisocoumarin ring linked to phenylalanine and, as shown in different producing fungal species, a polyketide synthase (PKS) is a component of the OTA biosynthetic pathway. Similar to observations in other filamentous ascomycetes, the genome sequence of A. carbonarius contains a large number of genes predicted to encode PKSs. In this work a pks gene identified within the putative OTA cluster of A. carbonarius, designated as AcOTApks, was inactivated and the resulting mutant strain was unable to produce OTA, confirming the role of AcOTApks in this biosynthetic pathway. AcOTApks protein is characteristic of the highly reduced (HR)-PKS family, and also contains a putative methyltransferase domain likely responsible for the addition of the methyl group to the OTA polyketide structure. AcOTApks is different from the ACpks protein that we previously described which showed an expression profile compatible with OTA production. We performed phylogenetic analyses of the β-ketosynthase and acyl-transferase domains of the OTA PKSs which had been identified and characterized in different OTA producing fungal species. The phylogenetic results were similar for both domains analyzed and showed that OTA PKS of A. carbonarius, Aspergillus niger, and Aspergillus ochraceus clustered in a monophyletic group with 100% bootstrap support suggesting a common origin, while the other OTA PKSs analyzed were phylogenetically distant. A qRT-PCR assay monitored AcOTApks expression during fungal growth and concomitant production of OTA by A. carbonarius in synthetic grape medium.
A clear correlation between the expression profile of AcOTApks and the kinetics of OTA production was observed: AcOTApks reached its maximum level of transcription before OTA accumulation in mycelium reached its highest level, confirming the fact that gene transcription always precedes phenotypic production.
The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now
Engel, Stacia R.; Dietrich, Fred S.; Fisk, Dianna G.; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C.; Dwight, Selina S.; Hitz, Benjamin C.; Karra, Kalpana; Nash, Robert S.; Weng, Shuai; Wong, Edith D.; Lloyd, Paul; Skrzypek, Marek S.; Miyasato, Stuart R.; Simison, Matt; Cherry, J. Michael
2014-01-01
The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639
Translational genomics for plant breeding with the genome sequence explosion.
Kang, Yang Jae; Lee, Taeyoung; Lee, Jayern; Shim, Sangrea; Jeong, Haneul; Satyawan, Dani; Kim, Moon Young; Lee, Suk-Ha
2016-04-01
The use of next-generation sequencers and advanced genotyping technologies has propelled the field of plant genomics in model crops and plants and enhanced the discovery of hidden bridges between genotypes and phenotypes. The newly generated reference sequences of unstudied minor plants can be annotated by the knowledge of model plants via translational genomics approaches. Here, we reviewed the strategies of translational genomics and suggested perspectives on the current databases of genomic resources and the database structures of translated information on the new genome. As a draft picture of phenotypic annotation, translational genomics on newly sequenced plants will provide valuable assistance for breeders and researchers who are interested in genetic studies. © 2015 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Design and implementation of a database for Brucella melitensis genome annotation.
De Hertogh, Benoît; Lahlimi, Leïla; Lambert, Christophe; Letesson, Jean-Jacques; Depiereux, Eric
2008-03-18
The genome sequences of three Brucella biovars and of some species close to Brucella sp. have become available, leading to new relationship analysis. Moreover, the automatic genome annotation of the pathogenic bacterium Brucella melitensis has been manually corrected by a consortium of experts, leading to 899 modifications of start site predictions among the 3198 open reading frames (ORFs) examined. This new annotation, coupled with the results of automatic annotation tools applied to the complete genome sequence of B. melitensis (including BLASTs to 9 genomes close to Brucella), provides numerous data sets related to predicted functions, biochemical properties and phylogenetic comparisons. To make these results available, alphaPAGe, a functional auto-updatable database of the corrected genome sequence of B. melitensis, has been built, using the entity-relationship (ER) approach and a multi-purpose database structure. A friendly graphical user interface has been designed, and users can retrieve different kinds of information through three levels of queries: (1) the basic search uses classical keywords or sequence identifiers; (2) the original advanced search engine allows numerous criteria to be combined (by using logical operators): (a) keywords (textual comparison) related to the pCDS's function, family domains and cellular localization; (b) physico-chemical characteristics (numerical comparison) such as isoelectric point or molecular weight and structural criteria such as the nucleic length or the number of transmembrane helices (TMH); (c) similarity scores with Escherichia coli and 10 species phylogenetically close to B. melitensis; (3) complex queries can be performed by using a SQL field, which allows all queries respecting the database's structure. The database is publicly available through a Web server at the following url: http://www.fundp.ac.be/urbm/bioinfo/aPAGe.
Butyaev, Alexander; Mavlyutov, Ruslan; Blanchette, Mathieu; Cudré-Mauroux, Philippe; Waldispühl, Jérôme
2015-09-18
Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, the storage technology and visualization tools need to evolve to offer to the scientific community fast and convenient access to these data. We introduce simultaneously a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions, and importantly gracefully scales with the size of data. We also illustrate the usefulness of our 3D genome Web browser to explore human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.
Wang, Dapeng; Xu, Jiayue; Yu, Jun
2015-09-16
The K-mer approach, treating genomic sequences as simple characters and counting the relative abundance of each string upon a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we developed a versatile database, KGCAK (http://kgcak.big.ac.cn/KGCAK/), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationships within and among groups of species in a tree of life based on genomic data.
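As a rough illustration of the K-mer idea described in this abstract, the sketch below counts the relative abundance of every string of a fixed length K in a sequence and compares two sequences without aligning them. It is not KGCAK's own method: the function names, the restriction to the ACGT alphabet, and the choice of Euclidean distance are all assumptions made for the example.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k):
    """Relative abundance of every k-mer over the ACGT alphabet (hypothetical helper)."""
    seq = seq.upper()
    alphabet = set("ACGT")
    # Slide a window of width k; skip windows containing ambiguous bases such as N.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    # Report all 4**k possible k-mers so that two profiles share the same keys.
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


def euclidean_distance(profile_a, profile_b):
    """Alignment-free distance between two k-mer profiles (an assumed metric)."""
    return math.sqrt(sum((profile_a[key] - profile_b[key]) ** 2 for key in profile_a))


if __name__ == "__main__":
    a = kmer_frequencies("ACGTACGTACGT", 2)
    b = kmer_frequencies("AAAAAAAAAAAA", 2)
    print(euclidean_distance(a, b))
```

A matrix of such pairwise distances could then be passed to any standard tree-building method; which distance and which K are appropriate depends on genome size and is left open here.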
Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome.
Wang, Yi; Liu, Xianju; Ren, Chong; Zhong, Gan-Yuan; Yang, Long; Li, Shaohua; Liang, Zhenchang
2016-04-21
CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of humans, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific target sites for CRISPR/Cas9 have been computationally identified for several annual model and crop species, but such sites have not been reported for perennial, woody fruit species. In this study, we identified and characterized five types of CRISPR/Cas9 target sites in the widely cultivated grape species Vitis vinifera and developed a user-friendly database for editing grape genomes in the future. A total of 35,767,960 potential CRISPR/Cas9 target sites were identified from grape genomes in this study. Among them, 22,597,817 target sites were mapped to specific genomic locations and 7,269,788 were found to be highly specific. Protospacers and PAMs were found to distribute uniformly and abundantly in the grape genomes. They were present in all the structural elements of genes with the coding region having the highest abundance. Five PAM types, TGG, AGG, GGG, CGG and NGG, were observed. With the exception of the NGG type, they were abundantly present in the grape genomes. Synteny analysis of similar genes revealed that the synteny of protospacers matched the synteny of homologous genes. A user-friendly database containing protospacers and detailed information of the sites was developed and is available for public use at the Grape-CRISPR website (http://biodb.sdau.edu.cn/gc/index.html). Grape genomes harbour millions of potential CRISPR/Cas9 target sites. These sites are widely distributed among and within chromosomes with predominant abundance in the coding regions of genes. We developed a publicly-accessible Grape-CRISPR database for facilitating the use of the CRISPR/Cas9 system as a genome editing tool for functional studies and molecular breeding of grapes.
Among other functions, the database allows users to identify and select multi-protospacers for editing similar sequences in grape genomes simultaneously.
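The kind of target-site scan this abstract describes can be sketched in a few lines. The example below is an assumption-laden illustration, not the Grape-CRISPR pipeline: it assumes a 20-nt protospacer immediately followed by a PAM of the form [ACGT]GG, scans both strands, and does not attempt the specificity filtering or genome mapping reported in the study. The function names are hypothetical.

```python
import re


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (non-ACGT characters are kept as-is)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_target_sites(seq, protospacer_len=20):
    """List candidate (strand, start, protospacer, PAM) tuples.

    Assumes a protospacer of `protospacer_len` nt directly followed by a
    3-nt PAM matching [ACGT]GG. `start` is 0-based on the scanned strand,
    so minus-strand positions refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    # A lookahead lets overlapping sites be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    example = "A" * 20 + "TGG"
    for site in find_target_sites(example):
        print(site)
```

Grouping the returned sites by their PAM string would give per-type counts comparable in spirit to the TGG, AGG, GGG and CGG classes mentioned in the abstract.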
The Saccharomyces Genome Database Variant Viewer
Sheppard, Travis K.; Hitz, Benjamin C.; Engel, Stacia R.; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C.; Dalusag, Kyla S.; Demeter, Janos; Hellerstedt, Sage T.; Karra, Kalpana; Nash, Robert S.; Paskov, Kelley M.; Skrzypek, Marek S.; Weng, Shuai; Wong, Edith D.; Cherry, J. Michael
2016-01-01
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. PMID:26578556
Novodvorska, Michaela; Stratford, Malcolm; Blythe, Martin J; Wilson, Raymond; Beniston, Richard G; Archer, David B
2016-09-01
The early stages of development of Aspergillus niger conidia during outgrowth were explored by combining genome-wide gene expression analysis (RNAseq), proteomics, Warburg manometry and uptake studies. Resting conidia suspended in water were demonstrated for the first time to be metabolically active as low levels of oxygen uptake and the generation of carbon dioxide were detected, suggesting that low-level respiratory metabolism occurs in conidia for maintenance. Upon triggering of spore germination, generation of CO2 increased dramatically. For a short period, which coincided with mobilisation of the intracellular polyol, trehalose, there was no increase in uptake of O2 indicating that trehalose was metabolised by fermentation. Data from genome-wide mRNA profiling showed the presence of transcripts associated with fermentative and respiratory metabolism in resting conidia. Following triggering of conidial outgrowth, there was a clear switch to respiration after 25 min, confirmed by cyanide inhibition. No effect of SHAM, salicylhydroxamic acid, on respiration suggests electron flow via cytochrome c oxidase. Glucose entry into spores was not detectable before 1 h after triggering germination. The impact of sorbic acid on germination was examined and we showed that it inhibits glucose uptake. O2 uptake was also inhibited, delaying the onset of respiration and extending the period of fermentation. In conclusion, we show that conidia suspended in water are not completely dormant and that conidial outgrowth involves fermentative metabolism that precedes respiration. Copyright © 2016. Published by Elsevier Inc.
Tanaka, Mizuki; Sakai, Yoshifumi; Yamada, Osamu; Shintani, Takahiro; Gomi, Katsuya
2011-01-01
To investigate 3′-end-processing signals in Aspergillus oryzae, we created a nucleotide sequence data set of the 3′-untranslated region (3′ UTR) plus 100 nucleotides (nt) sequence downstream of the poly(A) site using A. oryzae expressed sequence tags and genomic sequencing data. This data set comprised 1065 sequences derived from 1042 unique genes. The average 3′ UTR length in A. oryzae was 241 nt, which is greater than that in yeast but similar to that in plants. The 3′ UTR and 100 nt sequence downstream of the poly(A) site is notably U-rich, while the region located 15–30 nt upstream of the poly(A) site is markedly A-rich. The most frequently found hexanucleotide in this A-rich region is AAUGAA, although this sequence accounts for only 6% of all transcripts. These data suggested that A. oryzae has no highly conserved sequence element equivalent to AAUAAA, a mammalian polyadenylation signal. We identified that putative 3′-end-processing signals in A. oryzae, while less well conserved than those in mammals, comprised four sequence elements: the furthest upstream U-rich element, A-rich sequence, cleavage site, and downstream U-rich element flanking the cleavage site. Although these putative 3′-end-processing signals are similar to those in yeast and plants, some notable differences exist between them. PMID:21586533
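The hexanucleotide tally reported in this abstract (the share of transcripts carrying a given hexamer in the region 15–30 nt upstream of the poly(A) site) can be illustrated with a short sketch. It assumes each input string is a 3′ UTR whose last base sits immediately before the poly(A) site; the function name, the input layout, and the toy sequences are hypothetical and are not taken from the study's data set.

```python
from collections import Counter


def hexamer_transcript_counts(utr_sequences, window=(15, 30), k=6):
    """Count how many UTRs contain each k-mer inside the upstream window.

    `window=(15, 30)` means the stretch from 30 nt to 15 nt upstream of the
    poly(A) site, assuming each sequence ends at that site (an assumption).
    Each k-mer is counted at most once per transcript.
    """
    near, far = window
    seen_in = Counter()
    for utr in utr_sequences:
        utr = utr.upper().replace("T", "U")  # accept DNA or RNA input
        end = len(utr) - near
        if end <= 0:
            continue  # UTR too short to reach the window
        region = utr[max(0, len(utr) - far):end]
        kmers = {region[i:i + k] for i in range(len(region) - k + 1)}
        seen_in.update(kmers)
    return seen_in


if __name__ == "__main__":
    utrs = [
        "C" * 10 + "AATGAA" + "C" * 9 + "G" * 15,  # toy UTR with AAUGAA in the window
        "G" * 40,                                   # toy UTR without it
    ]
    counts = hexamer_transcript_counts(utrs)
    print(counts["AAUGAA"] / len(utrs))
```

Dividing each count by the number of transcripts gives a per-hexamer fraction, the same kind of figure as the 6% quoted for AAUGAA.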
Novel Route for Agmatine Catabolism in Aspergillus niger Involves 4-Guanidinobutyrase
Kumar, Sunil; Saragadam, Tejaswani
2015-01-01
Agmatine, a significant polyamine in bacteria and plants, mostly arises from the decarboxylation of arginine. The functional importance of agmatine in fungi is poorly understood. The metabolism of agmatine and related guanidinium group-containing compounds in Aspergillus niger was explored through growth, metabolite, and enzyme studies. The fungus was able to metabolize and grow on l-arginine, agmatine, or 4-guanidinobutyrate as the sole nitrogen source. Whereas arginase defined the only route for arginine catabolism, biochemical and bioinformatics approaches suggested the absence of arginine decarboxylase in A. niger. Efficient utilization by the parent strain and also by its arginase knockout implied an arginase-independent catabolic route for agmatine. Urea and 4-guanidinobutyrate were detected in the spent medium during growth on agmatine. The agmatine-grown A. niger mycelia contained significant levels of amine oxidase, 4-guanidinobutyraldehyde dehydrogenase, 4-guanidinobutyrase (GBase), and succinic semialdehyde dehydrogenase, but no agmatinase activity was detected. Taken together, the results support a novel route for agmatine utilization in A. niger. The catabolism of agmatine by way of 4-guanidinobutyrate to 4-aminobutyrate into the Krebs cycle is the first report of such a pathway in any organism. A. niger GBase peptide fragments were identified by tandem mass spectrometry analysis. The corresponding open reading frame from the A. niger NCIM 565 genome was located and cloned. Subsequent expression of GBase in both Escherichia coli and A. niger along with its disruption in A. niger functionally defined the GBase locus (gbu) in the A. niger genome. PMID:26048930
HisB as novel selection marker for gene targeting approaches in Aspergillus niger.
Fiedler, Markus R M; Gensheimer, Tarek; Kubisch, Christin; Meyer, Vera
2017-03-08
For Aspergillus niger, a broad set of auxotrophic and dominant resistance markers is available. However, only a few offer targeted modification of a gene of interest into or at a genomic locus of choice, which hampers functional genomics studies. We thus aimed to extend the available set by generating a histidine auxotrophic strain with a characterized hisB locus for targeted gene integration and deletion in A. niger. A histidine-auxotrophic strain was established via disruption of the A. niger hisB gene by using the counterselectable pyrG marker. After curing, a hisB⁻, pyrG⁻ strain was obtained, which served as recipient strain for further studies. We show here that both hisB orthologs from A. nidulans and A. niger can be used to reestablish histidine prototrophy in this recipient strain. Whereas the hisB gene from A. nidulans was suitable for efficient gene targeting at different loci in A. niger, the hisB gene from A. niger allowed efficient integration of a Tet-on driven luciferase reporter construct at the endogenous non-functional hisB locus. Subsequent analysis of the luciferase activity revealed that the hisB locus is tight under non-inducing conditions and allows even higher luciferase expression levels compared to the pyrG integration locus. Taken together, we provide here an alternative selection marker for A. niger, hisB, which allows efficient homologous integration rates as well as high expression levels which compare favorably to the well-established pyrG selection marker.
Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V
2017-01-04
The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Kis-Papo, Tamar; Kirzhner, Valery; Wasser, Solomon P.; Nevo, Eviatar
2003-01-01
We have found that genomic diversity is generally positively correlated with abiotic and biotic stress levels (1–3). However, beyond a high-threshold level of stress, the diversity declines to a few adapted genotypes. The Dead Sea is the harshest planetary hypersaline environment (340 g·liter–1 total dissolved salts, ≈10 times sea water). Hence, the Dead Sea is an excellent natural laboratory for testing the “rise and fall” pattern of genetic diversity with stress proposed in this article. Here, we examined genomic diversity of the ascomycete fungus Aspergillus versicolor from saline, nonsaline, and hypersaline Dead Sea environments. We screened the coding and noncoding genomes of A. versicolor isolates by using >600 AFLP (amplified fragment length polymorphism) markers (equal to loci). Genomic diversity was positively correlated with stress, culminating in the Dead Sea surface but dropped drastically in 50- to 280-m-deep seawater. The genomic diversity pattern paralleled the pattern of sexual reproduction of fungal species across the same southward gradient of increasing stress in Israel. This parallel may suggest that diversity and sex are intertwined intimately according to the rise and fall pattern and adaptively selected by natural selection in fungal genome evolution. Future large-scale verification in micromycetes will define further the trajectories of diversity and sex in the rise and fall pattern. PMID:14645702
Gnome View: A tool for visual representation of human genome data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pelkey, J.E.; Thomas, G.S.; Thurman, D.A.
1993-02-01
GnomeView is a tool for exploring data generated by the Human Genome Project. GnomeView provides both graphical and textual styles of data presentation; employs an intuitive window-based graphical query interface; and integrates its underlying genome databases in such a way that the user can navigate smoothly across databases and between different levels of data. This paper describes GnomeView and discusses how it addresses various genome informatics issues.
Monteiro, Pedro Tiago; Pais, Pedro; Costa, Catarina; Manna, Sauvagya; Sá-Correia, Isabel; Teixeira, Miguel Cacho
2017-01-04
We present the PATHOgenic YEAst Search for Transcriptional Regulators And Consensus Tracking (PathoYeastract - http://pathoyeastract.org) database, a tool for the analysis and prediction of transcription regulatory associations at the gene and genomic levels in the pathogenic yeasts Candida albicans and C. glabrata. Upon data retrieval from hundreds of publications, followed by curation, the database currently includes 28 000 unique documented regulatory associations between transcription factors (TF) and target genes and 107 DNA binding sites, considering 134 TFs in both species. Following the structure used for the YEASTRACT database, PathoYeastract makes available bioinformatics tools that enable the user to exploit the existing information to predict the TFs involved in the regulation of a gene or genome-wide transcriptional response, while ranking those TFs in order of their relative importance. Each search can be filtered based on the selection of specific environmental conditions, experimental evidence or positive/negative regulatory effect. Promoter analysis tools and interactive visualization tools for the representation of TF regulatory networks are also provided. The PathoYeastract database further provides simple tools for the prediction of gene and genomic regulation based on orthologous regulatory associations described for other yeast species, a comparative genomics setup for the study of cross-species evolution of regulatory networks. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The 2018 Nucleic Acids Research database issue and the online molecular biology database collection.
Rigden, Daniel J; Fernández, Xosé M
2018-01-04
The 2018 Nucleic Acids Research Database Issue contains 181 papers spanning molecular biology. Among them, 82 are new and 84 are updates describing resources that appeared in the Issue previously. The remaining 15 cover databases most recently published elsewhere. Databases in the area of nucleic acids include 3DIV for visualisation of data on genome 3D structure and RNArchitecture, a hierarchical classification of RNA families. Protein databases include the established SMART, ELM and MEROPS while GPCRdb and the newcomer STCRDab cover families of biomedical interest. In the area of metabolism, HMDB and Reactome both report new features while PULDB appears in NAR for the first time. This issue also contains reports on genomics resources including Ensembl, the UCSC Genome Browser and ENCODE. Update papers from the IUPHAR/BPS Guide to Pharmacology and DrugBank are highlights of the drug and drug target section while a number of proteomics databases including proteomicsDB are also covered. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, reviewing 138 entries, adding 88 new resources and eliminating 47 discontinued URLs, bringing the current total to 1737 databases. It is available at http://www.oxfordjournals.org/nar/database/c/. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-03-01
Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
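The Smith-Waterman approach named in the abstract above can be illustrated with a minimal sketch. This is not the ProteinWorldDB implementation: the scoring values (match, mismatch, linear gap penalty) and the function name are assumptions chosen for illustration, and real protein comparisons would normally use a substitution matrix and affine gap costs.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    Scoring parameters are illustrative assumptions, not values from the source.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Because the full dynamic-programming matrix is filled for every pair, the cost grows with the product of the two sequence lengths, which is consistent with the abstract's description of the all-against-all comparison as computationally intensive.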
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context
Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi
2007-01-01
Background: Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. Results: lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. Conclusion: lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired. PMID:17877794
GenomeRNAi: a database for cell-based RNAi phenotypes.
Horn, Thomas; Arziman, Zeynep; Berger, Juerg; Boutros, Michael
2007-01-01
RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasing large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed both in Caenorhabditis elegans and Drosophila for a variety of phenotypes and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays from visible phenotypes to focused transcriptional readouts and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at http://rnai.dkfz.de.
GenomeRNAi: a database for cell-based RNAi phenotypes
Horn, Thomas; Arziman, Zeynep; Berger, Juerg; Boutros, Michael
2007-01-01
RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasing large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed both in Caenorhabditis elegans and Drosophila for a variety of phenotypes and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays from visible phenotypes to focused transcriptional readouts and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at http://rnai.dkfz.de. PMID:17135194
The BIG Data Center: from deposition to integration to translation
2017-01-01
Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. PMID:27899658
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context.
Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi
2007-09-18
Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.
Bolbase: a comprehensive genomics database for Brassica oleracea.
Yu, Jingyin; Zhao, Meixia; Wang, Xiaowu; Tong, Chaobo; Huang, Shunmou; Tehrim, Sadia; Liu, Yumei; Hua, Wei; Liu, Shengyi
2013-09-30
Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, Brussels sprouts. This diversity along with its phylogenetic membership in a group of three diploid and three tetraploid species, and the recent availability of genome sequences within Brassica provide an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives. We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting. Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research. 
This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase.
Benchmarking distributed data warehouse solutions for storing genomic variant information
Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.
2017-01-01
Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not been sufficiently explored so far in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of the distributed back-ends offer good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu, on the other hand, is the only solution that guarantees sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries. 
In summary, research and clinical applications that require the storage and analysis of variants from thousands of samples can benefit from the scalability and performance of distributed data warehouse solutions. Database URL: https://github.com/ZSI-Bio/variantsdwh PMID:29220442
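The two query patterns discussed in the abstract above, a simple genome range query and a pre-aggregated table that avoids work at query time, can be sketched on a single machine. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than any of the distributed engines benchmarked in the paper, and the table names, column names, and sample rows are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical variants table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variants (
            sample_id TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT
        );
        -- An index on (chrom, pos) keeps range lookups cheap.
        CREATE INDEX idx_variants_region ON variants (chrom, pos);
        """
    )
    rows = [  # made-up example data
        ("s1", "chr1", 150, "A", "G"),
        ("s2", "chr1", 150, "A", "G"),
        ("s1", "chr1", 5000, "C", "T"),
        ("s2", "chr2", 150, "G", "A"),
    ]
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?)", rows)
    # Pre-aggregation: count carrier samples per variant once, up front,
    # so later queries read a small summary table instead of regrouping.
    conn.execute(
        """
        CREATE TABLE variant_counts AS
        SELECT chrom, pos, ref, alt, COUNT(DISTINCT sample_id) AS n_samples
        FROM variants
        GROUP BY chrom, pos, ref, alt
        """
    )
    return conn


def variants_in_range(conn, chrom, start, end):
    """Genome range query: all variant calls on chrom within [start, end]."""
    cur = conn.execute(
        "SELECT sample_id, pos, ref, alt FROM variants "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? "
        "ORDER BY pos, sample_id",
        (chrom, start, end),
    )
    return cur.fetchall()


def carrier_counts(conn, chrom, pos):
    """Read carrier counts for a position from the pre-aggregated table."""
    cur = conn.execute(
        "SELECT ref, alt, n_samples FROM variant_counts "
        "WHERE chrom = ? AND pos = ? ORDER BY ref, alt",
        (chrom, pos),
    )
    return cur.fetchall()
```

The design trade-off is the usual one for pre-aggregation: the summary table must be rebuilt or updated when new samples are loaded, in exchange for cheaper reads.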
Lazzari, Barbara; Caprera, Andrea; Cestaro, Alessandro; Merelli, Ivan; Del Corvo, Marcello; Fontana, Paolo; Milanesi, Luciano; Velasco, Riccardo; Stella, Alessandra
2009-06-29
Two complete genome sequences are available for Vitis vinifera Pinot noir. Based on the sequence and gene predictions produced by the IASMA, we performed an in silico detection of putative microRNA genes and of their targets, and collected the most reliable microRNA predictions in a web database. The application is available at http://www.itb.cnr.it/ptp/grapemirna/. The program FindMiRNA was used to detect putative microRNA genes in the grape genome. A very high number of predictions was retrieved, calling for validation. Nine parameters were calculated and, based on the grape microRNAs dataset available at miRBase, thresholds were defined and applied to FindMiRNA predictions having targets in gene exons. In the resulting subset, predictions were ranked according to precursor positions and sequence similarity, and to target identity. To further validate FindMiRNA predictions, comparisons to the Arabidopsis genome, to the grape Genoscope genome, and to the grape EST collection were performed. Results were stored in a MySQL database and a web interface was prepared to query the database and retrieve predictions of interest. The GrapeMiRNA database encompasses 5,778 microRNA predictions spanning the whole grape genome. Predictions are integrated with information that can be of use in selection procedures. Tools added in the web interface also allow predictions to be inspected according to gene ontology classes and metabolic pathways of targets. The GrapeMiRNA database can be of help in selecting candidate microRNA genes to be validated.
Pócsi, István; Miskei, Márton; Karányi, Zsolt; Emri, Tamás; Ayoubi, Patricia; Pusztahelyi, Tünde; Balla, György; Prade, Rolf A
2005-01-01
Background: In addition to their cytotoxic nature, reactive oxygen species (ROS) are also signal molecules in diverse cellular processes in eukaryotic organisms. Linking genome-wide transcriptional changes to cellular physiology in oxidative stress-exposed Aspergillus nidulans cultures provides the opportunity to estimate the sizes of peroxide (O₂²⁻), superoxide (O₂•⁻) and glutathione/glutathione disulphide (GSH/GSSG) redox imbalance responses. Results: Genome-wide transcriptional changes triggered by diamide, H2O2 and menadione in A. nidulans vegetative tissues were recorded using DNA microarrays containing 3533 unique PCR-amplified probes. Evaluation of LOESS-normalized data indicated that 2499 gene probes were affected by at least one stress-inducing agent. The stresses induced by diamide and H2O2 were pulse-like, with recovery after 1 h exposure time, while no recovery was observed with menadione. The distribution of stress-responsive gene probes among major physiological functional categories was approximately the same for each agent. The gene group sizes solely responsive to changes in intracellular O₂²⁻, O₂•⁻ concentrations or to GSH/GSSG redox imbalance were estimated at 7.7, 32.6 and 13.0%, respectively. Gene groups responsive to diamide, H2O2 and menadione treatments and gene groups influenced by GSH/GSSG, O₂²⁻ and O₂•⁻ were only partly overlapping with distinct enrichment profiles within functional categories. Changes in the GSH/GSSG redox state influenced expression of genes coding for PBS2-like MAPK kinase homologue, PSK2 kinase homologue, AtfA transcription factor, and many elements of ubiquitin tagging, cell division cycle regulators, translation machinery proteins, defense and stress proteins, transport proteins as well as many enzymes of the primary and secondary metabolisms. 
Meanwhile, a separate set of genes encoding transport proteins, CpcA and JlbA amino acid starvation-responsive transcription factors, and some elements of sexual development and sporulation was ROS responsive. Conclusion: The existence of separate O₂²⁻, O₂•⁻ and GSH/GSSG responsive gene groups in a eukaryotic genome has been demonstrated. Oxidant-triggered, genome-wide transcriptional changes should be analyzed considering changes in oxidative stress-responsive physiological conditions and not correlating them directly to the chemistry and concentrations of the oxidative stress-inducing agent. PMID:16368011
User Guidelines for the Brassica Database: BRAD.
Wang, Xiaobo; Cheng, Feng; Wang, Xiaowu
2016-01-01
The genome sequence of Brassica rapa was first released in 2011. Since then, further Brassica genomes have been sequenced or are undergoing sequencing. It is therefore necessary to develop tools that help users to mine information from genomic data efficiently. This will greatly aid scientific exploration and breeding application, especially for those with low levels of bioinformatic training. Therefore, the Brassica database (BRAD) was built to collect, integrate, illustrate, and visualize Brassica genomic datasets. BRAD provides useful searching and data mining tools, and facilitates the search of gene annotation datasets, syntenic or non-syntenic orthologs, and flanking regions of functional genomic elements. It also includes genome-analysis tools such as BLAST and GBrowse. One of the important aims of BRAD is to build a bridge between Brassica crop genomes and the genome of the model species Arabidopsis thaliana, thus transferring the bulk of A. thaliana gene study information for use with newly sequenced Brassica crops.
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.
Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun
2012-01-01
Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, which are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework with an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. 
We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
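The gene-pair definitions in the abstract above (co-directional, convergent, and divergent adjacent genes, plus the distance between their transcription start sites) can be sketched as follows. This is a minimal illustration and not LCGbase code: the `Gene` record, its field names, and the example coordinates are hypothetical, and the transcription start site is assumed to be the leftmost coordinate on the plus strand and the rightmost on the minus strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"

    @property
    def tss(self):
        # Assumed convention: TSS is the 5' end of the gene.
        return self.start if self.strand == "+" else self.end


def classify_pair(left, right):
    """Classify two adjacent genes, where `left` lies upstream on the chromosome."""
    if left.strand == right.strand:
        return "co-directional"
    if left.strand == "+" and right.strand == "-":
        return "convergent"   # -> <- : 3' ends face each other
    return "divergent"        # <- -> : 5' ends face each other


def adjacent_pairs(genes):
    """Return (left_name, right_name, orientation, TSS distance) per adjacent pair."""
    ordered = sorted(genes, key=lambda g: g.start)
    return [
        (a.name, b.name, classify_pair(a, b), abs(b.tss - a.tss))
        for a, b in zip(ordered, ordered[1:])
    ]
```

For example, calling `adjacent_pairs` on four genes laid out as plus, minus, plus, plus yields one convergent, one divergent, and one co-directional pair; a TSS-distance threshold could then be applied to the fourth tuple element to mimic the adjustable parameter the abstract mentions.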
Takahashi, Yui; Kawabata, Hiroaki; Murakami, Shuichiro
2013-01-01
Xylanases produced by Aspergillus niger are industrially important and many types of xylanases have been reported. Individual xylanases have been well studied for their enzymatic properties, gene cloning, and heterologous expression. However, less attention has been paid to the relationship between xylanase genes carried on the A. niger genome and xylanases produced by A. niger strains. Therefore, we examined xylanase genes encoded on the genome of A. niger E-1 and xylanases produced in culture. Seven putative xylanase genes, xynI-VII (named in ascending order of the molecular masses of the deduced amino acid sequences), were amplified by PCR from the strain E-1 genome using primers designed from the genome sequence of A. niger CBS 513.88 and phylogenetically classified into three clusters. Additionally, culture supernatant analysis by DE52 anion-exchange column chromatography revealed that this strain produced three xylanases, XynII, XynIII, and XynVII, which were identified by N-terminal amino acid sequencing and MALDI-TOF-MS analyses, in culture when grown in 0.5% xylan medium supplemented with 50 mM succinate. Furthermore, XynVII, the only GH family 10 xylanase in A. niger E-1, was purified and characterized. The purified enzyme showed a single band with a molecular mass of 35 kDa by SDS-PAGE. The highest activity of purified XynVII was observed at 55°C and pH 5.5. The enzyme was stable in the broad pH range of 3-10 and up to 60°C and was resistant to most metal ions and modifying reagents. XynVII showed high specificity against beechwood xylan with Km and Vmax values of 2.8 mg mL⁻¹ and 127 μmol min⁻¹ mg⁻¹, respectively. TLC and MALDI-TOF-MS analyses showed that the final hydrolyzed products of the enzyme from beechwood xylan were xylose, xylobiose, and xylotriose substituted with a 4-O-methylglucuronic acid residue.
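The kinetic constants reported above can be turned into predicted reaction rates with a short sketch. It assumes that XynVII follows simple Michaelis-Menten kinetics, v = Vmax·[S] / (Km + [S]), which the reporting of Km and Vmax suggests but the abstract does not state explicitly; the function and constant names are hypothetical.

```python
# Values reported in the abstract for XynVII on beechwood xylan.
KM_MG_PER_ML = 2.8   # Km, mg/mL
VMAX = 127.0         # Vmax, micromol per min per mg enzyme


def michaelis_menten_rate(substrate_mg_per_ml, km=KM_MG_PER_ML, vmax=VMAX):
    """Predicted initial rate (micromol/min/mg) at a given substrate concentration.

    Assumes simple Michaelis-Menten kinetics with no inhibition.
    """
    if substrate_mg_per_ml < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_mg_per_ml / (km + substrate_mg_per_ml)


if __name__ == "__main__":
    for s in (0.5, 2.8, 10.0, 28.0):
        rate = michaelis_menten_rate(s)
        print(f"[S] = {s:5.1f} mg/mL -> v = {rate:6.1f} micromol/min/mg")
```

Under this model the rate is exactly half of Vmax (63.5 μmol min⁻¹ mg⁻¹) when the xylan concentration equals Km, and it approaches Vmax only at concentrations well above 2.8 mg mL⁻¹.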
Genome sequence analysis of a flocculant-producing bacterium, Paenibacillus shenyangensis.
Fu, Lili; Jiang, Binhui; Liu, Jinliang; Zhao, Xin; Liu, Qian; Hu, Xiaomin
2016-03-01
This study explored the metabolic processes of Paenibacillus shenyangensis, an efficient bioflocculant-producing bacterium. The biosynthesis mechanism of bioflocculation was used to enrich the genome of Paenibacillus shenyangensis and provide a basis for molecular genetics and functional genomics analyses. According to the analysis of de novo assembly, a total of 5,501,467 bp clean reads were generated, and were assembled into 92 contigs. 4800 unigenes were predicted, of which 4393 were annotated showing a specific gene function in the NCBI-Nr database. 3423 genes were found in the database of cluster of orthologous groups. Among the 168 Kyoto Encyclopedia of Genes and Genomes database, cell growth and metabolism were the main biological processes, and a potential metabolic pathway was predicted from glucose to exopolysaccharide within the starch and sucrose metabolism pathway. By using the high-throughput sequencing technology, we provide a genome analysis of Paenibacillus shenyangensis that predicts the main metabolic processes and a potential pathway of exopolysaccharide biosynthesis.
GrTEdb: the first web-based database of transposable elements in cotton (Gossypium raimondii).
Xu, Zhenzhen; Liu, Jing; Ni, Wanchao; Peng, Zhen; Guo, Yue; Ye, Wuwei; Huang, Fang; Zhang, Xianggui; Xu, Peng; Guo, Qi; Shen, Xinlian; Du, Jianchang
2017-01-01
Although several diploid and tetraploid Gossypium species genomes have been sequenced, a well-annotated web-based transposable element (TE) database is lacking. To better understand the roles of TEs in the structural, functional and evolutionary dynamics of the cotton genome, a comprehensive, specific, and user-friendly web-based database, the Gossypium raimondii transposable elements database (GrTEdb), was constructed. A total of 14 332 TEs were structurally annotated and clearly categorized in the G. raimondii genome, and these elements have been classified into seven distinct superfamilies based on the order of protein-coding domains, structures and/or sequence similarity, including 2929 Copia-like elements, 10 368 Gypsy-like elements, 299 L1, 12 Mutators, 435 PIF-Harbingers, 275 CACTAs and 14 Helitrons. Meanwhile, web-based sequence browsing, searching, downloading and BLAST tools were implemented to help users easily and effectively annotate the TEs or TE fragments in genomic sequences from G. raimondii and other closely related Gossypium species. GrTEdb provides resources and information related to TEs in G. raimondii, and will facilitate gene and genome analyses within or across Gossypium species, evaluating the impact of TEs on their host genomes, and investigating the potential interaction between TEs and protein-coding genes in Gossypium species. http://www.grtedb.org/. © The Author(s) 2017. Published by Oxford University Press.
The Ensembl genome database project.
Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M
2002-01-01
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
RefSeq microbial genomes database: new representation and annotation strategy.
Tatusova, Tatiana; Ciufo, Stacy; Fedorov, Boris; O'Neill, Kathleen; Tolstoy, Igor
2014-01-01
The source of the microbial genomic sequences in the RefSeq collection is the set of primary sequence records submitted to the International Nucleotide Sequence Database public archives. These can be accessed through the Entrez search and retrieval system at http://www.ncbi.nlm.nih.gov/genome. Next-generation sequencing has enabled researchers to perform genomic sequencing at rates that were unimaginable in the past. Microbial genomes can now be sequenced in a matter of hours, which has led to a significant increase in the number of assembled genomes deposited in the public archives. This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools. New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks.
The MAR databases: development and implementation of databases specific for marine metagenomics.
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P
2018-01-04
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of their level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.
Zeng, Victor; Extavour, Cassandra G
2012-01-01
The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information.
We anticipate that this database will be useful for members of multiple research communities, including developmental biology, physiology, evolutionary biology, ecology, comparative genomics and phylogenomics. Database URL: asgard.rc.fas.harvard.edu.
VitisExpDB: a database resource for grape functional genomics.
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-02-28
The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of approximately 20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
VitisExpDB: A database resource for grape functional genomics
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-01-01
Background: The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description: VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of ~20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. Conclusion: The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website. PMID:18307813
EDGAR: A software framework for the comparative analysis of prokaryotic genomes
Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander
2009-01-01
Background: The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results: To support these studies EDGAR ("Efficient Database framework for comparative Genome Analyses using BLAST score Ratios") was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion: EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, where the precomputed data sets can be browsed. PMID:19457249
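The classification of genes as core genes or singletons by BLAST score ratios can be sketched as follows. This is a simplified illustration and not EDGAR's actual implementation: the score ratio is taken here as the best-hit bit score divided by the gene's self-hit bit score, the 0.3 cutoff is an arbitrary placeholder, and the function names and input layout are hypothetical. A fuller pipeline would likely also require hits to be reciprocal.

```python
def score_ratio(hit_score, self_score):
    """Normalise a BLAST bit score by the query's score against itself."""
    if self_score <= 0:
        raise ValueError("self-hit score must be positive")
    return hit_score / self_score


def classify_genes(self_scores, best_hit_scores, other_genomes, cutoff=0.3):
    """Split reference-genome genes into core genes and singletons.

    self_scores:      {gene: bit score of the gene against itself}
    best_hit_scores:  {(gene, genome): best bit score found in that genome}
    other_genomes:    genomes compared against the reference
    cutoff:           hypothetical score-ratio threshold for "present"
    """
    core, singletons = [], []
    for gene, self_score in self_scores.items():
        present_in = [
            genome
            for genome in other_genomes
            if score_ratio(best_hit_scores.get((gene, genome), 0.0), self_score)
            >= cutoff
        ]
        if len(present_in) == len(other_genomes):
            core.append(gene)        # found in every compared genome
        elif not present_in:
            singletons.append(gene)  # found in none of them
    return core, singletons


if __name__ == "__main__":
    self_scores = {"geneA": 200.0, "geneB": 150.0, "geneC": 100.0}
    best_hits = {
        ("geneA", "g2"): 180.0, ("geneA", "g3"): 120.0,
        ("geneB", "g2"): 30.0,
        ("geneC", "g2"): 90.0,
    }
    print(classify_genes(self_scores, best_hits, ["g2", "g3"]))
```

Normalising by the self-hit score makes the threshold independent of gene length, which is the main reason to prefer a ratio over a raw score or E-value cutoff.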
Metabolomic Tools to Assess the Chemistry and Bioactivity of Endophytic Aspergillus Strain.
Tawfike, Ahmed F; Tate, Rothwelle; Abbott, Gráinne; Young, Louise; Viegelmann, Christina; Schumacher, Marc; Diederich, Marc; Edrada-Ebel, RuAngelie
2017-10-01
Endophytic fungi associated with medicinal plants are a potential source of novel chemistry and biology that may find applications as pharmaceutical and agrochemical drugs. In this study, a combination of metabolomics and bioactivity-guided approaches were employed to isolate secondary metabolites with cytotoxicity against cancer cells from an endophytic Aspergillus aculeatus. The endophyte was isolated from the Egyptian medicinal plant Terminalia laxiflora and identified using molecular biological methods. Metabolomics and dereplication studies were accomplished by utilizing the MZmine software coupled with the universal Dictionary of Natural Products database. Metabolic profiling, with aid of multivariate data analysis, was performed at different stages of the growth curve to choose the optimized method suitable for up-scaling. The optimized culture method yielded a crude extract abundant with biologically-active secondary metabolites. Crude extracts were fractionated using different high-throughput chromatographic techniques. Purified compounds were identified by HR-ESI-MS, 1D- and 2D-NMR. This study introduced a new method of dereplication utilizing both high-resolution mass spectrometry and NMR spectroscopy. The metabolites were putatively identified by applying a chemotaxonomic filter. We also present a short review on the diverse chemistry of terrestrial endophytic strains of Aspergillus, which has become a part of our dereplication work and this will be of wide interest to those working in this field. © 2017 Wiley-VHCA AG, Zurich, Switzerland.
Mishra, Bishwambhar; Suneetha, V
2014-07-01
The main focus of this study was to screen and characterize novel microbial strains isolated from culinary leaf samples, capable of producing high concentrations of pullulan. One hundred isolates were screened from the phylloplane of different plants. The results revealed that eight strains had the capability to produce exopolysaccharide (EPS) and only one potential strain (designated as VIT-SB1) could produce a significant amount of EPS (3.9 ± 0.02%) on the 6th day of the fermentation without optimisation. The EPS synthesized by the VIT-SB1 strain was confirmed to be pullulan on the basis of the results of FT-IR, HPLC and the enzymatic (pullulanase) analysis. More than 91% hydrolysis of pullulan by pullulanase enzyme also indicated the presence of α (1 → 6) glycosidic linkages of α (1 → 4) linked maltotriose units. This VIT-SB1 strain was identified as Aspergillus japonicus based on the nucleotide sequence of the D1/D2 domain of the Large-Subunit rRNA gene. The sequence was submitted to the GenBank Nucleotide sequence database with Accession No: KC128815. This study has confirmed that the pullulan production capacities of this novel strain and Aureobasidium pullulans are comparable. Hence the Aspergillus japonicus-VIT-SB1 strain can be commercially exploited as a potential pullulan producing strain.
Viennas, Emmanouil; Komianou, Angeliki; Mizzi, Clint; Stojiljkovic, Maja; Mitropoulou, Christina; Muilu, Juha; Vihinen, Mauno; Grypioti, Panagiota; Papadaki, Styliani; Pavlidis, Cristiana; Zukic, Branka; Katsila, Theodora; van der Spek, Peter J.; Pavlovic, Sonja; Tzimas, Giannis; Patrinos, George P.
2017-01-01
FINDbase (http://www.findbase.org) is a comprehensive data repository that records the prevalence of clinically relevant genomic variants in various populations worldwide, such as pathogenic variants leading mostly to monogenic disorders and pharmacogenomics biomarkers. The database also records the incidence of rare genetic diseases in various populations, all in well-distinct data modules. Here, we report extensive data content updates in all data modules, with direct implications to clinical pharmacogenomics. Also, we report significant new developments in FINDbase, namely (i) the release of a new version of the ETHNOS software that catalyzes development and curation of national/ethnic genetic databases, (ii) the migration of all FINDbase data content into 90 distinct national/ethnic mutation databases, all built around Microsoft's PivotViewer (http://www.getpivot.com) software, (iii) new data visualization tools and (iv) the interrelation of FINDbase with the DruGeVar database, with direct implications in clinical pharmacogenomics. The abovementioned updates further enhance the impact of FINDbase as a key resource for Genomic Medicine applications. PMID:27924022
Database resources of the National Center for Biotechnology Information
Wheeler, David L.; Church, Deanna M.; Lash, Alex E.; Leipe, Detlef D.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Tatusova, Tatiana A.; Wagner, Lukas; Rapp, Barbara A.
2001-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, GeneMap’99, Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP), SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. PMID:11125038
Taxonomic re-evaluation of black koji molds.
Hong, Seung-Beom; Yamada, Osamu; Samson, Robert A
2014-01-01
Black koji molds, including their albino mutant, the white koji mold, have been widely used for making the distilled spirit shochu in Northeast Asia because they produce citric acid, which prevents undesirable contamination from bacteria. Since Inui reported Aspergillus luchuensis from black koji in Okinawa in 1901, many fungal names associated with black koji molds have been reported. However, some species are similar and differentiation between species is difficult. Fungal taxonomists tried to arrange a taxonomic system for black koji molds, but the results were not clear. Recently, multi-locus sequence typing has been successfully applied to the taxonomy of black Aspergillus. According to β-tubulin and calmodulin gene sequences, black koji molds can be subdivided into three species: A. luchuensis, Aspergillus niger, and Aspergillus tubingensis. Aspergillus awamori, Aspergillus kawachii, Aspergillus inuii, Aspergillus nakazawai, and Aspergillus coreanus are synonyms of A. luchuensis; Aspergillus batatae, Aspergillus aureus (or Aspergillus foetidus), Aspergillus miyakoensis, and Aspergillus usamii (including A. usamii mut. shirousamii) are synonyms of A. niger; and Aspergillus saitoi and A. saitoi var. kagoshimaensis are synonyms of A. tubingensis. The name A. luchuensis mut. kawachii was suggested for A. kawachii because of its industrial importance. The history and modern taxonomy of black koji molds are further discussed.
Active Site Characterization of Proteases Sequences from Different Species of Aspergillus.
Morya, V K; Yadav, Virendra K; Yadav, Sangeeta; Yadav, Dinesh
2016-09-01
A total of 129 protease sequences comprising 43 serine proteases, 36 aspartic proteases, 24 cysteine proteases, 21 metalloproteases, and 5 neutral proteases from different Aspergillus species were analyzed for the catalytically active site residues using the MEROPS database and various bioinformatics tools. Different proteases show a predominance of different active site residues. In the case of the 24 cysteine proteases of Aspergilli, the predominant active site residues observed were Gln193, Cys199, His364, Asn384, while for the 43 serine proteases, the active site residues Asp164, His193, Asn284, Ser349 and Asp325, His357, Asn454, Ser519 were frequently observed. The analysis of the 21 metalloproteases of Aspergilli revealed Glu298 and Glu388, Tyr476 as predominant active site residues. In general, Aspergilli species-specific active site residues were observed for the different types of protease sequences analyzed. The phylogenetic analysis of these 129 protease sequences revealed 14 different clans representing different types of proteases with diverse active site residues.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karpinets, Tatiana V; Park, Byung; Syed, Mustafa H
2010-01-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire non-redundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains (DUF) and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit (CAT), and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.
Chen, Qingyu; Zobel, Justin; Zhang, Xiuzhen; Verspoor, Karin
2016-01-01
First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All Data are available as described in the supplementary material.
Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C
2010-12-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
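The second annotation approach, linking protein family domains to CAZy families through association rule learning, can be illustrated with a minimal support/confidence calculation. The sketch below is not the CAT implementation: it mines only single-domain rules of the form "domain implies family", the thresholds are arbitrary, and the domain and family labels in the example are made-up placeholders rather than real Pfam or CAZy identifiers.

```python
from collections import Counter


def mine_domain_to_family_rules(records, min_support=2, min_confidence=0.8):
    """Find 'domain -> family' rules from annotated sequences.

    records: iterable of (domains, family) pairs, one per sequence, where
             `domains` lists the protein family domains found on the
             sequence and `family` is its curated family label.
    Returns {(domain, family): confidence} for rules whose support
    (co-occurrence count) and confidence meet the hypothetical thresholds.
    """
    domain_counts = Counter()
    pair_counts = Counter()
    for domains, family in records:
        for domain in set(domains):  # count each domain once per sequence
            domain_counts[domain] += 1
            pair_counts[(domain, family)] += 1

    rules = {}
    for (domain, family), support in pair_counts.items():
        confidence = support / domain_counts[domain]
        if support >= min_support and confidence >= min_confidence:
            rules[(domain, family)] = confidence
    return rules


if __name__ == "__main__":
    training = [
        (["domA"], "famX"),
        (["domA", "domB"], "famX"),
        (["domB"], "famY"),
        (["domA"], "famX"),
    ]
    print(mine_domain_to_family_rules(training))
```

A rule mined this way can then be applied to a newly sequenced protein: if it carries the domain on the left-hand side, the family on the right-hand side is proposed as its annotation.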
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.
Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X
2016-01-01
PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
GEMINI: a computationally-efficient search engine for large gene expression datasets.
DeFreitas, Timothy; Saddiki, Hachem; Flaherty, Patrick
2016-02-24
Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number or meta-data tag. However, in this search paradigm the form of the query (a text-based string) is mismatched with the form of the target (a genomic profile). To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an [Formula: see text] expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10^5 samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec. GEMINI is a fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. It enables users to identify similar profiles independent of sample label, data origin or other meta-data information.
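The nearest-neighbor search over a vantage-point tree mentioned in the abstract can be sketched in a few lines. This is a generic textbook-style version and not GEMINI's code: it assumes profiles are plain numeric tuples compared by Euclidean distance, picks vantage points at random, and all names (`VPNode`, `build_vp_tree`, `nearest`) are hypothetical.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # the vantage point stored at this node
        self.radius = radius    # median distance from it to the other points
        self.inside = inside    # subtree of points closer than radius
        self.outside = outside  # subtree of points at radius or farther


def build_vp_tree(points, rng=None):
    """Recursively split points around a randomly chosen vantage point."""
    if not points:
        return None
    rng = rng or random.Random(0)
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [euclidean(vantage, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(points, dists) if d < radius]
    outside = [p for p, d in zip(points, dists) if d >= radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def nearest(node, query, best=None):
    """Return (distance, point) of the stored point closest to query."""
    if node is None:
        return best
    d = euclidean(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d < node.radius:
        best = nearest(node.inside, query, best)
        # The outside region can only help if the search ball crosses the radius.
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, best)
    else:
        best = nearest(node.outside, query, best)
        if d - best[0] < node.radius:
            best = nearest(node.inside, query, best)
    return best


if __name__ == "__main__":
    rng = random.Random(1)
    profiles = [tuple(rng.random() for _ in range(5)) for _ in range(1000)]
    tree = build_vp_tree(profiles)
    print(nearest(tree, tuple(rng.random() for _ in range(5))))
```

The triangle inequality is what allows whole subtrees to be skipped during a query, which is where the speed-up over a brute-force scan comes from; how much is pruned in practice depends on the data and the distance measure.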
GenColors: annotation and comparative genomics of prokaryotes made easy.
Romualdi, Alessandro; Felder, Marius; Rose, Dominic; Gausmann, Ulrike; Schilhabel, Markus; Glöckner, Gernot; Platzer, Matthias; Sühnel, Jürgen
2007-01-01
GenColors (gencolors.fli-leibniz.de) is a new web-based software/database system aimed at an improved and accelerated annotation of prokaryotic genomes considering information on related genomes and making extensive use of genome comparison. It offers a seamless integration of data from ongoing sequencing projects and annotated genomic sequences obtained from GenBank. A variety of export/import filters manages an effective data flow from sequence assembly and manipulation programs (e.g., GAP4) to GenColors and back as well as to standard GenBank file(s). The genome comparison tools include best bidirectional hits, gene conservation, syntenies, and gene core sets. Precomputed UniProt matches allow annotation and analysis in an effective manner. In addition to these analysis options, base-specific quality data (coverage and confidence) can also be handled if available. The GenColors system can be used both for annotation purposes in ongoing genome projects and as an analysis tool for finished genomes. GenColors comes in two types, as dedicated genome browsers and as the Jena Prokaryotic Genome Viewer (JPGV). Dedicated genome browsers contain genomic information on a set of related genomes and offer a large number of options for genome comparison. The system has been efficiently used in the genomic sequencing of Borrelia garinii and is currently applied to various ongoing genome projects on Borrelia, Legionella, Escherichia, and Pseudomonas genomes. One of these dedicated browsers, the Spirochetes Genome Browser (sgb.fli-leibniz.de) with Borrelia, Leptospira, and Treponema genomes, is freely accessible. The others will be released after finalization of the corresponding genome projects. JPGV (jpgv.fli-leibniz.de) offers information on almost all finished bacterial genomes but, compared with the dedicated browsers, has reduced genome comparison functionality. As of January 2006, this viewer includes 632 genomic elements (e.g., chromosomes and plasmids) of 293 species. The system provides versatile quick and advanced search options for all currently known prokaryotic genomes and generates circular and linear genome plots. Gene information sheets contain basic gene information, database search options, and links to external databases. GenColors is also available on request for local installation.
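The "best bidirectional hits" comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from GenColors: the function names, the score dictionaries, and the gene identifiers are hypothetical, and the similarity scores are assumed to come from some pairwise search run in both directions between genome A and genome B.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]  # (query gene, subject gene) -> similarity score


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def best_bidirectional_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Return gene pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical toy scores (not real data).
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 200.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a3"): 210.0, ("b2", "a2"): 175.0}

print(best_bidirectional_hits(a_vs_b, b_vs_a))
```

In this toy input a2 is dropped because its best hit b2 points back to a3, which is the asymmetry the reciprocal check is meant to filter out.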
Mora-Lugo, Rodrigo; Madrigal, Marvin; Yelemane, Vikas; Fernandez-Lahore, Marcelo
2015-11-01
The biotechnological value of Aspergillus sojae ATCC 20235 (A. sojae) for production of pectinases in solid-state fermentation (SSF) has been demonstrated recently. However, a common drawback of fungal solid-state cultures is the poor diffusion of oxygen into the fungus, which limits its growth and biological productivity. The bacterial Vitreoscilla hemoglobin (VHb) has favored the metabolism and productivities of various bacterial and yeast strains besides alleviating hypoxic conditions of its native host, but the use of VHb in filamentous fungi still remains poorly explored. Based on the known effects of VHb, this study assessed its applicability to improve A. sojae performance in SSF. The VHb gene (vgb) under control of the constitutive Aspergillus nidulans gpdA promoter was introduced into the genome of A. sojae by Agrobacterium-mediated transformation. Successful fungal transformants were identified by fluorescence microscopy and polymerase chain reaction (PCR) analyses. In solid-state cultures, the content of protease, exo-polygalacturonase (exo-PG), and exo-polymethylgalacturonase (exo-PMG) of the transformed fungus (A. sojae vgb+) was 26, 60, and 44% higher, respectively, in comparison to its parental strain (A. sojae wt). Similarly, biomass content was also 1.3 times higher in the transformant strain. No significant difference was observed in endo-polygalacturonase (endo-PG) content between both fungal strains, suggesting dissimilar effects of VHb towards different enzymatic productions. Overall, our results show that biomass, protease, and exo-pectinase content of A. sojae in SSF can be improved by transformation with VHb.
FluG affects secretion in colonies of Aspergillus niger.
Wang, Fengfeng; Krijgsheld, Pauline; Hulsman, Marc; de Bekker, Charissa; Müller, Wally H; Reinders, Marcel; de Vries, Ronald P; Wösten, Han A B
2015-01-01
Colonies of Aspergillus niger are characterized by zonal heterogeneity in growth, sporulation, gene expression and secretion. For instance, the glucoamylase gene glaA is more highly expressed at the periphery of colonies when compared to the center. As a consequence, its encoded protein GlaA is mainly secreted at the outer part of the colony. Here, multiple copies of amyR were introduced in A. niger. Most transformants over-expressing this regulatory gene of amylolytic genes still displayed heterogeneous glaA expression and GlaA secretion. However, heterogeneity was abolished in transformant UU-A001.13 by expressing glaA and secreting GlaA throughout the mycelium. Sequencing the genome of UU-A001.13 revealed that transformation had been accompanied by deletion of part of the fluG gene and disrupting its 3' end by integration of a transformation vector. Inactivation of fluG in the wild-type background of A. niger also resulted in breakdown of starch under the whole colony. Asexual development of the ∆fluG strain was not affected, unlike what was previously shown in Aspergillus nidulans. Genes encoding proteins with a signal sequence for secretion, including part of the amylolytic genes, were more often downregulated in the central zone of maltose-grown ∆fluG colonies and upregulated in the intermediate part and periphery when compared to the wild-type. Together, these data indicate that FluG of A. niger is a repressor of secretion.
A Database for Tracking Toxicogenomic Samples and Procedures with Genomic, Proteomic and Metabonomic Components
Wenjun Bao1, Jennifer Fostel2, Michael D. Waters2, B. Alex Merrick2, Drew Ekman3, Mitchell Kostich4, Judith Schmid1, David Dix1
Office of Research and Developmen...
Fast neutron mutants database and web displays at SoyBase
USDA-ARS's Scientific Manuscript database
SoyBase, the USDA-ARS soybean genetics and genomics database, has been expanded to include data for the fast neutron mutants produced by Bolon, Vance, et al. In addition to the expected text and sequence homology searches and visualization of the indels in the context of the genome sequence viewer, ...
Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan
2016-01-01
Simple Sequence Repeats (SSRs), or microsatellites, are resourceful molecular genetic markers. There are only a few reports of SSR identification and development in pineapple. The complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple, which will help in deciphering the genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from genome sequence, 45 from chloroplast sequence, 249 in mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purposes at http://app.bioelm.com/ with a mapping tool which can develop circular maps of a selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
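As an illustration of what SSR identification involves, the sketch below scans a DNA string for perfect tandem repeats with a regular expression. It is not the pipeline used for PineElm_SSRdb: the function name, the unit-length range (1 to 6 bases) and the minimum repeat count are assumptions chosen only for the example, and imperfect or compound repeats are not handled.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, min_unit: int = 1, max_unit: int = 6,
              min_repeats: int = 5) -> List[Tuple[int, str, int]]:
    """Return (start, unit, repeat_count) for each perfect tandem repeat found.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so "ATATATATAT" is reported as unit "AT" rather than "ATAT".
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical toy sequence: an (AT)5 dinucleotide repeat flanked by other bases.
print(find_ssrs("GG" + "AT" * 5 + "CC"))
```

Thresholds in real SSR surveys usually differ by unit length (for example, more repeats required for mononucleotide runs); a single threshold is used here only to keep the sketch short.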
Biological Databases for Human Research
Zou, Dong; Ma, Lina; Yu, Jun; Zhang, Zhang
2015-01-01
The completion of the Human Genome Project lays a foundation for systematically studying the human genome from evolutionary history to precision medicine against diseases. With the explosive growth of biological data, there is an increasing number of biological databases that have been developed in aid of human-related research. Here we present a collection of human-related biological databases and provide a mini-review by classifying them into different categories according to their data types. As human-related databases continue to grow not only in count but also in volume, challenges are ahead in big data storage, processing, exchange and curation. PMID:25712261
Ortholog Identification and Comparative Analysis of Microbial Genomes Using MBGD and RECOG.
Uchiyama, Ikuo
2017-01-01
Comparative genomics is becoming an essential approach for identification of genes associated with a specific function or phenotype. Here, we introduce the microbial genome database for comparative analysis (MBGD), which is a comprehensive ortholog database among the microbial genomes available so far. MBGD contains several precomputed ortholog tables including the standard ortholog table covering the entire taxonomic range and taxon-specific ortholog tables for various major taxa. In addition, MBGD allows the users to create an ortholog table within any specified set of genomes through dynamic calculations. In particular, MBGD has a "My MBGD" mode where users can upload their original genome sequences and incorporate them into orthology analysis. The created ortholog table can serve as the basis for various comparative analyses. Here, we describe the use of MBGD and briefly explain how to utilize the orthology information during comparative genome analysis in combination with the stand-alone comparative genomics software RECOG, focusing on the application to comparison of closely related microbial genomes.
RatMap—rat genome tools and data
Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M.; Ståhl, Fredrik
2005-01-01
The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB–Genetics at Göteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided. PMID:15608244
Karp, Peter D; Paley, Suzanne; Romero, Pedro
2002-01-01
Bioinformatics requires reusable software tools for creating model-organism databases (MODs). The Pathway Tools is a reusable, production-quality software environment for creating a type of MOD called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc (see http://ecocyc.org) integrates our evolving understanding of the genes, proteins, metabolic network, and genetic network of an organism. This paper provides an overview of the four main components of the Pathway Tools: The PathoLogic component supports creation of new PGDBs from the annotated genome of an organism. The Pathway/Genome Navigator provides query, visualization, and Web-publishing services for PGDBs. The Pathway/Genome Editors support interactive updating of PGDBs. The Pathway Tools ontology defines the schema of PGDBs. The Pathway Tools makes use of the Ocelot object database system for data management services for PGDBs. The Pathway Tools has been used to build PGDBs for 13 organisms within SRI and by external users.
UCbase 2.0: ultraconserved sequences database (2014 update)
Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian
2014-01-01
UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it PMID:24951797
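The definition of ultraconserved sequences given in this abstract (stretches with 100% identity across several genomes, above a length threshold) can be sketched as a scan over already-aligned sequences. This is only an illustration, not how UCbase was built: the function name is hypothetical, the inputs are assumed to be gap-free aligned strings of equal length, and the toy example uses a threshold of 4 bases rather than 200.

```python
from typing import List, Sequence, Tuple


def conserved_runs(seqs: Sequence[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals where all sequences are identical.

    Only runs of at least min_len columns are reported.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None
    # Iterate one position past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        same = i < length and all(s[i] == seqs[0][i] for s in seqs)
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical toy alignment (not real genomic data).
human = "ACGTACGTAA"
mouse = "ACGTACCTAA"
rat = "ACGTACGTAT"
print(conserved_runs([human, mouse, rat], min_len=4))
```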
PGMapper: a web-based tool linking phenotype to genes.
Xiong, Qing; Qiu, Yuhui; Gu, Weikuan
2008-04-01
With the availability of whole genome sequence in many species, linkage analysis, positional cloning and microarray are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time consuming and needs to retrieve and integrate the information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. Available online at http://www.genediscovery.org/pgmapper/index.jsp.
GWFASTA: server for FASTA search in eukaryotic and microbial genomes.
Issac, Biju; Raghava, G P S
2002-09-01
Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server that was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and proteomes of annotated genomes. In fact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequences alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful tool for biologists.
CROPPER: a metagene creator resource for cross-platform and cross-species compendium studies.
Paananen, Jussi; Storvik, Markus; Wong, Garry
2006-09-22
Current genomic research methods provide researchers with enormous amounts of data. Combining data from different high-throughput research technologies commonly available in biological databases can lead to novel findings and increase research efficiency. However, combining data from different heterogeneous sources is often a very arduous task. These sources can be different microarray technology platforms, genomic databases, or experiments performed on various species. Our aim was to develop a software program that could facilitate the combining of data from heterogeneous sources, and thus allow researchers to perform genomic cross-platform/cross-species studies and to use existing experimental data for compendium studies. We have developed a web-based software resource, called CROPPER that uses the latest genomic information concerning different data identifiers and orthologous genes from the Ensembl database. CROPPER can be used to combine genomic data from different heterogeneous sources, allowing researchers to perform cross-platform/cross-species compendium studies without the need for complex computational tools or the requirement of setting up one's own in-house database. We also present an example of a simple cross-platform/cross-species compendium study based on publicly available Parkinson's disease data derived from different sources. CROPPER is a user-friendly and freely available web-based software resource that can be successfully used for cross-species/cross-platform compendium studies.
EuPathDB: the eukaryotic pathogen genomics database resource
Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie
2017-01-01
The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
GenoQuery: a new querying module for functional annotation in a genomic warehouse
Lemoine, Frédéric; Labedan, Bernard; Froidevaux, Christine
2008-01-01
Motivation: We have to cope with both a deluge of new genome sequences and a huge amount of data produced by high-throughput approaches used to exploit these genomic features. Crossing and comparing such heterogeneous and disparate data will help improve the functional annotation of genomes. This requires designing elaborate integration systems such as warehouses for storing and querying these data. Results: We have designed a relational genomic warehouse with an original multi-layer architecture made of a databases layer and an entities layer. We describe a new querying module, GenoQuery, which is based on this architecture. We use the entities layer to define mixed queries. These mixed queries allow searching for instances of biological entities and their properties in the different databases, without specifying in which database they should be found. Accordingly, we further introduce the central notion of alternative queries. Such queries have the same meaning as the original mixed queries, while exploiting complementarities yielded by the various integrated databases of the warehouse. We explain how GenoQuery computes all the alternative queries of a given mixed query. We illustrate how useful this querying module is by means of a thorough example. Availability: http://www.lri.fr/~lemoine/GenoQuery/ Contact: chris@lri.fr, lemoine@lri.fr PMID:18586731
Nasri, Tuba; Hedayati, Mohammad Taghi; Abastabar, Mahdi; Pasqualotto, Alessandro C; Armaki, Mojtaba Taghizadeh; Hoseinnejad, Akbar; Nabili, Mojtaba
2015-10-01
Aspergillus species are important agents of life-threatening infections in immunosuppressed patients. Proper speciation in the Aspergilli has been justified based on varied fungal virulence, clinical presentations, and antifungal resistance. Accurate identification of Aspergillus species usually relies on fungal DNA sequencing but this requires expensive equipment that is not available in most clinical laboratories. We developed and validated a discriminative low-cost PCR-based test to discriminate Aspergillus isolates at the species level. The Beta tubulin gene of various reference strains of Aspergillus species was amplified using the universal fungal primers Bt2a and Bt2b. The PCR products were subjected to digestion with a single restriction enzyme AlwI. All Aspergillus isolates were subjected to DNA sequencing for final species characterization. The PCR-RFLP test generated unique patterns for six clinically important Aspergillus species, including Aspergillus flavus, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus terreus, Aspergillus clavatus and Aspergillus nidulans. The one-enzyme PCR-RFLP on Beta tubulin gene designed in this study is a low-cost tool for the reliable and rapid differentiation of the clinically important Aspergillus species. Copyright © 2015 Elsevier B.V. All rights reserved.
MBGD update 2013: the microbial genome database for exploring the diversity of microbial world.
Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu
2013-01-01
The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic studies. Reflecting the huge diversity of the microbial world, the number of microbial genome projects has now reached several thousand. To efficiently explore the diversity of the entire microbial genomic data, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For some closely related taxa, MBGD also provides the conserved synteny information (core genome alignment) pre-calculated using the CoreAligner program. In addition, an efficient incremental updating procedure can create an extended ortholog table by adding additional genomes to the default ortholog table generated from the representative set of genomes. Combined with the functionality for dynamic orthology calculation among any specified set of organisms, MBGD is an efficient and flexible tool for exploring microbial genome diversity.
Identification of a Novel L-rhamnose Uptake Transporter in the Filamentous Fungus Aspergillus niger.
Sloothaak, Jasper; Odoni, Dorett I; Martins Dos Santos, Vitor A P; Schaap, Peter J; Tamayo-Ramos, Juan Antonio
2016-12-01
The study of plant biomass utilization by fungi is a research field of great interest due to its many implications in ecology, agriculture and biotechnology. Most of the efforts done to increase the understanding of the use of plant cell walls by fungi have been focused on the degradation of cellulose and hemicellulose, and transport and metabolism of their constituent monosaccharides. Pectin is another important constituent of plant cell walls, but has received less attention. In relation to the uptake of pectic building blocks, fungal transporters for the uptake of galacturonic acid recently have been reported in Aspergillus niger and Neurospora crassa. However, not a single L-rhamnose (6-deoxy-L-mannose) transporter has been identified yet in fungi or in other eukaryotic organisms. L-rhamnose is a deoxy-sugar present in plant cell wall pectic polysaccharides (mainly rhamnogalacturonan I and rhamnogalacturonan II), but is also found in diverse plant secondary metabolites (e.g. anthocyanins, flavonoids and triterpenoids), in the green seaweed sulfated polysaccharide ulvan, and in glycan structures from viruses and bacteria. Here, a comparative plasmalemma proteomic analysis was used to identify candidate L-rhamnose transporters in A. niger. Further analysis was focused on protein ID 1119135 (RhtA) (JGI A. niger ATCC 1015 genome database). RhtA was classified as a Family 7 Fucose: H+ Symporter (FHS) within the Major Facilitator Superfamily. Family 7 currently includes exclusively bacterial transporters able to use different sugars. Strong indications for its role in L-rhamnose transport were obtained by functional complementation of the Saccharomyces cerevisiae EBY.VW.4000 strain in growth studies with a range of potential substrates. Biochemical analysis using L-[3H(G)]-rhamnose confirmed that RhtA is a L-rhamnose transporter. The RhtA gene is located in tandem with a hypothetical alpha-L-rhamnosidase gene (rhaB). 
Transcriptional analysis of rhtA and rhaB confirmed that both genes have a coordinated expression, being strongly and specifically induced by L-rhamnose, and controlled by RhaR, a transcriptional regulator involved in the release and catabolism of the methyl-pentose. RhtA is the first eukaryotic L-rhamnose transporter identified and functionally validated to date.
Identification of a Novel L-rhamnose Uptake Transporter in the Filamentous Fungus Aspergillus niger
Sloothaak, Jasper; Odoni, Dorett I.; Martins dos Santos, Vitor A. P.; Schaap, Peter J.
2016-01-01
The study of plant biomass utilization by fungi is a research field of great interest due to its many implications in ecology, agriculture and biotechnology. Most of the efforts done to increase the understanding of the use of plant cell walls by fungi have been focused on the degradation of cellulose and hemicellulose, and transport and metabolism of their constituent monosaccharides. Pectin is another important constituent of plant cell walls, but has received less attention. In relation to the uptake of pectic building blocks, fungal transporters for the uptake of galacturonic acid recently have been reported in Aspergillus niger and Neurospora crassa. However, not a single L-rhamnose (6-deoxy-L-mannose) transporter has been identified yet in fungi or in other eukaryotic organisms. L-rhamnose is a deoxy-sugar present in plant cell wall pectic polysaccharides (mainly rhamnogalacturonan I and rhamnogalacturonan II), but is also found in diverse plant secondary metabolites (e.g. anthocyanins, flavonoids and triterpenoids), in the green seaweed sulfated polysaccharide ulvan, and in glycan structures from viruses and bacteria. Here, a comparative plasmalemma proteomic analysis was used to identify candidate L-rhamnose transporters in A. niger. Further analysis was focused on protein ID 1119135 (RhtA) (JGI A. niger ATCC 1015 genome database). RhtA was classified as a Family 7 Fucose: H+ Symporter (FHS) within the Major Facilitator Superfamily. Family 7 currently includes exclusively bacterial transporters able to use different sugars. Strong indications for its role in L-rhamnose transport were obtained by functional complementation of the Saccharomyces cerevisiae EBY.VW.4000 strain in growth studies with a range of potential substrates. Biochemical analysis using L-[3H(G)]-rhamnose confirmed that RhtA is a L-rhamnose transporter. The RhtA gene is located in tandem with a hypothetical alpha-L-rhamnosidase gene (rhaB). 
Transcriptional analysis of rhtA and rhaB confirmed that both genes have a coordinated expression, being strongly and specifically induced by L-rhamnose, and controlled by RhaR, a transcriptional regulator involved in the release and catabolism of the methyl-pentose. RhtA is the first eukaryotic L-rhamnose transporter identified and functionally validated to date. PMID:27984587
Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.
2014-01-01
The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599
Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S
2014-07-01
The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.
Genomics and Public Health Research: Can the State Allow Access to Genomic Databases?
Cousineau, J; Girard, N; Monardes, C; Leroux, T; Jean, M Stanton
2012-01-01
Because many diseases are multifactorial disorders, the scientific progress in genomics and genetics should be taken into consideration in public health research. In this context, genomic databases will constitute an important source of information. Consequently, it is important to identify and characterize the State’s role and authority on matters related to public health, in order to verify whether it has access to such databases while engaging in public health genomic research. We first consider the evolution of the concept of public health, as well as its core functions, using a comparative approach (e.g. WHO, PAHO, CDC and the Canadian province of Quebec). Following an analysis of relevant Quebec legislation, the precautionary principle is examined as a possible avenue to justify State access to and use of genomic databases for research purposes. Finally, we consider the influenza pandemic plans developed by WHO, Canada, and Quebec as examples of key tools framing the public health decision-making process. We observed that State powers in public health are not, in Quebec, well adapted to the expansion of genomics research. We propose that the scope of the concept of research in public health should be clear and include the following characteristics: a commitment to the health and well-being of the population and to their determinants; the inclusion of both applied research and basic research; and an appropriate model of governance (authorization, follow-up, consent, etc.). We also suggest that the strategic approach version of the precautionary principle could guide collective choices in these matters. PMID:23113174
LDSplitDB: a database for studies of meiotic recombination hotspots in MHC using human genomic data.
Guo, Jing; Chen, Hao; Yang, Peng; Lee, Yew Ti; Wu, Min; Przytycka, Teresa M; Kwoh, Chee Keong; Zheng, Jie
2018-04-20
Meiotic recombination occurs during meiosis, when chromosomes inherited from the two parents exchange genetic material to generate the chromosomes in the gamete cells. Recombination events tend to occur in narrow genomic regions called recombination hotspots. Dysregulation of recombination could lead to serious human diseases such as birth defects. Although the regulatory mechanism of recombination events is still unclear, DNA sequence polymorphisms have been found to play crucial roles in the regulation of recombination hotspots. To facilitate studies of the underlying mechanism, we developed a database named LDSplitDB, which provides an integrative and interactive data mining and visualization platform for genome-wide association studies of recombination hotspots. It contains the pre-computed association maps of the major histocompatibility complex (MHC) region in the 1000 Genomes Project and the HapMap Phase III datasets, and a genome-scale study of the European population from the HapMap Phase II dataset. Besides the recombination profiles, related data on genes, SNPs and different types of epigenetic modifications, which could be associated with meiotic recombination, are provided for comprehensive analysis. To meet the computational requirements of the rapidly increasing population genomics data, we prepared a lookup table of 400 haplotypes, which includes all possible two-locus haplotype configurations, for recombination rate estimation using the well-known LDhat algorithm. To the best of our knowledge, LDSplitDB is the first large-scale database for the association analysis of human recombination hotspots with DNA sequence polymorphisms. It provides valuable resources for the discovery of the mechanism of meiotic recombination hotspots. The information about MHC in this database could help to understand the roles of recombination in the human immune system. DATABASE URL: http://histone.scse.ntu.edu.sg/LDSplitDB.