Science.gov

Sample records for isoapostatic gene clusters

  1. Persistence drives gene clustering in bacterial genomes

    PubMed Central

    Fang, Gang; Rocha, Eduardo PC; Danchin, Antoine

    2008-01-01

    Background: Gene clustering plays an important role in the organization of the bacterial chromosome and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering did not take into account the function of the gene products or the likely presence or absence of a given gene in a genome. However, genomes harbor two very different categories of genes: those genes present in a majority of organisms – persistent genes – and those present in very few organisms – rare genes. Results: We show that two classes of genes are significantly clustered in bacterial genomes: the highly persistent and the rare genes. The clustering of rare genes is readily explained by the selfish operon theory. Yet, genes persistently present in bacterial genomes are also clustered and we try to understand why. We propose a model accounting specifically for such clustering, and show that indispensability in a genome with frequent gene deletion and insertion leads to the transient clustering of these genes. The model describes how clusters are created via the gene flux that continuously introduces new genes while deleting others. We then test if known selective processes, such as co-transcription, physical interaction or functional neighborhood, account for the stabilization of these clusters. Conclusion: We show that it is the strong selective pressure acting on the function of persistent genes, in a permanent state of gene flux that keeps the size of bacterial genomes fairly constant, that drives the clustering of persistent genes. A further selective stabilization process might contribute to maintaining the clustering. PMID:18179692

  2. Evolution of the Aflatoxin Gene Cluster

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Why Aspergillus species produce aflatoxin remains an unsolved question. In this report, we suggest that evolution of the aflatoxin biosynthesis gene cluster has been a multistep process. More than 300 million years ago, a primordial cluster of genes allowed production of anthraquinones that may ha...

  3. Biological cluster evaluation for gene function prediction.

    PubMed

    Klie, Sebastian; Nikoloski, Zoran; Selbig, Joachim

    2014-06-01

    Recent advances in high-throughput omics techniques render it possible to decode the function of genes by using the "guilt-by-association" principle on biologically meaningful clusters of gene expression data. However, the existing frameworks for biological evaluation of gene clusters are hindered by two bottleneck issues: (1) the choice of the number of clusters, and (2) the external measures, which do not take into consideration the structure of the analyzed data and the ontology of the existing biological knowledge. Here, we address the identified bottlenecks by developing a novel framework that allows not only for biological evaluation of gene expression clusters based on existing structured knowledge, but also for prediction of putative gene functions. The proposed framework facilitates propagation of statistical significance at each of the following steps: (1) estimating the number of clusters, (2) evaluating the clusters in terms of novel external structural measures, (3) selecting an optimal clustering algorithm, and (4) predicting gene functions. The framework also includes a method for evaluation of gene clusters based on the structure of the employed ontology. Moreover, our method for obtaining a probabilistic range for the number of clusters is shown to be valid on synthetic data and available gene expression profiles from Saccharomyces cerevisiae. Finally, we propose a network-based approach for gene function prediction which relies on the clustering of optimal score and the employed ontology. Our approach effectively predicts gene function on the Saccharomyces cerevisiae data set and is also employed to obtain putative gene functions for an Arabidopsis thaliana data set. PMID:20059365

  4. Pichia stipitis genomics, transcriptomics, and gene clusters

    PubMed Central

    Jeffries, Thomas W; Van Vleet, Jennifer R Headman

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the result of several gene products acting together. When coinheritance is necessary for the overall physiological function, recombination and selection favor colocation of these genes in a cluster. These are particularly evident in strongly conserved and idiomatic traits. In some cases, the functional clusters consist of multiple gene families. Phylogenetic analyses of the members in each family show that once formed, functional clusters undergo duplication and differentiation. Genome-wide expression analysis reveals that regulatory patterns of clusters are similar after they have duplicated and that the expression profiles evolve along with functional differentiation of the clusters. Orthologous gene families appear to arise through tandem gene duplication, followed by differentiation in the regulatory and coding regions of the gene. Genome-wide expression analysis combined with cross-species comparisons of functional gene clusters should reveal many more aspects of eukaryotic physiology. PMID:19659741

  5. Clustering of High Throughput Gene Expression Data

    PubMed Central

    Pirim, Harun; Ekşioğlu, Burak; Perkins, Andy; Yüceer, Çetin

    2012-01-01

    High throughput biological data need to be processed, analyzed, and interpreted to address problems in life sciences. Bioinformatics, computational biology, and systems biology deal with biological problems using computational methods. Clustering is one of the methods used to gain insight into biological processes, particularly at the genomics level. Clearly, clustering can be used in many areas of biological data analysis. However, this paper presents a review of the current clustering algorithms designed especially for analyzing gene expression data. It is also intended to introduce one of the main problems in bioinformatics - clustering gene expression data - to the operations research community. PMID:23144527

  6. Clustering of gene ontology terms in genomes.

    PubMed

    Tiirikka, Timo; Siermala, Markku; Vihinen, Mauno

    2014-10-25

    Although protein coding genes occupy only a small fraction of genomes in higher species, they are not randomly distributed within or between chromosomes. Clustering of genes with related function(s) and/or characteristics has been evident at several different levels. To study how common the clustering of functionally related genes is and what kind of functions the end products of these genes are involved in, we collected gene ontology (GO) terms for complete genomes and developed a method to detect previously undefined gene clustering. Exhaustive analysis was performed for seven widely studied species ranging from human to Escherichia coli. To overcome problems related to varying gene lengths and densities, a novel method was developed and a fixed number of genes were analyzed irrespective of the genome span covered. Statistically highly significant GO term clustering was apparent in all the investigated genomes. The analysis window, which ranged from 5 to 50 consecutive genes, revealed extensive GO term clusters for genes with widely varying functions. Here, the most interesting and significant results are discussed and the complete dataset for each analyzed species is available at the GOme database at http://bioinf.uta.fi/GOme. The results indicated that clusters of genes with related functions are very common, not only in bacteria, in which operons are frequent, but also in all the studied species irrespective of how complex they are. There are some differences between species but in all of them GO term clusters are common and of widely differing sizes. The presented method can be applied to analyze any genome or part of a genome for which descriptive features are available, and thus is not restricted to ontology terms. This method can also be applied to investigate gene and protein expression patterns. The results pave the way for further studies of mechanisms that shape genome structure and evolutionary forces related to them. PMID:24995610
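
    The fixed-gene-count window described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name, the input layout (one set of term labels per gene, in chromosomal order) and the toy term labels are assumptions, and the statistical significance testing mentioned in the abstract is not shown. The sketch only slides a window of a fixed number of consecutive genes along the gene order and records, for each term, the largest number of genes carrying it in any window.

```python
from collections import Counter


def max_term_count_per_window(gene_terms, window):
    """For each term, return the largest number of genes carrying it
    within any run of `window` consecutive genes.

    gene_terms: list of sets of term labels, one set per gene,
                in chromosomal order (hypothetical input layout).
    """
    if window < 1 or window > len(gene_terms):
        raise ValueError("window must be between 1 and the number of genes")

    # Term counts for the first window.
    counts = Counter()
    for terms in gene_terms[:window]:
        counts.update(terms)
    best = dict(counts)

    # Slide the window one gene at a time: drop the gene that leaves,
    # add the gene that enters, and update the per-term maxima.
    for i in range(window, len(gene_terms)):
        counts.subtract(gene_terms[i - window])
        counts.update(gene_terms[i])
        for term in gene_terms[i]:
            if counts[term] > best.get(term, 0):
                best[term] = counts[term]
    return best


# Toy example with made-up term labels.
genes = [{"termA"}, {"termA", "termB"}, {"termC"}, {"termB"}, {"termB"}]
result = max_term_count_per_window(genes, window=2)
print(result)
```

    Because the window is defined by gene count rather than by base pairs, the same code applies to gene-dense and gene-sparse genomes, which is the motivation the abstract gives for that design.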

  7. Chicken rRNA Gene Cluster Structure

    PubMed Central

    Dyomin, Alexander G.; Koshel, Elena I.; Kiselev, Artem M.; Saifitdinova, Alsu F.; Galkina, Svetlana A.; Fukagawa, Tatsuo; Kostareva, Anna A.

    2016-01-01

    Ribosomal RNA (rRNA) genes, whose activity results in nucleolus formation, constitute an extremely important part of the genome. Despite the extensive exploration of avian genomes, no complete description of the primary structure of avian rRNA genes has been offered so far. We publish a complete chicken rRNA gene cluster sequence here, including 5’ETS (1836 bp), 18S rRNA gene (1823 bp), ITS1 (2530 bp), 5.8S rRNA gene (157 bp), ITS2 (733 bp), 28S rRNA gene (4441 bp) and 3’ETS (343 bp). The rRNA gene cluster sequence of 11863 bp was assembled from raw reads and deposited in GenBank under accession number KT445934. The assembly was validated through fluorescence in situ hybridization analysis on chicken metaphase chromosomes using computed and synthesized specific probes, as well as through the reference assembly against de novo assembled rRNA gene cluster sequence using sequenced fragments of BAC-clone containing chicken NOR (nucleolus organizer region). The results have confirmed the chicken rRNA gene cluster validity. PMID:27299357
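
    As a quick consistency check on the figures quoted in this abstract, the seven component lengths can be summed and compared with the reported cluster length. The dictionary keys below are only labels chosen for this sketch; the numbers are taken from the abstract.

```python
# Component lengths in base pairs, as listed in the abstract.
components = {
    "5' ETS": 1836,
    "18S rRNA gene": 1823,
    "ITS1": 2530,
    "5.8S rRNA gene": 157,
    "ITS2": 733,
    "28S rRNA gene": 4441,
    "3' ETS": 343,
}

# The abstract reports a total cluster length of 11863 bp.
total_bp = sum(components.values())
print(total_bp)  # 11863
```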

  8. Clustering gene expression data using graph separators.

    PubMed

    Kaba, Bangaly; Pinet, Nicolas; Lelandais, Gaëlle; Sigayret, Alain; Berry, Anne

    2007-01-01

    Recent work has used graphs to model expression data from microarray experiments, with a view to partitioning the genes into clusters. In this paper, we introduce the use of a decomposition by clique separators. Our aim is to improve the classical clustering methods in two ways: first we want to allow an overlap between clusters, as this seems biologically sound, and second we want to be guided by the structure of the graph to define the number of clusters. We test this approach with a well-known yeast database (Saccharomyces cerevisiae). Our results are good, as the expression profiles of the clusters we find are very coherent. Moreover, we are able to organize the clusters we find into another graph, and order them in a fashion which turns out to respect the chronological order defined by the sporulation process. PMID:18391236

  9. Clustering Genes of Common Evolutionary History

    PubMed Central

    Gori, Kevin; Suchan, Tomasz; Alvarez, Nadir; Goldman, Nick; Dessimoz, Christophe

    2016-01-01

    Phylogenetic inference can potentially result in a more accurate tree using data from multiple loci. However, if the loci are incongruent—due to events such as incomplete lineage sorting or horizontal gene transfer—it can be misleading to infer a single tree. To address this, many previous contributions have taken a mechanistic approach, by modeling specific processes. Alternatively, one can cluster loci without assuming how these incongruencies might arise. Such “process-agnostic” approaches typically infer a tree for each locus and cluster these. There are, however, many possible combinations of tree distance and clustering methods; their comparative performance in the context of tree incongruence is largely unknown. Furthermore, because standard model selection criteria such as AIC cannot be applied to problems with a variable number of topologies, the issue of inferring the optimal number of clusters is poorly understood. Here, we perform a large-scale simulation study of phylogenetic distances and clustering methods to infer loci of common evolutionary history. We observe that the best-performing combinations are distances accounting for branch lengths followed by spectral clustering or Ward’s method. We also introduce two statistical tests to infer the optimal number of clusters and show that they strongly outperform the silhouette criterion, a general-purpose heuristic. We illustrate the usefulness of the approach by 1) identifying errors in a previous phylogenetic analysis of yeast species and 2) identifying topological incongruence among newly sequenced loci of the globeflower fly genus Chiastocheta. We release treeCl, a new program to cluster genes of common evolutionary history (http://git.io/treeCl). PMID:26893301
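
    The silhouette criterion, which this abstract uses as the general-purpose baseline, can be sketched on a precomputed distance matrix. This is not treeCl code and it does not implement the two statistical tests the authors introduce. In the setting of the abstract the distances would be tree distances between loci; the numbers below are toy values, and the function name is an assumption of this sketch.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette width for a clustering, given a symmetric
    distance matrix `dist` (list of lists) and one label per item."""
    n = len(labels)
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i in range(n):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton cluster: silhouette is 0 by convention
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / n


# Toy data: four items at positions 0, 1, 10 and 11 on a line.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]

good = mean_silhouette(dist, [0, 0, 1, 1])  # groups the close pairs together
bad = mean_silhouette(dist, [0, 1, 0, 1])   # splits the close pairs apart
print(good > bad)  # True
```

    To choose a number of clusters with this heuristic, one would compute the score for each candidate clustering and keep the one with the highest mean silhouette.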

  10. Clustering Genes of Common Evolutionary History.

    PubMed

    Gori, Kevin; Suchan, Tomasz; Alvarez, Nadir; Goldman, Nick; Dessimoz, Christophe

    2016-06-01

    Phylogenetic inference can potentially result in a more accurate tree using data from multiple loci. However, if the loci are incongruent, due to events such as incomplete lineage sorting or horizontal gene transfer, it can be misleading to infer a single tree. To address this, many previous contributions have taken a mechanistic approach, by modeling specific processes. Alternatively, one can cluster loci without assuming how these incongruencies might arise. Such "process-agnostic" approaches typically infer a tree for each locus and cluster these. There are, however, many possible combinations of tree distance and clustering methods; their comparative performance in the context of tree incongruence is largely unknown. Furthermore, because standard model selection criteria such as AIC cannot be applied to problems with a variable number of topologies, the issue of inferring the optimal number of clusters is poorly understood. Here, we perform a large-scale simulation study of phylogenetic distances and clustering methods to infer loci of common evolutionary history. We observe that the best-performing combinations are distances accounting for branch lengths followed by spectral clustering or Ward's method. We also introduce two statistical tests to infer the optimal number of clusters and show that they strongly outperform the silhouette criterion, a general-purpose heuristic. We illustrate the usefulness of the approach by 1) identifying errors in a previous phylogenetic analysis of yeast species and 2) identifying topological incongruence among newly sequenced loci of the globeflower fly genus Chiastocheta. We release treeCl, a new program to cluster genes of common evolutionary history (http://git.io/treeCl). PMID:26893301

  11. The rise of operon-like gene clusters in plants.

    PubMed

    Boycheva, Svetlana; Daviet, Laurent; Wolfender, Jean-Luc; Fitzpatrick, Teresa B

    2014-07-01

    Gene clusters are common features of prokaryotic genomes also present in eukaryotes. Most clustered genes known are involved in the biosynthesis of secondary metabolites. Although horizontal gene transfer is a primary source of prokaryotic gene cluster (operon) formation and has been reported to occur in eukaryotes, the predominant source of cluster formation in eukaryotes appears to arise de novo or through gene duplication followed by neo- and sub-functionalization or translocation. Here we aim to provide an overview of the current knowledge and open questions related to plant gene cluster functioning, assembly, and regulation. We also present potential research approaches and point out the benefits of a better understanding of gene clusters in plants for both fundamental and applied plant science. PMID:24582794

  12. An approach for clustering gene expression data with error information

    PubMed Central

    Tjaden, Brian

    2006-01-01

    Background: Clustering of gene expression patterns is a well-studied technique for elucidating trends across large numbers of transcripts and for identifying likely co-regulated genes. Even the best clustering methods, however, are unlikely to provide meaningful results if too much of the data is unreliable. With the maturation of microarray technology, a wealth of research on statistical analysis of gene expression data has encouraged researchers to consider error and uncertainty in their microarray experiments, so that experiments are being performed increasingly with repeat spots per gene per chip and with repeat experiments. One of the challenges is to incorporate the measurement error information into downstream analyses of gene expression data, such as traditional clustering techniques. Results: In this study, a clustering approach is presented which incorporates both gene expression values and error information about the expression measurements. Using repeat expression measurements, the error of each gene expression measurement in each experiment condition is estimated, and this measurement error information is incorporated directly into the clustering algorithm. The algorithm, CORE (Clustering Of Repeat Expression data), is presented and its performance is validated using statistical measures. By using error information about gene expression measurements, the clustering approach is less sensitive to noise in the underlying data and it is able to achieve more accurate clusterings. Results are described for both synthetic expression data as well as real gene expression data from Escherichia coli and Saccharomyces cerevisiae. Conclusion: The additional information provided by replicate gene expression measurements is a valuable asset in effective clustering. Gene expression profiles with high errors, as determined from repeat measurements, may be unreliable and may associate with different clusters, whereas gene expression profiles with low errors can be

  13. Inferring the Recent Duplication History of a Gene Cluster

    NASA Astrophysics Data System (ADS)

    Song, Giltae; Zhang, Louxin; Vinař, Tomáš; Miller, Webb

    Much important evolutionary activity occurs in gene clusters, where a copy of a gene may be free to evolve new functions. Computational methods to extract evolutionary information from sequence data for such clusters are currently imperfect, in part because accurate sequence data are often lacking in these genomic regions, making the existing methods difficult to apply. We describe a new method for reconstructing the recent evolutionary history of gene clusters. The method’s performance is evaluated on simulated data and on actual human gene clusters.

  14. Super-paramagnetic clustering of yeast gene expression profiles

    NASA Astrophysics Data System (ADS)

    Getz, G.; Levine, E.; Domany, E.; Zhang, M. Q.

    2000-04-01

    High-density DNA arrays, used to monitor gene expression at a genomic scale, have produced vast amounts of information which require the development of efficient computational methods to analyze them. The important first step is to extract the fundamental patterns of gene expression inherent in the data. This paper describes the application of a novel clustering algorithm, super-paramagnetic clustering (SPC) to analysis of gene expression profiles that were generated recently during a study of the yeast cell cycle. SPC was used to organize genes into biologically relevant clusters that are suggestive for their co-regulation. Some of the advantages of SPC are its robustness against noise and initialization, a clear signature of cluster formation and splitting, and an unsupervised self-organized determination of the number of clusters at each resolution. Our analysis revealed interesting correlated behavior of several groups of genes which has not been previously identified.

  15. Prokaryotic Gene Clusters: A Rich Toolbox for Synthetic Biology

    PubMed Central

    Fischbach, Michael; Voigt, Christopher A.

    2014-01-01

    Bacteria construct elaborate nanostructures, obtain nutrients and energy from diverse sources, synthesize complex molecules, and implement signal processing to react to their environment. These complex phenotypes require the coordinated action of multiple genes, which are often encoded in a contiguous region of the genome, referred to as a gene cluster. Gene clusters sometimes contain all of the genes necessary and sufficient for a particular function. As an evolutionary mechanism, gene clusters facilitate the horizontal transfer of the complete function between species. Here, we review recent work on a number of clusters whose functions are relevant to biotechnology. Engineering these clusters has been hindered by their regulatory complexity, the need to balance the expression of many genes, and a lack of tools to design and manipulate DNA at this scale. Advances in synthetic biology will enable the large-scale bottom-up engineering of the clusters to optimize their functions, wake up cryptic clusters, or to transfer them between organisms. Understanding and manipulating gene clusters will move towards an era of genome engineering, where multiple functions can be “mixed-and-matched” to create a designer organism. PMID:21154668

  16. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis

    PubMed Central

    Noar, Roslyn D.; Daub, Margaret E.

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  17. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    PubMed

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  18. A Large Gene Cluster for the Clostridium cellulovorans Cellulosome

    PubMed Central

    Tamaru, Yutaka; Karita, Shuichi; Ibrahim, Atef; Chan, Helen; Doi, Roy H.

    2000-01-01

    A large gene cluster for the Clostridium cellulovorans cellulosome has been cloned and sequenced upstream and downstream of the cbpA and exgS genes (C.-C. Liu and R. H. Doi, Gene 211:39–47, 1998). Gene walking revealed that the engL gene cluster (Y. Tamaru and R. H. Doi, J. Bacteriol. 182:244–247, 2000) was located downstream of the cbpA-exgS genes. Further DNA sequencing revealed that this cluster contains the genes for the scaffolding protein CbpA, the exoglucanase ExgS, several endoglucanases of family 9, the mannanase ManA, and the hydrophobic protein HbpA containing a surface layer homology domain and a hydrophobic (or cohesin) domain. The sequence of the clustered genes is cbpA-exgS-engH-engK-hbpA-engL-manA-engM-engN and is about 22 kb in length. The engN gene did not have a complete catalytic domain, indicating that engN is a truncated gene. This large gene cluster is flanked at the 5′ end by a putative noncellulosomal operon consisting of nifV-orf1-sigX-regA and at the 3′ end by noncellulosomal genes with homology to transposase (trp) and malate permease (mle). Since gene clusters for the cellulosome are also found in C. cellulolyticum and C. josui, they seem to be typical of mesophilic clostridia, indicating that the large gene clusters may arise from a common ancestor with some evolutionary modifications. PMID:11004194

  19. A knowledge-based clustering algorithm driven by Gene Ontology.

    PubMed

    Cheng, Jill; Cline, Melissa; Martin, John; Finkelstein, David; Awad, Tarif; Kulp, David; Siani-Rose, Michael A

    2004-08-01

    We have developed an algorithm for inferring the degree of similarity between genes by using the graph-based structure of Gene Ontology (GO). We applied this knowledge-based similarity metric to a clique-finding algorithm for detecting sets of related genes with biological classifications. We also combined it with an expression-based distance metric to produce a co-cluster analysis, which accentuates genes with both similar expression profiles and similar biological characteristics and identifies gene clusters that are more stable and biologically meaningful. These algorithms are demonstrated in the analysis of MPRO cell differentiation time series experiments. PMID:15468759

  20. Sesterterpene ophiobolin biosynthesis involving multiple gene clusters in Aspergillus ustus

    PubMed Central

    Chai, Hangzhen; Yin, Ru; Liu, Yongfeng; Meng, Huiying; Zhou, Xianqiang; Zhou, Guolin; Bi, Xupeng; Yang, Xue; Zhu, Tonghan; Zhu, Weiming; Deng, Zixin; Hong, Kui

    2016-01-01

    Terpenoids are the most diverse and abundant natural products, among which sesterterpenes account for less than 2%, with very few reports on their biosynthesis. Ophiobolins are tricyclic 5–8–5 ring sesterterpenes with potential pharmaceutical application. Aspergillus ustus 094102 from mangrove rhizosphere produces ophiobolin and other terpenes. We obtained five gene cluster knockout mutants with altered ophiobolin yield using genome sequencing and in silico analysis, combined with in vivo genetic manipulation. Involvement of the five gene clusters in ophiobolin synthesis was confirmed by investigation of the five key terpene synthesis relevant enzymes in each gene cluster, either by gene deletion and complementation or in vitro verification of protein function. The results demonstrate that ophiobolin skeleton biosynthesis involves five gene clusters, which are responsible for C15, C20, C25, and C30 terpenoid biosynthesis. PMID:27273151

  1. Sesterterpene ophiobolin biosynthesis involving multiple gene clusters in Aspergillus ustus.

    PubMed

    Chai, Hangzhen; Yin, Ru; Liu, Yongfeng; Meng, Huiying; Zhou, Xianqiang; Zhou, Guolin; Bi, Xupeng; Yang, Xue; Zhu, Tonghan; Zhu, Weiming; Deng, Zixin; Hong, Kui

    2016-01-01

    Terpenoids are the most diverse and abundant natural products, among which sesterterpenes account for less than 2%, with very few reports on their biosynthesis. Ophiobolins are tricyclic 5-8-5 ring sesterterpenes with potential pharmaceutical application. Aspergillus ustus 094102 from mangrove rhizosphere produces ophiobolin and other terpenes. We obtained five gene cluster knockout mutants with altered ophiobolin yield using genome sequencing and in silico analysis, combined with in vivo genetic manipulation. Involvement of the five gene clusters in ophiobolin synthesis was confirmed by investigation of the five key terpene synthesis relevant enzymes in each gene cluster, either by gene deletion and complementation or in vitro verification of protein function. The results demonstrate that ophiobolin skeleton biosynthesis involves five gene clusters, which are responsible for C15, C20, C25, and C30 terpenoid biosynthesis. PMID:27273151

  2. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    PubMed Central

    Huttenhower, Curtis; Flamholz, Avi I; Landis, Jessica N; Sahi, Sauhard; Myers, Chad L; Olszewski, Kellen L; Hibbs, Matthew A; Siemers, Nathan O; Troyanskaya, Olga G; Coller, Hilary A

    2007-01-01

    Background The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. 
It is particularly attractive due to its simplicity, its success in the analysis of large datasets
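
    The mutual-nearest-neighbor idea described in this abstract can be illustrated with a minimal sketch. It covers only the construction of the mutual-neighbor graph, not the overlapping-clique step the authors build on top of it. The use of Pearson correlation as the similarity measure, the parameter `k`, and the function names are assumptions made for illustration and are not taken from the NNN publication.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mutual_nearest_neighbors(profiles, k):
    """Return gene pairs (i, j), i < j, that appear in each other's top-k lists.

    Hypothetical helper: genes with no reciprocated neighbor get no edge,
    so they stay unclustered in any downstream step.
    """
    n = len(profiles)
    neighbors = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Rank the other genes by similarity to gene i, most similar first.
        others.sort(key=lambda j: pearson(profiles[i], profiles[j]), reverse=True)
        neighbors.append(set(others[:k]))
    return {
        (i, j)
        for i in range(n)
        for j in neighbors[i]
        if i < j and i in neighbors[j]
    }
```

    Requiring the relationship to hold in both directions is what lets a gene with no sufficiently similar partner fall out of the graph instead of being forced into the closest cluster.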

  3. Genomic analyses of bacterial porin-cytochrome gene clusters

    DOE PAGESBeta

    Shi, Liang; Fredrickson, James K.; Zachara, John M.

    2014-11-26

    In this study, the porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with substrates other than Fe(III) and Mn(IV) oxides.

  4. Genomic Gene Clustering Analysis of Pathways in Eukaryotes

    PubMed Central

    Lee, Jennifer M.; Sonnhammer, Erik L.L.

    2003-01-01

    Genomic clustering of genes in a pathway is commonly found in prokaryotes due to transcriptional operons, but these are not present in most eukaryotes. Yet, there might be clustering, to a lesser extent, of pathway members in eukaryotic genomes that assists coregulation of a set of functionally cooperating genes. We analyzed five sequenced eukaryotic genomes for clustering of genes assigned to the same pathway in the KEGG database. Between 98% and 30% of the analyzed pathways in a genome were found to exhibit significantly higher clustering levels than expected by chance. In descending order by the level of clustering, the genomes studied were Saccharomyces cerevisiae, Homo sapiens, Caenorhabditis elegans, Arabidopsis thaliana, and Drosophila melanogaster. Surprisingly, there is not much agreement between genomes in terms of which pathways are most clustered. Only seven of 69 pathways found in all species were significantly clustered in all five of them. This species-specific pattern of pathway clustering may reflect adaptations or evolutionary events unique to a particular lineage. We note that although operons are common in C. elegans, only 58% of the pathways showed significant clustering, which is less than in human. Virtually all pathways in S. cerevisiae showed significant clustering. PMID:12695325
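
    A test of whether a pathway's genes are "more clustered than expected by chance" can be sketched as a permutation test. This is an illustrative sketch only: the clustering statistic (number of gene pairs within `window` positions in gene order), the single linear chromosome, and all names and defaults are assumptions, not the procedure used in the paper.

```python
import random
from itertools import combinations

def close_pairs(positions, window):
    """Count gene pairs whose gene-order indices differ by at most `window`."""
    return sum(1 for a, b in combinations(positions, 2) if abs(a - b) <= window)

def clustering_p_value(pathway_positions, genome_size, window=2, n_perm=1000, seed=0):
    """Empirical p-value for pathway clustering (hypothetical helper).

    Compares the observed statistic with that of random gene sets of the
    same size drawn from a genome of `genome_size` ordered genes.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = close_pairs(pathway_positions, window)
    m = len(pathway_positions)
    hits = sum(
        1
        for _ in range(n_perm)
        if close_pairs(rng.sample(range(genome_size), m), window) >= observed
    )
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)
```

    Under these assumptions, a pathway whose genes sit side by side yields a small p-value, while a pathway spread evenly along the chromosome does not.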

  5. The duplication of the Hox gene clusters in teleost fishes.

    PubMed

    Prohaska, Sonja J; Stadler, Peter F

    2004-06-01

    Higher teleost fishes, including zebrafish and fugu, have duplicated their Hox genes relative to the gene inventory of other gnathostome lineages. The most widely accepted theory contends that the duplicate Hox clusters originated synchronously during a single genome duplication event in the early history of ray-finned fishes. In this contribution we collect and re-evaluate all publicly available sequence information. In particular, we show that the short Hox gene fragments from published PCR surveys of the killifish Fundulus heteroclitus, the medaka Oryzias latipes and the goldfish Carassius auratus can be used to determine with little ambiguity not only their paralog group but also their membership in a particular cluster. Together with a survey of the genomic sequence data from the pufferfish Tetraodon nigroviridis we show that at least percomorpha, and possibly all euteleosts, share a system of 7 or 8 orthologous Hox gene clusters. There is little doubt about the orthology of the two teleost duplicates of the HoxA and HoxB clusters. A careful analysis of both the coding sequence of Hox genes and of conserved non-coding sequences provides additional support for the "duplication early" hypothesis that the Hox clusters in teleosts are derived from eight ancestral clusters by means of subsequent gene loss; the data remain ambiguous, however, in particular for the HoxC clusters. Assuming the "duplication early" hypothesis we use the new evidence on the Hox gene complements to determine the phylogenetic positions of gene-loss events in the wake of the cluster duplication. Surprisingly, we find that the resolution of redundancy seems to be a slow process that is still ongoing. A few suggestions on which additional sequence data would be most informative for resolving the history of the teleostean Hox genes are discussed. PMID:18202881

  6. Biologically supervised hierarchical clustering algorithms for gene expression data.

    PubMed

    Boratyn, Grzegorz M; Datta, Susmita; Datta, Somnath

    2006-01-01

    Cluster analysis has become a standard part of gene expression analysis. In this paper, we propose a novel semi-supervised approach that offers the same flexibility as that of a hierarchical clustering. Yet it utilizes, along with the experimental gene expression data, common biological information about different genes that is being compiled at various public, Web-accessible databases. We argue that such an approach is inherently superior to the standard unsupervised approach of grouping genes based on expression data alone. It is shown that our biologically supervised methods produce better clustering results than the corresponding unsupervised methods as judged by the distance from the model temporal profiles. R-codes of the clustering algorithm are available from the authors upon request. PMID:17947147

  7. SMART: unique splitting-while-merging framework for gene clustering.

    PubMed

    Fa, Rui; Roberts, David J; Nandi, Asoke K

    2014-01-01

    Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named "splitting merging awareness tactics" (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset to a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested in demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Based on the performance of many metrics, all numerical results show that SMART is superior to compared existing self-splitting algorithms and traditional algorithms. Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms. PMID:24714159

  8. Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

    PubMed Central

    Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417

  9. Characterization of the largest effector gene cluster of Ustilago maydis.

    PubMed

    Brefort, Thomas; Tanaka, Shigeyuki; Neidig, Nina; Doehlemann, Gunther; Vincon, Volker; Kahmann, Regine

    2014-07-01

    In the genome of the biotrophic plant pathogen Ustilago maydis, many of the genes coding for secreted protein effectors modulating virulence are arranged in gene clusters. The vast majority of these genes encode novel proteins whose expression is coupled to plant colonization. The largest of these gene clusters, cluster 19A, encodes 24 secreted effectors. Deletion of the entire cluster results in severe attenuation of virulence. Here we present the functional analysis of this genomic region. We show that a 19A deletion mutant behaves like an endophyte, i.e. is still able to colonize plants and complete the infection cycle. However, tumors, the most conspicuous symptoms of maize smut disease, are only rarely formed and fungal biomass in infected tissue is significantly reduced. The generation and analysis of strains carrying sub-deletions identified several genes significantly contributing to tumor formation after seedling infection. Another of the effectors could be linked specifically to anthocyanin induction in the infected tissue. As the individual contributions of these genes to tumor formation were small, we studied the response of maize plants to the whole cluster mutant as well as to several individual mutants by array analysis. This revealed distinct plant responses, demonstrating that the respective effectors have discrete plant targets. We propose that the analysis of plant responses to effector mutant strains that lack a strong virulence phenotype may be a general way to visualize differences in effector function. PMID:24992561

  10. Biosynthetic Gene Cluster for the Polyenoyltetramic Acid α-Lipomycin

    PubMed Central

    Bihlmaier, C.; Welle, E.; Hofmann, C.; Welzel, K.; Vente, A.; Breitling, E.; Müller, M.; Glaser, S.; Bechthold, A.

    2006-01-01

    The gram-positive bacterium Streptomyces aureofaciens Tü117 produces the acyclic polyene antibiotic α-lipomycin. The entire biosynthetic gene cluster (lip gene cluster) was cloned and characterized. DNA sequence analysis of a 74-kb region revealed the presence of 28 complete open reading frames (ORFs), 22 of them belonging to the biosynthetic gene cluster. Central to the cluster is a polyketide synthase locus that encodes an eight-module system comprised of four multifunctional proteins. In addition, one ORF shows homology to those for nonribosomal peptide synthetases, indicating that α-lipomycin belongs to the classification of hybrid peptide-polyketide natural products. Furthermore, the lip cluster includes genes responsible for the formation and attachment of d-digitoxose as well as ORFs that resemble those for putative regulatory and export functions. We generated biosynthetic mutants by insertional gene inactivation. By analysis of culture extracts of these mutants, we could prove that, indeed, the genes involved in the biosynthesis of lipomycin had been cloned, and additionally we gained insight into an unusual biosynthesis pathway. PMID:16723573

  11. Detecting sequence homology at the gene cluster level with MultiGeneBlast.

    PubMed

    Medema, Marnix H; Takano, Eriko; Breitling, Rainer

    2013-05-01

    The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The contextualization offered by MultiGeneBlast allows users to get a better understanding of the function, evolutionary history, and practical applications of such genomic regions. The tool is fully equipped with applications to generate search databases from GenBank or from the user's own sequence data. Finally, an architecture search mode allows searching for gene clusters with novel configurations, by detecting genomic regions with any user-specified combination of genes. Sources, precompiled binaries, and a graphical tutorial of MultiGeneBlast are freely available from http://multigeneblast.sourceforge.net/. PMID:23412913

  12. Clustered Genes Involved in Cyclopiazonic Acid Production are Next to the Aflatoxin Biosynthesis Gene Cluster in Aspergillus flavus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cyclopiazonic acid (CPA), an indole-tetramic acid toxin, is produced by many species of Aspergillus and Penicillium. In addition to CPA, Aspergillus flavus produces polyketide-derived carcinogenic aflatoxins (AFs). AF biosynthesis genes form a gene cluster in a subtelomeric region. Isolates of A. fla...

  13. Genomic analyses of bacterial porin-cytochrome gene clusters

    SciTech Connect

    Shi, Liang; Fredrickson, James K.; Zachara, John M.

    2014-11-26

    In this study, the porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular

  14. Phage cluster relationships identified through single gene analysis

    PubMed Central

    2013-01-01

    Background Phylogenetic comparison of bacteriophages requires whole genome approaches such as dotplot analysis, genome pairwise maps, and gene content analysis. Currently mycobacteriophages, a highly studied phage group, are categorized into related clusters based on the comparative analysis of whole genome sequences. With the recent explosion of phage isolation, a simple method for phage cluster prediction would facilitate analysis of crude or complex samples without whole genome isolation and sequencing. The hypothesis of this study was that mycobacteriophage-cluster prediction is possible using comparison of a single, ubiquitous, semi-conserved gene. Tape Measure Protein (TMP) was selected to test the hypothesis because it is typically the longest gene in mycobacteriophage genomes and because regions within the TMP gene are conserved. Results A single gene, TMP, identified the known Mycobacteriophage clusters and subclusters using a Gepard dotplot comparison or a phylogenetic tree constructed from global alignment and maximum likelihood comparisons. Gepard analysis of 247 mycobacteriophage TMP sequences appropriately recovered 98.8% of the subcluster assignments that were made by whole-genome comparison. Subcluster-specific primers within TMP allow for PCR determination of the mycobacteriophage subcluster from DNA samples. Using the single-gene comparison approach for siphovirus coliphages, phage groupings by TMP comparison reflected relationships observed in a whole genome dotplot comparison and confirm the potential utility of this approach to another widely studied group of phages. Conclusions TMP sequence comparison and PCR results support the hypothesis that a single gene can be used for distinguishing phage cluster and subcluster assignments. TMP single-gene analysis can quickly and accurately aid in mycobacteriophage classification. PMID:23777341

  15. Identification of the Scopularide Biosynthetic Gene Cluster in Scopulariopsis brevicaulis.

    PubMed

    Lukassen, Mie Bech; Saei, Wagma; Sondergaard, Teis Esben; Tamminen, Anu; Kumar, Abhishek; Kempken, Frank; Wiebe, Marilyn G; Sørensen, Jens Laurids

    2015-07-01

    Scopularide A is a promising potent anticancer lipopeptide isolated from a marine derived Scopulariopsis brevicaulis strain. The compound consists of a reduced carbon chain (3-hydroxy-methyldecanoyl) attached to five amino acids (glycine, l-valine, d-leucine, l-alanine, and l-phenylalanine). Using the newly sequenced S. brevicaulis genome we were able to identify the putative biosynthetic gene cluster using genetic information from the structurally related emericellamide A from Aspergillus nidulans and W493-B from Fusarium pseudograminearum. The scopularide A gene cluster includes a nonribosomal peptide synthetase (NRPS1), a polyketide synthase (PKS2), a CoA ligase, an acyltransferase, and a transcription factor. Homologous recombination was low in S. brevicaulis so the local transcription factor was integrated randomly under a constitutive promoter, which led to a three to four-fold increase in scopularide A production. This indirectly verifies the identity of the proposed biosynthetic gene cluster. PMID:26184239

  16. Identification of the Scopularide Biosynthetic Gene Cluster in Scopulariopsis brevicaulis

    PubMed Central

    Lukassen, Mie Bech; Saei, Wagma; Sondergaard, Teis Esben; Tamminen, Anu; Kumar, Abhishek; Kempken, Frank; Wiebe, Marilyn G.; Sørensen, Jens Laurids

    2015-01-01

    Scopularide A is a promising potent anticancer lipopeptide isolated from a marine derived Scopulariopsis brevicaulis strain. The compound consists of a reduced carbon chain (3-hydroxy-methyldecanoyl) attached to five amino acids (glycine, l-valine, d-leucine, l-alanine, and l-phenylalanine). Using the newly sequenced S. brevicaulis genome we were able to identify the putative biosynthetic gene cluster using genetic information from the structurally related emericellamide A from Aspergillus nidulans and W493-B from Fusarium pseudograminearum. The scopularide A gene cluster includes a nonribosomal peptide synthetase (NRPS1), a polyketide synthase (PKS2), a CoA ligase, an acyltransferase, and a transcription factor. Homologous recombination was low in S. brevicaulis so the local transcription factor was integrated randomly under a constitutive promoter, which led to a three to four-fold increase in scopularide A production. This indirectly verifies the identity of the proposed biosynthetic gene cluster. PMID:26184239

  17. Genomic architecture and inheritance of human ribosomal RNA gene clusters

    PubMed Central

    Stults, Dawn M.; Killen, Michael W.; Pierce, Heather H.; Pierce, Andrew J.

    2008-01-01

    The finishing of the Human Genome Project largely completed the detailing of human euchromatic sequences; however, the most highly repetitive regions of the genome still could not be assembled. The 12 gene clusters producing the structural RNA components of the ribosome are critically important for cellular viability, yet fall into this unassembled region of the Human Genome Project. To determine the extent of human variation in ribosomal RNA gene content (rDNA) and patterns of rDNA cluster inheritance, we have determined the physical lengths of the rDNA clusters in peripheral blood white cells of healthy human volunteers. The cluster lengths exhibit striking variability between and within human individuals, ranging from 50 kb to >6 Mb, manifest essentially complete heterozygosity, and provide each person with their own unique rDNA electrophoretic karyotype. Analysis of these rDNA fingerprints in multigenerational human families demonstrates that the rDNA clusters are subject to meiotic rearrangement at a frequency >10% per cluster, per meiosis. With this high intrinsic recombinational instability, the rDNA clusters may serve as a unique paradigm of potential human genomic plasticity. PMID:18025267

  18. The Fusarium graminearum Genome Reveals More Secondary Metabolite Gene Clusters and Hints of Horizontal Gene Transfer

    PubMed Central

    Wong, Philip; Münsterkötter, Martin; Mewes, Hans-Werner; Schmeitzl, Clemens; Varga, Elisabeth; Berthiller, Franz; Adam, Gerhard; Güldener, Ulrich

    2014-01-01

    Fungal secondary metabolite biosynthesis genes are of major interest due to the pharmacological properties of their products (like mycotoxins and antibiotics). The genome of the plant pathogenic fungus Fusarium graminearum codes for a large number of candidate enzymes involved in secondary metabolite biosynthesis. However, the chemical nature of most enzymatic products of proteins encoded by putative secondary metabolism biosynthetic genes is largely unknown. Based on our analysis we present 67 gene clusters with significant enrichment of predicted secondary metabolism related enzymatic functions. 20 gene clusters with unknown metabolites exhibit strong gene expression correlation in planta and presumably play a role in virulence. Furthermore, the identification of conserved and over-represented putative transcription factor binding sites serves as additional evidence for cluster co-regulation. Orthologous cluster search provided insight into the evolution of secondary metabolism clusters. Some clusters are characteristic for the Fusarium phylum while others show evidence of horizontal gene transfer as orthologs can be found in representatives of the Botrytis or Cochliobolus lineage. The presented candidate clusters provide valuable targets for experimental examination. PMID:25333987

  19. A Resampling Based Clustering Algorithm for Replicated Gene Expression Data.

    PubMed

    Li, Han; Li, Chun; Hu, Jie; Fan, Xiaodan

    2015-01-01

    In gene expression data analysis, clustering is a fruitful exploratory technique to reveal the underlying molecular mechanism by identifying groups of co-expressed genes. To reduce the noise, usually multiple experimental replicates are performed. An integrative analysis of the full replicate data, instead of reducing the data to the mean profile, carries the promise of yielding more precise and robust clusters. In this paper, we propose a novel resampling based clustering algorithm for genes with replicated expression measurements. Assuming those replicates are exchangeable, we formulate the problem in the bootstrap framework, and aim to infer the consensus clustering based on the bootstrap samples of replicates. In our approach, we adopt the mixed effect model to accommodate the heterogeneous variances and implement a quasi-MCMC algorithm to conduct statistical inference. Experiments demonstrate that by taking advantage of the full replicate data, our algorithm produces more reliable clusters and has robust performance in diverse scenarios, especially when the data is subject to multiple sources of variance. PMID:26671802

  20. Quantitative Methylation Analysis of the PCDHB Gene Cluster.

    PubMed

    Banelli, Barbara; Romani, Massimo

    2015-01-01

    Long Range Epigenetic Silencing (LRES) is a repressed chromatin state of large chromosomal regions caused by DNA hypermethylation and histone modifications and is commonly observed in cancer. At 5q31 a LRES region of 800 kb includes three multi-gene clusters (PCDHA@, PCDHB@, and PCDHG@, respectively). Multiple lines of experimental evidence have led to the PCDHB cluster being considered a DNA methylation marker of aggressiveness in neuroblastoma, the second most common solid tumor in childhood. Because of its potential involvement not only in neuroblastoma but also in other malignancies, an easy and fast assay to screen the DNA methylation content of the PCDHB cluster might be useful for the precise stratification of the patients into risk groups and hence for choosing the most appropriate therapeutic protocol. Accordingly, we have developed a simple and cost-effective Pyrosequencing(®) assay to evaluate the methylation level of 17 genes in the protocadherin B cluster (PCDHB@). The rationale behind this Pyrosequencing assay can in principle be applied to analyze the DNA methylation level of any gene cluster with high homologies for screening purposes. PMID:26103900

  1. Cloning and Heterologous Expression of the Grecocycline Biosynthetic Gene Cluster

    PubMed Central

    Bilyk, Oksana; Sekurova, Olga N.; Zotchev, Sergey B.; Luzhetskyy, Andriy

    2016-01-01

    Transformation-associated recombination (TAR) in yeast is a rapid and inexpensive method for cloning and assembly of large DNA fragments, which relies on natural homologous recombination. Two vectors, based on p15a and F-factor replicons that can be maintained in yeast, E. coli and streptomycetes, have been constructed. These vectors have been successfully employed for assembly of the grecocycline biosynthetic gene cluster from Streptomyces sp. Acta 1362. Fragments of the cluster were obtained by PCR and transformed together with the “capture” vector into the yeast cells, yielding a construct carrying the entire gene cluster. The obtained construct was heterologously expressed in S. albus J1074, yielding several grecocycline congeners. Grecocyclines have unique structural moieties such as a disaccharide side chain, an additional amino sugar at the C-5 position and a thiol group. Enzymes from this pathway may be used for the derivatization of known active angucyclines in order to improve their desired biological properties. PMID:27410036

  2. Discovery of the lomaiviticin biosynthetic gene cluster in Salinispora pacifica

    PubMed Central

    Janso, Jeffrey E.; Haltli, Brad A.; Eustáquio, Alessandra S.; Kulowski, Kerry; Waldman, Abraham J.; Zha, Li; Nakamura, Hitomi; Bernan, Valerie S.; He, Haiyin; Carter, Guy T.; Koehn, Frank E.; Balskus, Emily P.

    2014-01-01

    The lomaiviticins are a family of cytotoxic marine natural products that have captured the attention of both synthetic and biological chemists due to their intricate molecular scaffolds and potent biological activities. Here we describe the identification of the gene cluster responsible for lomaiviticin biosynthesis in Salinispora pacifica strains DPJ-0016 and DPJ-0019 using a combination of molecular approaches and genome sequencing. The link between the lom gene cluster and lomaiviticin production was confirmed using bacterial genetics, and subsequent analysis and annotation of this cluster revealed the biosynthetic basis for the core polyketide scaffold. Additionally, we have used comparative genomics to identify candidate enzymes for several unusual tailoring events, including diazo formation and oxidative dimerization. These findings will allow further elucidation of the biosynthetic logic of lomaiviticin assembly and provide useful molecular tools for application in biocatalysis and synthetic biology. PMID:25045187

  3. Duplications of hox gene clusters and the emergence of vertebrates.

    PubMed

    Soshnikova, Natalia; Dewaele, Romain; Janvier, Philippe; Krumlauf, Robb; Duboule, Denis

    2013-06-15

    The vertebrate body plan is characterized by an increased complexity relative to that of all other chordates and large-scale gene amplifications have been associated with key morphological innovations leading to their remarkable evolutionary success. Here, we use compound full Hox clusters deletions to investigate how Hox genes duplications may have contributed to the emergence of vertebrate-specific innovations. We show that the combined deletion of HoxA and HoxB leads to an atavistic heart phenotype, suggesting that the ancestral HoxA/B cluster was co-opted to help in diversifying the complex organ in vertebrates. Other phenotypic effects observed seem to illustrate the resurgence of ancestral (plesiomorphic) features. This indicates that the duplications of Hox clusters were associated with the recruitment or formation of novel cis-regulatory controls, which were key to the evolution of many vertebrate features and hence to the evolutionary radiation of this group. PMID:23501471

  4. Cloning and Heterologous Expression of the Grecocycline Biosynthetic Gene Cluster.

    PubMed

    Bilyk, Oksana; Sekurova, Olga N; Zotchev, Sergey B; Luzhetskyy, Andriy

    2016-01-01

    Transformation-associated recombination (TAR) in yeast is a rapid and inexpensive method for cloning and assembly of large DNA fragments, which relies on natural homologous recombination. Two vectors, based on p15a and F-factor replicons that can be maintained in yeast, E. coli and streptomycetes, have been constructed. These vectors have been successfully employed for assembly of the grecocycline biosynthetic gene cluster from Streptomyces sp. Acta 1362. Fragments of the cluster were obtained by PCR and transformed together with the "capture" vector into the yeast cells, yielding a construct carrying the entire gene cluster. The obtained construct was heterologously expressed in S. albus J1074, yielding several grecocycline congeners. Grecocyclines have unique structural moieties such as a disaccharide side chain, an additional amino sugar at the C-5 position and a thiol group. Enzymes from this pathway may be used for the derivatization of known active angucyclines in order to improve their desired biological properties. PMID:27410036

  5. Identification and analysis of the resorcinomycin biosynthetic gene cluster.

    PubMed

    Ooya, Koichi; Ogasawara, Yasushi; Noike, Motoyoshi; Dairi, Tohru

    2015-01-01

    Resorcinomycin (1) is composed of a nonproteinogenic amino acid, (S)-2-(3,5-dihydroxy-4-isopropylphenyl)-2-guanidinoacetic acid (2), and glycine. A biosynthetic gene cluster was identified in a genome database of Streptoverticillium roseoverticillatum by searching for orthologs of the genes responsible for biosynthesis of pheganomycin (3), which possesses a (2)-derivative at its N-terminus. The cluster contained a gene encoding an ATP-grasp-ligase (res5), which was suggested to catalyze the peptide bond formation between 2 and glycine. A res5-deletion mutant lost 1 productivity but accumulated 2 in the culture broth. However, recombinant RES5 did not show catalytic activity to form 1 with 2 and glycine as substrates. Moreover, heterologous expression of the cluster resulted in accumulation of only 2 and no production of 1 was observed. These results suggested that a peptide with glycine at its N-terminus may be used as a nucleophile and then maturated by a peptidase encoded by a gene outside of the cluster. PMID:26034896

  6. Retention of genes in a secondary metabolite gene cluster that has degenerated in multiple lineages of the Ascomycota

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Fungal secondary metabolite (SM) gene clusters encode proteins involved in SM biosynthesis, protection against SMs, and regulation of cluster gene transcription. RNA-Seq analysis of Fusarium langsethiae (class Sordariomycetes) revealed a cluster of six genes that were highly expressed during growth...

  7. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    SciTech Connect

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Owing to their high degree of conservation, functional regions in noncoding DNA can be identified by comparing DNA sequences among evolutionarily distant genomes. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  8. Evolution of chemical diversity by coordinated gene swaps in type II polyketide gene clusters

    PubMed Central

    Hillenmeyer, Maureen E.; Vandova, Gergana A.; Berlew, Erin E.; Charkoudian, Louise K.

    2015-01-01

    Natural product biosynthetic pathways generate molecules of enormous structural complexity and exquisitely tuned biological activities. Studies of natural products have led to the discovery of many pharmaceutical agents, particularly antibiotics. Attempts to harness the catalytic prowess of biosynthetic enzyme systems, for both compound discovery and engineering, have been limited by a poor understanding of the evolution of the underlying gene clusters. We developed an approach to study the evolution of biosynthetic genes on a cluster-wide scale, integrating pairwise gene coevolution information with large-scale phylogenetic analysis. We used this method to infer the evolution of type II polyketide gene clusters, tracing the path of evolution from the single ancestor to those gene clusters surviving today. We identified 10 key gene types in these clusters, most of which were swapped in from existing cellular processes and subsequently specialized. The ancestral type II polyketide gene cluster likely comprised a core set of five genes, a roster that expanded and contracted throughout evolution. A key C24 ancestor diversified into major classes of longer and shorter chain length systems, from which a C20 ancestor gave rise to the majority of characterized type II polyketide antibiotics. Our findings reveal that (i) type II polyketide structure is predictable from its gene roster, (ii) only certain gene combinations are compatible, and (iii) gene swaps were likely a key to evolution of chemical diversity. The lessons learned about how natural selection drives polyketide chemical innovation can be applied to the rational design and guided discovery of chemicals with desired structures and properties. PMID:26499248

  9. Cluster of genes controlling proline degradation in Salmonella typhimurium.

    PubMed Central

    Ratzkin, B; Roth, J

    1978-01-01

    A cluster of genes essential for degradation of proline to glutamate (put) is located between the pyrC and pyrD loci at min 22 of the Salmonella chromosome. A series of 25 deletion mutants of this region have been isolated and used to construct a fine-structure map of the put genes. The map includes mutations affecting the proline degradative activities, proline oxidase and pyrroline-5-carboxylic dehydrogenase. Also included are mutations affecting the major proline permease and a regulatory mutation that affects both enzyme and permease production. The two enzymatic activities appear to be encoded by a single gene (putA). The regulatory mutation maps between the putA gene and the proline permease gene (putP). PMID:342507

  10. Identification of genes and gene clusters involved in mycotoxin synthesis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Research methods to identify and characterize genes involved in mycotoxin biosynthetic pathways have evolved considerably over the years. Before whole genome sequences were available (e.g. pre-genomics), work focused primarily on chemistry, biosynthetic mutant strains and molecular analysis of sing...

  11. Transcription mediated insulation and interference direct gene cluster expression switches

    PubMed Central

    Nguyen, Tania; Brown, David; Murray, Struan C; Haenni, Simon; Halstead, James M; O'Connor, Leigh; Shipkovenska, Gergana; Steinmetz, Lars M; Mellor, Jane

    2014-01-01

    In yeast, many tandemly arranged genes show peak expression in different phases of the metabolic cycle (YMC) or in different carbon sources, indicative of regulation by a bi-modal switch, but it is not clear how these switches are controlled. Using native elongating transcript analysis (NET-seq), we show that transcription itself is a component of bi-modal switches, facilitating reciprocal expression in gene clusters. HMS2, encoding a growth-regulated transcription factor, switches between sense- or antisense-dominant states that also coordinate up- and down-regulation of transcription at neighbouring genes. Engineering HMS2 reveals alternative mono-, di- or tri-cistronic and antisense transcription units (TUs), using different promoter and terminator combinations, that underlie state-switching. Promoters or terminators are excluded from functional TUs by read-through transcriptional interference, while antisense TUs insulate downstream genes from interference. We propose that the balance of transcriptional insulation and interference at gene clusters facilitates gene expression switches during intracellular and extracellular environmental change. DOI: http://dx.doi.org/10.7554/eLife.03635.001 PMID:25407679

  12. Reconstructing Histories of Complex Gene Clusters on a Phylogeny

    NASA Astrophysics Data System (ADS)

    Vinař, Tomáš; Brejová, Broňa; Song, Giltae; Siepel, Adam

    Clusters of genes that have evolved by repeated segmental duplication present difficult challenges throughout genomic analysis, from sequence assembly to functional analysis. These clusters are one of the major sources of evolutionary innovation, and they are linked to multiple diseases, including HIV and a variety of cancers. Understanding their evolutionary histories is a key to the application of comparative genomics methods in these regions of the genome. We propose a probabilistic model of gene cluster evolution on a phylogeny, and an MCMC algorithm for reconstruction of duplication histories from genomic sequences in multiple species. Several projects are underway to obtain high quality BAC-based assemblies of duplicated clusters in multiple species, and we anticipate use of our methods in their analysis. Supplementary materials are located at http://compbio.fmph.uniba.sk/suppl/09recombcg/

  13. Human metallothionein genes are clustered on chromosome 16.

    PubMed Central

    Karin, M; Eddy, R L; Henry, W M; Haley, L L; Byers, M G; Shows, T B

    1984-01-01

    The metallothioneins are a family of heavy-metal binding proteins of low molecular weight. They function in the regulation of trace metal metabolism and in the protection against toxic heavy metal ions. In man, the metallothioneins are encoded by at least 10-12 genes separated into two groups, MT-I and MT-II. To understand the genomic organization of these genes and their involvement in hereditary disorders of trace metal metabolism, we have determined their chromosomal location. Using human-mouse cell hybrids and hybridization probes derived from cloned and functional human MT1 and MT2 genes, we show that the functional human genes are clustered on human chromosome 16. Analysis of RNA from somatic cell hybrids indicated that hybrids that contained human chromosome 16 expressed both human MT1 and MT2 mRNA, and this expression is regulated by both heavy metal ions and glucocorticoid hormones. PMID:6089206

  14. Bi-clustering of Gene Expression Data Using Conditional Entropy

    NASA Astrophysics Data System (ADS)

    Olomola, Afolabi; Dua, Sumeet

    The inherent sparseness of gene expression data and the rare exhibition of similar expression patterns across a wide range of conditions make traditional clustering techniques unsuitable for gene expression analysis. Biclustering methods currently used to identify correlated gene patterns based on a subset of conditions do not effectively mine constant, coherent, or overlapping biclusters, partially because they perform poorly in the presence of noise. In this paper, we present a new methodology (BiEntropy) that combines information entropy and graph theory techniques to identify co-expressed gene patterns that are relevant to a subset of the sample. Our goal is to discover different types of biclusters in the presence of noise and to demonstrate the superiority of our method over existing methods in terms of discovering functionally enriched biclusters. We demonstrate the effectiveness of our method using both synthetic and real data.
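    The conditional-entropy idea named in this record can be illustrated with a small sketch. This is not the BiEntropy method itself, whose details the abstract does not give; it only computes H(Y | X) between two discretized expression profiles, where a value near zero means one profile is almost fully predictable from the other. The function names, the up/down threshold, and the toy expression values are all assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def conditional_entropy(y, x):
    """H(Y | X) = H(X, Y) - H(X) for two equal-length label sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    joint = list(zip(x, y))
    return entropy(joint) - entropy(x)

def discretize(values, threshold=0.0):
    """Label each value 'up' or 'down' around a threshold (illustrative choice)."""
    return ["up" if v > threshold else "down" for v in values]

# Toy log-ratio profiles over six conditions (made-up numbers).
gene_a = [1.2, 0.8, -0.5, -1.1, 0.9, -0.7]
gene_b = [2.0, 1.5, -0.2, -0.9, 1.1, -1.3]  # same up/down pattern as gene_a
gene_c = [0.4, -0.6, 0.3, -0.8, -0.2, 0.5]  # pattern unrelated to gene_a

h_b_given_a = conditional_entropy(discretize(gene_b), discretize(gene_a))
h_c_given_a = conditional_entropy(discretize(gene_c), discretize(gene_a))
print(h_b_given_a, h_c_given_a)
```

    In this toy case gene_b carries no residual uncertainty once gene_a is known, whereas gene_c retains most of its one bit of entropy, which is the kind of contrast a co-expression criterion based on conditional entropy could exploit.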

  15. Surveying phylogenetic footprints in large gene clusters: applications to Hox cluster duplications.

    PubMed

    Prohaska, Sonja J; Fried, Claudia; Flamm, Christoph; Wagner, Günter P; Stadler, Peter F

    2004-05-01

    Evolutionarily conserved non-coding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. Since these elements are subject to stabilizing selection they evolve much more slowly than adjacent non-functional DNA. These so-called phylogenetic footprints can be detected by comparison of the sequences surrounding orthologous genes in different species. Therefore the loss of phylogenetic footprints as well as the acquisition of conserved non-coding sequences in some lineages, but not in others, can provide evidence for the evolutionary modification of cis-regulatory elements. We introduce here a statistical model of footprint evolution that allows us to estimate the loss of sequence conservation that can be attributed to gene loss and other structural reasons. This approach to studying the pattern of cis-regulatory element evolution, however, requires the comparison of relatively long sequences from many species. We have therefore developed an efficient software tool for the identification of corresponding footprints in long sequences from multiple species. We apply this novel method to the published sequences of HoxA clusters of shark, human, and the duplicated zebrafish and Takifugu clusters as well as the published HoxB cluster sequences. We find that there is a massive loss of sequence conservation in the intergenic region of the HoxA clusters, consistent with the finding in [Chiu et al., PNAS 99 (2002) 5492]. The loss of conservation after cluster duplication is more extensive than expected from structural reasons. This suggests that binding site turnover and/or adaptive modification may also contribute to the loss of sequence conservation. PMID:15062796

  16. Molecular genetics of the human MHC complement gene cluster.

    PubMed

    Yu, C Y

    1998-01-01

    The human major histocompatibility complex (MHC) complement gene cluster (MCGC) is a highly variable region that is characterized by polymorphisms, variations in gene size and gene number, and associations with diseases. Deficiencies in complement C2 are either due to abolition of C2 protein synthesis by mini-deletions that caused frameshift mutations, or blocked secretion of the C2 protein by single amino acid substitutions. One, two or three C4 genes may be present in a human MCGC haplotype and these genes may code for C4A, C4B, or both. Deficiencies of C4A or C4B proteins are attributed to the expression of identical C4 isotypes or allotypes from the C4 loci, the absence or deletion of a C4 gene, 2-bp insertion at exon 29 or 1-bp deletion at exon 20 that caused frameshift mutations. The C4 genes are either 21 or 14.6 kb in size due to the presence of endogenous retrovirus HERV-K(C4) in the intron 9 of long C4 genes. A deletion or duplication of a C4 gene is always accompanied by its neighboring genes, RP at the 5' region, and CYP21 and TNX at the 3' region. These four genes form a genetic unit termed the RCCX module. In an RCCX bimodular structure, the pseudogene CYP21A, and partially duplicated gene segments TNXA and RP2 are present between the two C4 loci. The RCCX modular variations in gene number and gene size contributed to unequal crossovers and exchanges of polymorphic sequences/mutations, resulting in the homogenization of C4 polymorphisms and acquisitions of deleterious mutations in RP1, C4A, C4B, CYP21B and TNXB genes. RD, SKI2W, DOM3Z and RP1 are the four novel genes found between Bf and C4. RD and Ski2w proteins may be related to RNA splicing, RNA turnover and regulation of translation. The functions of Dom3z and RP1 are being investigated. The complete genomic DNA sequence between C2 and TNX is now available. This should facilitate a complete documentation of polymorphisms, mutations and disease associations for the MCGC. PMID:10072631

  17. Cluster analysis of gene expression data based on self-splitting and merging competitive learning.

    PubMed

    Wu, Shuanhu; Liew, Alan Wee-Chung; Yan, Hong; Yang, Mengsu

    2004-03-01

    Cluster analysis of gene expression data from a cDNA microarray is useful for identifying biologically relevant groups of genes. However, finding the natural clusters in the data and estimating the correct number of clusters are still two largely unsolved problems. In this paper, we propose a new clustering framework that is able to address both these problems. By using the one-prototype-take-one-cluster (OPTOC) competitive learning paradigm, the proposed algorithm can find natural clusters in the input data, and the clustering solution is not sensitive to initialization. In order to estimate the number of distinct clusters in the data, we propose a cluster splitting and merging strategy. We have applied the new algorithm to simulated gene expression data for which the correct distribution of genes over clusters is known a priori. The results show that the proposed algorithm can find natural clusters and give the correct number of clusters. The algorithm has also been tested on real gene expression changes during yeast cell cycle, for which the fundamental patterns of gene expression and assignment of genes to clusters are well understood from numerous previous studies. Comparative studies with several clustering algorithms illustrate the effectiveness of our method. PMID:15055797

  18. Genetic organization of the Salmonella typhimurium ilv gene cluster.

    PubMed

    Blazey, D L; Burns, R O

    1979-01-01

    A number of Salmonella typhimurium ilv::Tn10 insertion strains were used to analyze the Salmonella ilv gene cluster. Tn10-generated ilv deletion mutants were employed in mapping experiments to conclusively define the gene order as ilvG-E-D-A-C. Examination of ilv enzyme levels confirms that the direction of transcription of ilvGEDA is from ilvG to ilvA. The major control locus, designated ilvO, is located before ilvG, forming an ilvOGEDA transcriptional unit that is multivalently repressed by isoleucine, valine and leucine. Two internal promoters, one before ilvE and another before ilvD, are identified and are shown to provide repressed levels of the ilvE, D and A gene products. Possible regulation of transcription from these promoters in response to isoleucine limitation is discussed in terms of attenuation. PMID:395408

  19. Hox cluster polarity in early transcriptional availability: a high order regulatory level of clustered Hox genes in the mouse.

    PubMed

    Roelen, Bernard A J; de Graaff, Wim; Forlani, Sylvie; Deschamps, Jacqueline

    2002-11-01

    The molecular mechanism underlying the 3' to 5' polarity of induction of mouse Hox genes is still elusive. While relief from a cluster-encompassing repression was shown to lead to all Hoxd genes being expressed like the 3'most of them, Hoxd1 (Kondo and Duboule, 1999), the molecular basis of initial activation of this 3'most gene is not understood yet. We show that, already before primitive streak formation, prior to initial expression of the first Hox gene, a dramatic transcriptional stimulation of the 3'most genes, Hoxb1 and Hoxb2, is observed upon a short pulse of exogenous retinoic acid (RA), whereas this is not the case for more 5', cluster-internal, RA-responsive Hoxb genes. In contrast, the RA-responding Hoxb1lacZ transgene that faithfully mimics the endogenous gene (Marshall et al., 1994) did not exhibit the sensitivity of Hoxb1 to precocious activation. We conclude that polarity in initial activation of Hoxb genes reflects a greater availability of 3'Hox genes for transcription, suggesting a pre-existing (susceptibility to) opening of the chromatin structure at the 3' extremity of the cluster. We discuss the data in the context of prevailing models involving differential chromatin opening in the directionality of clustered Hox gene transcription, and regarding the importance of the cluster context for correct timing of initial Hox gene expression. Interestingly, Cdx1 manifested the same early transcriptional availability as Hoxb1. PMID:12385756

  20. Discovery of a widely distributed toxin biosynthetic gene cluster

    PubMed Central

    Lee, Shaun W.; Mitchell, Douglas A.; Markley, Andrew L.; Hensler, Mary E.; Gonzalez, David; Wohlrab, Aaron; Dorrestein, Pieter C.; Nizet, Victor; Dixon, Jack E.

    2008-01-01

    Bacteriocins represent a large family of ribosomally produced peptide antibiotics. Here we describe the discovery of a widely conserved biosynthetic gene cluster for the synthesis of thiazole and oxazole heterocycles on ribosomally produced peptides. These clusters encode a toxin precursor and all necessary proteins for toxin maturation and export. Using the toxin precursor peptide and heterocycle-forming synthetase proteins from the human pathogen Streptococcus pyogenes, we demonstrate the in vitro reconstitution of streptolysin S activity. We provide evidence that the synthetase enzymes, as predicted from our bioinformatics analysis, introduce heterocycles onto precursor peptides, thereby providing molecular insight into the chemical structure of streptolysin S. Furthermore, our studies reveal that the synthetase exhibits relaxed substrate specificity and modifies toxin precursors from both related and distant species. Given our findings, it is likely that the discovery of similar peptidic toxins will rapidly expand to existing and emerging genomes. PMID:18375757

  1. Molecular Characterization of Neurally Expressing Genes in the Para Sodium Channel Gene Cluster of Drosophila

    PubMed Central

    Hong, C. S.; Ganetzky, B.

    1996-01-01

    To elucidate the mechanisms regulating expression of para, which encodes the major class of sodium channels in the Drosophila nervous system, we have tried to locate upstream cis-acting regulatory elements by mapping the transcriptional start site and analyzing the region immediately upstream of para in region 14D of the polytene chromosomes. From these studies, we have discovered that the region contains a cluster of neurally expressing genes. Here we report the molecular characterization of the genomic organization of the 14D region and the genes within this region, which are: calnexin (Cnx), actin related protein 14D (Arp14D), calcineurin A 14D (CnnA14D), and chromosome associated protein (Cap). The tight clustering of these genes, their neuronal expression patterns, and their potential functions related to expression, modulation, or regulation of sodium channels raise the possibility that these genes represent a functionally related group sharing some coordinate regulatory mechanism. PMID:8849894

  2. Cloning and characterization of the biosynthetic gene cluster for kutznerides

    PubMed Central

    Fujimori, Danica Galonić; Hrvatin, Siniša; Neumann, Christopher S.; Strieker, Matthias; Marahiel, Mohamed A.; Walsh, Christopher T.

    2007-01-01

    Kutznerides, actinomycete-derived cyclic depsipeptides, consist of six nonproteinogenic residues, including a highly oxygenated tricyclic hexahydropyrroloindole, a chlorinated piperazic acid, 2-(1-methylcyclopropyl)-glycine, a β-branched-hydroxy acid, and 3-hydroxy glutamic acid, for which biosynthetic logic has not been elucidated. Herein we describe the biosynthetic gene cluster for the kutzneride family, identified by degenerate primer PCR for halogenating enzymes postulated to be involved in biosyntheses of these unusual monomers. The 56-kb gene cluster encodes a series of six nonribosomal peptide synthetase (NRPS) modules distributed over three proteins and a variety of tailoring enzymes, including both mononuclear nonheme iron and two flavin-dependent halogenases, and an array of oxygen transfer catalysts. The sequence and organization of NRPS genes support incorporation of the unusual monomer units into the densely functionalized scaffold of kutznerides. Our work provides insight into the formation of this intriguing class of compounds and provides a foundation for elucidating the timing and mechanisms of their biosynthesis. PMID:17940045

  3. Functional clustering of time series gene expression data by Granger causality

    PubMed Central

    2012-01-01

    Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
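    The Granger-causality criterion mentioned in this record can be sketched as follows: series x is said to Granger-cause series y if past values of x improve the prediction of y beyond what past values of y alone achieve. The sketch below is a generic textbook F-test comparing a restricted and a full autoregressive model, not the authors' clustering procedure; the function names, the lag of 1, and the simulated series are assumptions made for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit (normal equations)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(bi * ri for bi, ri in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y, lag=1):
    """F statistic for 'x Granger-causes y' using `lag` past values of each series."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus past values of y only.
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in times]
    # Full model: restricted regressors plus past values of x.
    full = [row + [x[t - k] for k in range(1, lag + 1)]
            for row, t in zip(restricted, times)]
    rss_r, rss_f = rss(restricted, targets), rss(full, targets)
    dof = len(targets) - len(full[0])
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

# Simulated series in which y follows x with a one-step delay (made-up data).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(60)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 60)]

f_xy = granger_f(x, y)  # large: the past of x helps predict y
f_yx = granger_f(y, x)  # small: the past of y does not help predict x
print(f_xy, f_yx)
```

    The asymmetry between the two statistics is what gives the relation a direction; a clustering built on it could, for example, use the strength of such pairwise statistics as edge weights, though how the study defines its topological proximity is not stated in the abstract.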

  4. Evolutionary formation of gene clusters by reorganization: the meleagrin/roquefortine paradigm in different fungi.

    PubMed

    Martín, Juan F; Liras, Paloma

    2016-02-01

    The biosynthesis of secondary metabolites in fungi is catalyzed by enzymes encoded by genes linked in clusters that are frequently co-regulated at the transcriptional level. Formation of gene clusters may take place by de novo assembly of genes recruited from other cellular functions, but also novel gene clusters are formed by reorganization of progenitor clusters and are distributed by horizontal gene transfer. This article reviews (i) the published information on the roquefortine/meleagrin/neoxaline gene clusters of Penicillium chrysogenum (Penicillium rubens) and the short roquefortine cluster of Penicillium roqueforti, and (ii) the correlation of the genes present in those clusters with the enzymes and metabolites derived from these pathways. The P. chrysogenum roq/mel cluster consists of seven genes and includes a gene (roqT) encoding a 12-TMS transporter protein of the MFS family. Interestingly, the orthologous P. roqueforti gene cluster has only four genes and the roqT gene is present as a residual pseudogene that encodes only small peptides. Two of the genes present in the central region of the P. chrysogenum roq/mel cluster have been lost during the evolutionary formation of the short cluster and the order of the structural genes in the cluster has been rearranged. The two lost genes encode a N1 atom hydroxylase (nox) and a roquefortine scaffold-reorganizing oxygenase (sro). As a consequence P. roqueforti has lost the ability to convert the roquefortine-type carbon skeleton to the glandicoline/meleagrin-type scaffold and is unable to produce glandicoline B, meleagrin and neoxaline. The loss of this genetic information is not recent and occurred probably millions of years ago when a progenitor Penicillium strain got adapted to life in a few rich habitats such as cheese, fermented cereal grains or silage. P. roqueforti may be considered as a "domesticated" variant of a progenitor common to contemporary P. chrysogenum and related Penicillia. PMID:26668029

  5. Toward Awakening Cryptic Secondary Metabolite Gene Clusters in Filamentous Fungi

    PubMed Central

    Lim, Fang Yun; Sanchez, James F.; Wang, Clay C.C.; Keller, Nancy P.

    2013-01-01

    Mining for novel natural compounds is of eminent importance owing to the continuous need for new pharmaceuticals. Filamentous fungi are historically known to harbor the genetic capacity for an arsenal of natural compounds, both beneficial and detrimental to humans. The majority of these metabolites are still cryptic or silent under standard laboratory culture conditions. Mining for these cryptic natural products can be an excellent source for identifying new compound classes. Capitalizing on the current knowledge on how secondary metabolite gene clusters are regulated has allowed the research community to unlock many hidden fungal treasures, as described in this chapter. PMID:23084945

  6. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

    PubMed

    Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

    2015-10-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660
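    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 14.4% frequency uses all 229 screened genomes (224 bacteria plus 5 archaea) as its denominator; the abstract does not state the denominator, and the variable names are illustrative.

```python
# Cluster counts per bacteriocin class, as given in the abstract.
clusters_by_class = {
    "lanthipeptide": 20,
    "sactipeptide": 11,
    "class II bacteriocin": 7,
    "class III bacteriocin": 8,
}
total_clusters = sum(clusters_by_class.values())

# Assumed denominator: every genome screened (bacteria plus archaea).
genomes_screened = 224 + 5
strains_with_clusters = 33
frequency_pct = 100 * strains_with_clusters / genomes_screened

print(total_clusters)            # 46, matching the reported total
print(round(frequency_pct, 1))   # 14.4, matching the reported frequency
```

    Under that assumption the per-class counts sum to the reported 46 clusters and 33 of 229 genomes reproduces the reported 14.4%; using the 224 bacterial genomes alone would instead give about 14.7%.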

  7. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes

    PubMed Central

    Azevedo, Analice C.; Bento, Cláudia B. P.; Ruiz, Jeronimo C.; Queiroz, Marisa V.

    2015-01-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660

  8. Gene prioritization and clustering by multi-view text mining

    PubMed Central

    2010-01-01

    Background Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. Results We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than the other methods compared. Conclusions In practical research, the relevance of a specific vocabulary pertaining to the task is usually unknown. In such cases, multi-view text mining is a superior and promising strategy for text-based disease gene identification. PMID:20074336

  9. Arrangement of the Clostridium baratii F7 Toxin Gene Cluster with Identification of a σ Factor That Recognizes the Botulinum Toxin Gene Cluster Promoters

    SciTech Connect

    Dover, Nir; Barash, Jason R.; Burke, Julianne N.; Hill, Karen K.; Detter, John C.; Arnon, Stephen S.

    2014-05-22

    Botulinum neurotoxin (BoNT) is the most poisonous substance known and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. In this paper, we sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. Finally, this TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii.

  10. Arrangement of the Clostridium baratii F7 Toxin Gene Cluster with Identification of a σ Factor That Recognizes the Botulinum Toxin Gene Cluster Promoters

    PubMed Central

    Dover, Nir; Barash, Jason R.; Burke, Julianne N.; Hill, Karen K.; Detter, John C.; Arnon, Stephen S.

    2014-01-01

    Botulinum neurotoxin (BoNT) is the most poisonous substance known and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. We sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. This TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii. PMID:24853378

  11. A Single Gene Cluster for Chalcomycins and Aldgamycins: Genetic Basis for Bifurcation of Their Biosynthesis.

    PubMed

    Tang, Xiao-Long; Dai, Ping; Gao, Hao; Wang, Chuan-Xi; Chen, Guo-Dong; Hong, Kui; Hu, Dan; Yao, Xin-Sheng

    2016-07-01

    Aldgamycins are 16-membered macrolide antibiotics with a rare branched-chain sugar d-aldgarose or decarboxylated d-aldgarose at C-5. In our efforts to clone the gene cluster for aldgamycins from a marine-derived Streptomyces sp. HK-2006-1 capable of producing both aldgamycins and chalcomycins, we found that both are biosynthesized from a single gene cluster. Whole-genome sequencing combined with gene disruption established the entire gene cluster of aldgamycins: nine new genes are incorporated with the previously identified chalcomycin gene cluster. Functional analysis of these genes revealed that almDI/almDII (encoding α/β subunits of pyruvate dehydrogenase) triggers the biosynthesis of aldgamycins, whereas almCI (encoding an oxidoreductase) initiates chalcomycin biosynthesis. This is the first report that aldgamycins and chalcomycins are derived from a single gene cluster and of the genetic basis for bifurcation in their biosynthesis. PMID:27191535

  12. Parallel evolutionary events in the haptoglobin gene clusters of rhesus monkey and human

    SciTech Connect

    Erickson, L.M.; Maeda, N.

    1994-08-01

    Parallel occurrences of evolutionary events in the haptoglobin gene clusters of rhesus monkeys and humans were studied. We found six different haplotypes among 11 individuals from two rhesus monkey families. The six haplotypes include two types of haptoglobin gene clusters: one type with a single gene and the other with two genes. DNA sequence analysis indicates that the one-gene and the two-gene clusters were both formed by unequal homologous crossovers between two genes of an ancestral three-gene cluster, near exon 5, the longest exon of the gene. This exon is also the location where a separate unequal homologous crossover occurred in the human lineage, forming the human two-gene haptoglobin gene cluster from an ancestral three-gene cluster. The occurrence of independent homologous unequal crossovers in rhesus monkey and in human within the same region of DNA suggests that the evolutionary history of the haptoglobin gene cluster in primates is the consequence of frequent homologous pairings facilitated by the longest and most conserved exon of the gene. 27 refs., 7 figs., 1 tab.

  13. Deletion analysis of the avermectin biosynthetic genes of Streptomyces avermitilis by gene cluster displacement.

    PubMed Central

    MacNeil, T; Gewain, K M; MacNeil, D J

    1993-01-01

    Streptomyces avermitilis produces a group of glycosylated, methylated macrocyclic lactones, the avermectins, which have potent anthelmintic activity. A homologous recombination strategy termed gene cluster displacement was used to construct Neo^r deletion strains with defined endpoints and to clone the corresponding complementary DNA encoding functions for avermectin biosynthesis (avr). Thirty-five unique deletions of 0.5 to > 100 kb over a continuous 150-kb region were introduced into S. avermitilis. Analysis of the avermectin phenotypes of the deletion-containing strains defined the extent and ends of the 95-kb avr gene cluster, identified a regulatory region, and mapped several avr functions. A 60-kb region in the central portion determines the synthesis of the macrolide ring. A 13-kb region at one end of the cluster is responsible for synthesis and attachment of oleandrose disaccharide. A 10-kb region at the other end has functions for positive regulation and C-5 O-methylation. Physical analysis of the deletions and of in vivo-cloned fragments refined a 130-kb physical map of the avr gene cluster region. PMID:8478321

  14. A metabolic gene cluster in Lotus japonicus discloses novel enzyme functions and products in triterpene biosynthesis.

    PubMed

    Krokida, Afrodite; Delis, Costas; Geisler, Katrin; Garagounis, Constantine; Tsikou, Daniela; Peña-Rodríguez, Luis M; Katsarou, Dimitra; Field, Ben; Osbourn, Anne E; Papadopoulou, Kalliope K

    2013-11-01

    Genes for triterpene biosynthetic pathways exist as metabolic gene clusters in oat and Arabidopsis thaliana plants. We characterized the presence of an analogous gene cluster in the model legume Lotus japonicus. In the genomic regions flanking the oxidosqualene cyclase AMY2 gene, genes for two different classes of cytochrome P450 and a gene predicted to encode a reductase were identified. Functional characterization of the cluster genes was pursued by heterologous expression in Nicotiana benthamiana. The gene expression pattern was studied under different developmental and environmental conditions. The physiological role of the gene cluster in nodulation and plant development was studied in knockdown experiments. A novel triterpene structure, dihydrolupeol, was produced by AMY2. A new plant cytochrome P450, CYP71D353, which catalyses the formation of 20-hydroxybetulinic acid in a sequential three-step oxidation of 20-hydroxylupeol was characterized. The genes within the cluster are highly co-expressed during root and nodule development, in hormone-treated plants and under various environmental stresses. A transcriptional gene silencing mechanism that appears to be involved in the regulation of the cluster genes was also revealed. A tightly co-regulated cluster of functionally related genes is involved in legume triterpene biosynthesis, with a possible role in plant development. PMID:23909862

  15. Time-series clustering of gene expression in irradiated and bystander fibroblasts: an application of FBPA clustering

    PubMed Central

    2011-01-01

    Background The radiation bystander effect is an important component of the overall biological response of tissues and organisms to ionizing radiation, but the signaling mechanisms between irradiated and non-irradiated bystander cells are not fully understood. In this study, we measured a time-series of gene expression after α-particle irradiation and applied the Feature Based Partitioning around medoids Algorithm (FBPA), a new clustering method suitable for sparse time series, to identify signaling modules that act in concert in the response to direct irradiation and bystander signaling. We compared our results with those of an alternate clustering method, Short Time series Expression Miner (STEM). Results While computational evaluations of both clustering results were similar, FBPA provided more biological insight. After irradiation, gene clusters were enriched for signal transduction, cell cycle/cell death and inflammation/immunity processes; but only FBPA separated clusters by function. In bystanders, gene clusters were enriched for cell communication/motility, signal transduction and inflammation processes; but biological functions did not separate as clearly with either clustering method as they did in irradiated samples. Network analysis confirmed p53 and NF-κB transcription factor-regulated gene clusters in irradiated and bystander cells and suggested novel regulators, such as KDM5B/JARID1B (lysine (K)-specific demethylase 5B) and HDACs (histone deacetylases), which could epigenetically coordinate gene expression after irradiation. Conclusions In this study, we have shown that a new time series clustering method, FBPA, can provide new leads to the mechanisms regulating the dynamic cellular response to radiation. The findings implicate epigenetic control of gene expression in addition to transcription factor networks. PMID:21205307
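
    FBPA is described above as a feature-based variant of partitioning around medoids (PAM). The abstract does not say which features are extracted from each time course, so the sketch below only illustrates the generic PAM swap step on feature vectors that are assumed to be already computed. The function names, the first-k initialization and the toy data are illustrative assumptions, not the authors' implementation.

```python
from math import dist


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Partition feature vectors around k medoids using greedy swaps."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    medoids = list(range(k))  # simplification: start from the first k points
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for slot in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                # Try replacing one medoid with a non-medoid point.
                trial = medoids[:slot] + [candidate] + medoids[slot + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:  # accept strict improvements only
                    medoids, best, improved = trial, cost, True
    # Assign each point to the index of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Hypothetical two-dimensional feature vectors for six genes.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = pam(features, 2)
print(medoids, labels)
```

    Medoids are actual data points, so each cluster is summarized by a representative gene rather than an averaged profile, which is one reason medoid-based methods are often chosen for sparse data.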

  16. Deletion and Gene Expression Analyses Define the Paxilline Biosynthetic Gene Cluster in Penicillium paxilli

    PubMed Central

    Scott, Barry; Young, Carolyn A.; Saikia, Sanjay; McMillan, Lisa K.; Monahan, Brendon J.; Koulman, Albert; Astin, Jonathan; Eaton, Carla J.; Bryant, Andrea; Wrenn, Ruth E.; Finch, Sarah C.; Tapper, Brian A.; Parker, Emily J.; Jameson, Geoffrey B.

    2013-01-01

    The indole-diterpene paxilline is an abundant secondary metabolite synthesized by Penicillium paxilli. In total, 21 genes have been identified at the PAX locus of which six have been previously confirmed to have a functional role in paxilline biosynthesis. A combination of bioinformatics, gene expression and targeted gene replacement analyses was used to define the boundaries of the PAX gene cluster. Targeted gene replacement identified seven genes, paxG, paxA, paxM, paxB, paxC, paxP and paxQ, that were all required for paxilline production, with one additional gene, paxD, required for regular prenylation of the indole ring post paxilline synthesis. The two putative transcription factors, PP104 and PP105, were not co-regulated with the pax genes and based on targeted gene replacement, including the double knockout, did not have a role in paxilline production. The relationship of indole dimethylallyl transferases involved in prenylation of indole-diterpenes such as paxilline or lolitrem B, can be found as two disparate clades, not supported by prenylation type (e.g., regular or reverse). This paper provides insight into the P. paxilli indole-diterpene locus and reviews the recent advances identified in paxilline biosynthesis. PMID:23949005

  17. Use of Semisupervised Clustering and Feature-Selection Techniques for Identification of Co-expressed Genes.

    PubMed

    Saha, Sriparna; Alok, Abhay Kumar; Ekbal, Asif

    2016-07-01

    Studying the patterns hidden in gene-expression data helps to understand the functionality of genes. In general, clustering techniques are widely used for the identification of natural partitionings from gene expression data. In order to put constraints on dimensionality, feature selection is the key issue because not all features are important from a clustering point of view. Moreover, a limited amount of supervised information can help to fine-tune the obtained clustering solution. In this paper, the problem of simultaneous feature selection and semisupervised clustering is formulated as a multiobjective optimization (MOO) task. A modern simulated annealing-based MOO technique, namely AMOSA, is utilized as the background optimization methodology. Here, features and cluster centers are represented in the form of a string and the assignment of genes to different clusters is done using a point symmetry-based distance. Six optimization criteria based on several internal and external cluster validity indices are utilized. In order to generate the supervised information, a popular clustering technique, fuzzy C-means, is utilized. An appropriate subset of features, the proper number of clusters and the proper partitioning are determined using the search capability of AMOSA. The effectiveness of this proposed semisupervised clustering technique, Semi-FeaClustMOO, is demonstrated on five publicly available benchmark gene-expression datasets. Comparison results with the existing techniques for gene-expression data clustering again reveal the superiority of the proposed technique. Statistical and biological significance tests have also been carried out. PMID:26208367

  18. Comparative Analysis of Cluster Validity Indices in Identifying Some Possible Genes Mediating Certain Cancers.

    PubMed

    Ghosh, Anupam; Dhara, Bibhas Chandra; De, Rajat K

    2013-04-01

    In this article, we compare the performance of 19 cluster validity indices in identifying some possible genes mediating certain cancers, based on gene expression data. For the purpose of this comparison, we have developed a method. The proposed method involves cluster generation, selection of the best k-value or c-value, cluster identification, identifying the altered gene cluster, scoring an altered gene cluster and determining the best k-value or c-value exploring through biological repositories. The effectiveness of the method has been demonstrated on three gene expression data sets dealing with human lung cancer, colon cancer, and leukemia. Here, we have used three clustering algorithms, i.e., k-means, PAM and fuzzy c-means. We have used biochemical pathways related to these cancers and p-value statistics for validating the study. PMID:27481591
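
    The abstract does not name the 19 validity indices it compares. As a hedged illustration of how an internal validity index can rank candidate partitions (and hence candidate k-values), the sketch below computes the mean silhouette width, a widely used index chosen here as an assumption. The function name and toy data are illustrative only.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette width of a flat clustering (higher is better)."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical expression profiles for four genes (two conditions each).
profiles = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = mean_silhouette(profiles, [0, 0, 1, 1])  # groups nearby profiles
bad = mean_silhouette(profiles, [0, 1, 0, 1])   # splits nearby profiles
print(good, bad)
```

    Running the same index over the partitions produced for each candidate k and keeping the best-scoring one is the usual way such an index selects the number of clusters.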

  19. Elephant shark (Callorhinchus milii) provides insights into the evolution of Hox gene clusters in gnathostomes.

    PubMed

    Ravi, Vydianathan; Lam, Kevin; Tay, Boon-Hui; Tay, Alice; Brenner, Sydney; Venkatesh, Byrappa

    2009-09-22

    We have sequenced and analyzed Hox gene clusters from elephant shark, a holocephalian cartilaginous fish. Elephant shark possesses 4 Hox clusters with 45 Hox genes that include orthologs for a higher number of ancient gnathostome Hox genes than the 4 clusters in tetrapods and the supernumerary clusters in teleost fishes. Phylogenetic analysis of elephant shark Hox genes from 7 paralogous groups that contain all of the 4 members indicated an ((AB)(CD)) topology for the order of Hox cluster duplication, providing support for the 2R hypothesis (i.e., 2 rounds of whole-genome duplication during the early evolution of vertebrates). Comparisons of noncoding sequences of the elephant shark and human Hox clusters have identified a large number of conserved noncoding elements (CNEs), which represent putative cis-regulatory elements that may be involved in the regulation of Hox genes. Interestingly, in fugu more than 50% of these ancient CNEs have diverged beyond recognition in the duplicated (HoxA, HoxB, and HoxD) as well as the singleton (HoxC) Hox clusters. Furthermore, the b-paralogs of the duplicated fugu Hox clusters are virtually devoid of unique ancient CNEs. In contrast to fugu Hox clusters, elephant shark and human Hox clusters have lost fewer ancient CNEs. If these ancient CNEs are indeed enhancers directing tissue-specific expression of Hox genes, divergence of their sequences in vertebrate lineages might have led to altered expression patterns and presumably the functions of their associated Hox genes. PMID:19805301

  20. A Special Local Clustering Algorithm for Identifying the Genes Associated With Alzheimer’s Disease

    PubMed Central

    Pang, Chao-Yang; Hu, Wei; Hu, Ben-Qiong; Shi, Ying; Vanderburg, Charles R.; Rogers, Jack T.

    2010-01-01

    Clustering is the grouping of similar objects into a class. Local clustering feature refers to the phenomenon whereby one group of data is separated from another, and the data from these different groups are clustered locally. A compact class is defined as one cluster in which all similar elements cluster tightly within the cluster. Herein, the essence of the local clustering feature, revealed by mathematical manipulation, results in a novel clustering algorithm termed the special local clustering (SLC) algorithm that was used to process gene microarray data related to Alzheimer’s disease (AD). The SLC algorithm was able to group together genes with similar expression patterns and identify significantly varied gene expression values as isolated points. If a gene belongs to a compact class in control data and appears as an isolated point in incipient, moderate and/or severe AD gene microarray data, this gene is possibly associated with AD. Application of a clustering algorithm in disease-associated gene identification such as in AD is rarely reported. PMID:20089478

  1. High presence/absence gene variability in defense-related gene clusters of Cucumis melo

    PubMed Central

    2013-01-01

    Background Changes in the copy number of DNA sequences are one of the main mechanisms generating genome variability in eukaryotes. These changes are often related to phenotypic effects such as genetic disorders or novel pathogen resistance. The increasing availability of genome sequences through the application of next-generation massive sequencing technologies has allowed the study of genomic polymorphisms at both the interspecific and intraspecific levels, thus helping to understand how species adapt to changing environments through genome variability. Results Data on gene presence/absence variation (PAV) in melon was obtained by resequencing a cultivated accession and an old-relative melon variety, and using previously obtained resequencing data from three other melon cultivars, among them DHL92, on which the current draft melon genome sequence is based. A total of 1,697 PAV events were detected, involving 4.4% of the predicted melon gene complement. In all, an average 1.5% of genes were absent from each analyzed cultivar as compared to the DHL92 reference genome. The most populated functional category among the 304 PAV genes of known function was that of stress response proteins (30% of all classified PAVs). Our results suggest that genes from multi-copy families are five times more likely to be affected by PAV than singleton genes. Also, the chance of genes present in the genome in tandem arrays being affected by PAV is double that of isolated genes, with PAV genes tending to be in longer clusters. The highest concentration of PAV events detected in the melon genome was found in a 1.1 Mb region of linkage group V, which also shows the highest density of melon stress-response genes. In particular, this region contains the longest continuous gene-containing PAV sequence so far identified in melon. Conclusions The first genome-wide report of PAV variation among several melon cultivars is presented here. Multi-copy and clustered genes, especially those with

  2. An Ergot Alkaloid Biosynthesis Gene and Clustered Hypothetical Genes from Aspergillus fumigatus†

    PubMed Central

    Coyle, Christine M.; Panaccione, Daniel G.

    2005-01-01

    The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin. PMID:15933009

  3. Identification and Functional Analysis of the Nocardithiocin Gene Cluster in Nocardia pseudobrasiliensis

    PubMed Central

    Sakai, Kanae; Komaki, Hisayuki; Gonoi, Tohru

    2015-01-01

    Nocardithiocin is a thiopeptide compound isolated from the opportunistic pathogen Nocardia pseudobrasiliensis. It shows a strong activity against acid-fast bacteria and is also active against rifampicin-resistant Mycobacterium tuberculosis. Here, we report the identification of the nocardithiocin gene cluster in N. pseudobrasiliensis IFM 0761 based on conserved thiopeptide biosynthesis gene sequence and the whole genome sequence. The predicted gene cluster was confirmed by gene disruption and complementation. As expected, strains containing the disrupted gene did not produce nocardithiocin while gene complementation restored nocardithiocin production in these strains. The predicted cluster was further analyzed using RNA-seq which showed that the nocardithiocin gene cluster contains 12 genes within a 15.2-kb region. This finding will promote the improvement of nocardithiocin productivity and its derivatives production. PMID:26588225

  4. Translating biosynthetic gene clusters into fungal armor and weaponry

    PubMed Central

    Keller, Nancy P

    2015-01-01

    Filamentous fungi are renowned for the production of a diverse array of secondary metabolites (SMs) where the genetic material required for synthesis of a SM is typically arrayed in a biosynthetic gene cluster (BGC). These natural products are valued for their bioactive properties stemming from their functions in fungal biology, key among those protection from abiotic and biotic stress and establishment of a secure niche. The producing fungus must not only avoid self-harm from endogenous SMs but also deliver specific SMs at the right time to the right tissue requiring biochemical aid. This review highlights functions of BGCs beyond the enzymatic assembly of SMs, considering the timing and location of SM production and other proteins in the clusters that control SM activity. Specifically, self-protection is provided by both BGC-encoded mechanisms and non-BGC subcellular containment of toxic SM precursors; delivery and timing is orchestrated through cellular trafficking patterns and stress- and developmental-responsive transcriptional programs. PMID:26284674

  5. DoBISCUIT: a database of secondary metabolite biosynthetic gene clusters

    PubMed Central

    Ichikawa, Natsuko; Sasagawa, Machi; Yamamoto, Mika; Komaki, Hisayuki; Yoshida, Yumi; Yamazaki, Shuji; Fujita, Nobuyuki

    2013-01-01

    This article introduces DoBISCUIT (Database of BIoSynthesis clusters CUrated and InTegrated, http://www.bio.nite.go.jp/pks/), a literature-based, manually curated database of gene clusters for secondary metabolite biosynthesis. Bacterial secondary metabolites often show pharmacologically important activities and can serve as lead compounds and/or candidates for drug development. Biosynthesis of each secondary metabolite is catalyzed by a number of enzymes, usually encoded by a gene cluster. Although many scientific papers describe such gene clusters, the gene information is not always described in a comprehensive manner and the related information is rarely integrated. DoBISCUIT integrates the latest literature information and provides standardized gene/module/domain descriptions related to the gene clusters. PMID:23185043

  6. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    PubMed Central

    2010-01-01

    Background Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is
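
    The evaluation described above scores each clustering against known classes with the adjusted Rand index. A minimal sketch of that index, in its standard pair-counting form, is given below; the function name and the toy labels are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat partitions of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions trivially agree

    # Contingency counts: items in (known class i, predicted cluster j).
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # chance-level agreement
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (sum_cells - expected) / (maximum - expected)


# Toy example: four samples with two known classes.
known = ["tumor", "tumor", "normal", "normal"]
perfect = adjusted_rand_index(known, [1, 1, 0, 0])   # same grouping, renamed
crossed = adjusted_rand_index(known, [0, 1, 0, 1])   # splits every class
print(perfect, crossed)
```

    Because the index is corrected for chance, a clustering that reproduces the known classes scores 1 regardless of how the clusters are named, while groupings no better than random score near or below 0.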

  7. A tripartite clustering analysis on microRNA, gene and disease model.

    PubMed

    Shen, Chengcheng; Liu, Ying

    2012-02-01

    Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expressions of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. Tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members and pass the hidden relational information along the relation chain by considering multi-type members. Here we report a spectral co-clustering algorithm for k-partite graph to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings. PMID:22809308

  8. A Cluster of Cuticle Protein Genes of Drosophila melanogaster at 65A: Sequence, Structure and Evolution

    PubMed Central

    Charles, J. P.; Chihara, C.; Nejad, S.; Riddiford, L. M.

    1997-01-01

    A 36-kb genomic DNA segment of the Drosophila melanogaster genome containing 12 clustered cuticle genes has been mapped and partially sequenced. The cluster maps at 65A 5-6 on the left arm of the third chromosome, in agreement with the previously determined location of a putative cluster encompassing the genes for the third instar larval cuticle proteins LCP5, LCP6 and LCP8. This cluster is the largest cuticle gene cluster discovered to date and shows a number of surprising features that explain in part the genetic complexity of the LCP5, LCP6 and LCP8 loci. The genes encoding LCP5 and LCP8 are multiple copy genes and the presence of extensive similarity in their coding regions gives the first evidence for gene conversion in cuticle genes. In addition, five genes in the cluster are intronless. Four of these five have arisen by retroposition. The other genes in the cluster have a single intron located at an unusual location for insect cuticle genes. PMID:9383064

  9. Functional gene clustering via gene annotation sentences, MeSH and GO keywords from biomedical literature

    PubMed Central

    Natarajan, Jeyakumar; Ganapathy, Jawahar

    2007-01-01

    Gene function annotation remains a key challenge in modern biology. This is especially true for high-throughput techniques such as gene expression experiments. Vital information about genes is available electronically from biomedical literature in the form of full texts and abstracts. In addition, various publicly available databases (such as GenBank, Gene Ontology and Entrez) provide access to gene-related information at different levels of biological organization, granularity and data format. This information is being used to assess and interpret the results from high-throughput experiments. To improve keyword extraction for annotational clustering and other types of analyses, we have developed a novel text mining approach, which is based on keywords identified at the level of gene annotation sentences (in particular sentences characterizing biological function) instead of entire abstracts. Further, to improve the expressiveness and usefulness of gene annotation terms, we investigated the combination of sentence-level keywords with terms from the Medical Subject Headings (MeSH) and Gene Ontology (GO) resources. We find that sentence-level keywords combined with MeSH terms outperform the typical ‘baseline’ set-up (term frequencies at the level of abstracts) by a significant margin, whereas the addition of GO terms improves matters only marginally. We validated our approach on the basis of a manually annotated corpus of 200 abstracts generated on the basis of 2 cancer categories and 10 genes per category. We applied the method in the context of three sets of differentially expressed genes obtained from pediatric brain tumor samples. This analysis suggests novel interpretations of discovered gene expression patterns. PMID:18305827

  10. Base J represses genes at the end of polycistronic gene clusters in Leishmania major by promoting RNAP II termination.

    PubMed

    Reynolds, David L; Hofmeister, Brigitte T; Cliffe, Laura; Siegel, T Nicolai; Anderson, Britta A; Beverley, Stephen M; Schmitz, Robert J; Sabatini, Robert

    2016-08-01

    The genomes of kinetoplastids are organized into polycistronic gene clusters that are flanked by the modified DNA base J. Previous work has established a role of base J in promoting RNA polymerase II termination in Leishmania spp. where the loss of J leads to termination defects and transcription into adjacent gene clusters. It remains unclear whether these termination defects affect gene expression and whether read through transcription is detrimental to cell growth, thus explaining the essential nature of J. We now demonstrate that reduction of base J at specific sites within polycistronic gene clusters in L. major leads to read through transcription and increased expression of downstream genes in the cluster. Interestingly, subsequent transcription into the opposing polycistronic gene cluster does not lead to downregulation of sense mRNAs. These findings indicate a conserved role for J regulating transcription termination and expression of genes within polycistronic gene clusters in trypanosomatids. In contrast to the expectations often attributed to opposing transcription, the essential nature of J in Leishmania spp. is related to its role in gene repression rather than preventing transcriptional interference resulting from read through and dual strand transcription. PMID:27125778

  11. A phylogenomic gene cluster resource: The phylogenetically inferred groups (PhIGs) database

    SciTech Connect

    Dehal, Paramvir S.; Boore, Jeffrey L.

    2005-08-25

    We present here the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters sharing descent from a single ancestral gene across a range of evolutionary depths. Although these non-phylogenetic gene family clusters have been used broadly for gene annotation, errors are known to be introduced by the artifactual association of slowly evolving paralogs and lack of annotation for those more rapidly evolving. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address pattern and mechanism of the evolution of the genome. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and the correlation of this information with gene annotations, expression information, and genomic context is an important resource to the scientific community.

  12. Identification and analysis of a highly conserved chemotaxis gene cluster in Shewanella species.

    SciTech Connect

    Li, J.; Romine, Margaret F.; Ward, M.

    2007-08-01

    A conserved cluster of chemotaxis genes was identified from the genome sequences of fifteen Shewanella species. An in-frame deletion of the cheA-3 gene, which is located in this cluster, was created in S. oneidensis MR-1 and the gene shown to be essential for chemotactic responses to anaerobic electron acceptors. The CheA-3 protein showed strong similarity to Vibrio cholerae CheA-2 and P. aeruginosa CheA-1, two proteins that are also essential for chemotaxis. The genes encoding these proteins were shown to be located in chemotaxis gene clusters closely related to the cheA-3-containing cluster in Shewanella species. The results of this study suggest that a combination of gene neighborhood and homology analyses may be used to predict which cheA genes are essential for chemotaxis in groups of closely related microorganisms.

  13. A hypothesis to explain how laeA specifically regulates certain secondary metabolite biosynthesis gene clusters

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Biosynthesis of mycotoxins involves transcriptional co-regulation of sets of clustered genes. We hypothesize that specific control of transcription of genes in these clusters by LaeA, a global regulator of secondary metabolite production and development in aspergilli and other filamentous fungi, re...

  14. Paerucumarin, a new metabolite produced by the pvc gene cluster from Pseudomonas aeruginosa.

    PubMed

    Clarke-Pearson, Michael F; Brady, Sean F

    2008-10-01

    The pvc gene cluster from Pseudomonas aeruginosa has been linked to the biosynthesis of both the pyoverdine chromophore and pseudoverdine. Our reinvestigation of the role this gene cluster plays in P. aeruginosa secondary metabolite biosynthesis shows that its major product is actually paerucumarin, a novel isonitrile functionalized cumarin. PMID:18689486

  15. Paerucumarin, a New Metabolite Produced by the pvc Gene Cluster from Pseudomonas aeruginosa

    PubMed Central

    Clarke-Pearson, Michael F.; Brady, Sean F.

    2008-01-01

    The pvc gene cluster from Pseudomonas aeruginosa has been linked to the biosynthesis of both the pyoverdine chromophore and pseudoverdine. Our reinvestigation of the role this gene cluster plays in P. aeruginosa secondary metabolite biosynthesis shows that its major product is actually paerucumarin, a novel isonitrile functionalized cumarin. PMID:18689486

  16. Conservation of Hox gene clusters in the self-fertilizing fish Kryptolebias marmoratus (Cyprinodontiformes; Rivulidae).

    PubMed

    Kim, B-M; Lee, B-Y; Lee, J-H; Rhee, J-S; Lee, J-S

    2016-03-01

    In this study, whole Hox gene clusters in the self-fertilizing mangrove killifish Kryptolebias marmoratus (Cyprinodontiformes; Rivulidae), a unique hermaphroditic vertebrate in which both sex organs are functional at the same time, were identified from whole genome and transcriptome sequences. The aim was to increase the understanding of the evolutionary status of conservation of this Hox gene cluster across fish species. PMID:26822496

  17. The tryptophanase gene cluster of Haemophilus influenzae type b: evidence for horizontal gene transfer.

    PubMed

    Martin, K; Morlin, G; Smith, A; Nordyke, A; Eisenstark, A; Golomb, M

    1998-01-01

    Among strains of Haemophilus influenzae, the ability to catabolize tryptophan (as detected by indole production) varies and is correlated with pathogenicity. Tryptophan catabolism is widespread (70 to 75%) among harmless respiratory isolates but is nearly universal (94 to 100%) among strains causing serious disease, including meningitis. As a first step in investigating the relationship between tryptophan catabolism and virulence, we have identified genes in pathogenic H. influenzae which are homologous to the tryptophanase (tna) operon of Escherichia coli. The tna genes are located on a 3.1-kb fragment between nlpD and mutS in the H. influenzae type b (Eagan) genome, are flanked by 43-bp direct repeats of an uptake signal sequence downstream from nlpD, and appear to have been inserted as a mobile unit within this sequence. The organization of this insertion is reminiscent of pathogenicity islands. The tna cluster is found at the same map location in all indole-positive strains of H. influenzae surveyed and is absent from reference type d and e genomes. In contrast to H. influenzae, most other Haemophilus species lack tna genes. Phylogenetic comparisons suggest that the tna cluster was acquired by intergeneric lateral transfer, either by H. influenzae or a recent ancestor, and that E. coli may have acquired its tnaA gene from a related source. Genomes of virulent H. influenzae resemble those of pathogenic enterics in having an island of laterally transferred DNA next to mutS. PMID:9422600

  18. Identification and Characterization of a Novel Diterpene Gene Cluster in Aspergillus nidulans

    PubMed Central

    Bromann, Kirsi; Toivari, Mervi; Viljanen, Kaarina; Vuoristo, Anu; Ruohonen, Laura; Nakari-Setälä, Tiina

    2012-01-01

    Fungal secondary metabolites are a rich source of medically useful compounds due to their pharmaceutical and toxic properties. Sequencing of fungal genomes has revealed numerous secondary metabolite gene clusters, yet products of many of these biosynthetic pathways are unknown since the expression of the clustered genes usually remains silent in normal laboratory conditions. Therefore, to discover new metabolites, it is important to find ways to induce the expression of genes in these otherwise silent biosynthetic clusters. We discovered a novel secondary metabolite in Aspergillus nidulans by predicting a biosynthetic gene cluster with genomic mining. A Zn(II)2Cys6–type transcription factor, PbcR, was identified, and its role as a pathway-specific activator for the predicted gene cluster was demonstrated. Overexpression of pbcR upregulated the transcription of seven genes in the identified cluster and led to the production of a diterpene compound, which was characterized with GC/MS as ent-pimara-8(14),15-diene. A change in morphology was also observed in the strains overexpressing pbcR. The activation of a cryptic gene cluster by overexpression of its putative Zn(II)2Cys6–type transcription factor led to discovery of a novel secondary metabolite in Aspergillus nidulans. Quantitative real-time PCR and DNA array analysis allowed us to predict the borders of the biosynthetic gene cluster. Furthermore, we identified a novel fungal pimaradiene cyclase gene as well as genes encoding 3-hydroxy-3-methyl-glutaryl-coenzyme A (HMG-CoA) reductase and a geranylgeranyl pyrophosphate (GGPP) synthase. None of these genes have been previously implicated in the biosynthesis of terpenes in Aspergillus nidulans. These results identify the first Aspergillus nidulans diterpene gene cluster and suggest a biosynthetic pathway for ent-pimara-8(14),15-diene. PMID:22506079

  19. A modified recombineering protocol for the genetic manipulation of gene clusters in Aspergillus fumigatus.

    PubMed

    Alcazar-Fuoli, Laura; Cairns, Timothy; Lopez, Jordi F; Zonja, Bozo; Pérez, Sandra; Barceló, Damià; Igarashi, Yasuhiro; Bowyer, Paul; Bignell, Elaine

    2014-01-01

    Genomic analyses of fungal genome structure have revealed the presence of physically-linked groups of genes, termed gene clusters, where collective functionality of encoded gene products serves a common biosynthetic purpose. In multiple fungal pathogens of humans and plants gene clusters have been shown to encode pathways for biosynthesis of secondary metabolites including metabolites required for pathogenicity. In the major mould pathogen of humans Aspergillus fumigatus, multiple clusters of co-ordinately upregulated genes were identified as having heightened transcript abundances, relative to laboratory cultured equivalents, during the early stages of murine infection. The aim of this study was to develop and optimise a methodology for manipulation of gene cluster architecture, thereby providing the means to assess their relevance to fungal pathogenicity. To this end we adapted a recombineering methodology which exploits lambda phage-mediated recombination of DNA in bacteria, for the generation of gene cluster deletion cassettes. By exploiting a pre-existing bacterial artificial chromosome (BAC) library of A. fumigatus genomic clones we were able to implement single or multiple intra-cluster gene replacement events at both subtelomeric and telomere distal chromosomal locations, in both wild type and highly recombinogenic A. fumigatus isolates. We then applied the methodology to address the boundaries of a gene cluster producing a nematocidal secondary metabolite, pseurotin A, and to address the role of this secondary metabolite in insect and mammalian responses to A. fumigatus challenge. PMID:25372385

  20. An effective fuzzy kernel clustering analysis approach for gene expression data.

    PubMed

    Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao

    2015-01-01

    Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering method to microarray gene expression data is the choice of parameters with cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies desired cluster number and obtains more steady results for gene expression data. First of all, to optimize characteristic differences and estimate optimal cluster number, Gaussian kernel function is introduced to improve spectrum analysis method (SAM). By combining subtractive clustering with max-min distance mean, maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of improved SAM (ISAM) and MDM are given respectively, whose superiority and stability are illustrated through performing experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and UCI database show that the proposed algorithms are feasible for cluster analysis, and the clustering accuracy is higher than the other related clustering algorithms. PMID:26405958
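    For readers unfamiliar with fuzzy clustering, the sketch below shows the membership computation of standard fuzzy c-means, in which each expression profile receives a degree of membership in every cluster rather than a single label. This is not the FKCA, ISAM or MDM procedure from the abstract above: no kernel function is used, and the function name, the fuzzifier `m = 2` and the toy points are illustrative assumptions.

```python
import math


def fuzzy_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means memberships (illustrative, not the paper's FKCA).

    Returns memberships[i][j], the degree to which points[i] belongs to
    centers[j]; each row sums to 1. `m` > 1 is the fuzzifier (assumed 2.0).
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: share full
            # membership among those centers and give none to the rest.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        # u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
        row = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


if __name__ == "__main__":
    toy_points = [(0.0, 0.0), (0.25, 0.0), (0.5, 0.0)]   # hypothetical profiles
    toy_centers = [(0.0, 0.0), (1.0, 0.0)]               # hypothetical centers
    for point, row in zip(toy_points, fuzzy_memberships(toy_points, toy_centers)):
        print(point, [round(u, 3) for u in row])
```

    A point midway between two centers receives equal membership in both, which is the behaviour that distinguishes fuzzy from hard clustering.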

  1. Genes for iron-sulphur cluster assembly are targets of abiotic stress in rice, Oryza sativa.

    PubMed

    Liang, Xuejiao; Qin, Lu; Liu, Peiwei; Wang, Meihuan; Ye, Hong

    2014-03-01

    Iron-sulphur (Fe-S) cluster assembly occurs in chloroplasts, mitochondria and cytosol, involving dozens of genes in higher plants. In this study, we have identified 41 putative Fe-S cluster assembly genes in rice (Oryza sativa) genome, and the expression of all genes was verified. To investigate the role of Fe-S cluster assembly as a metabolic pathway, we applied abiotic stresses to rice seedlings and analysed Fe-S cluster assembly gene expression by qRT-PCR. Our data showed that genes for Fe-S cluster assembly in chloroplasts of leaves are particularly sensitive to heavy metal treatments, and that Fe-S cluster assembly genes in roots were up-regulated in response to iron toxicity, oxidative stress and some heavy metal assault. The effect of each stress treatment on the Fe-S cluster assembly machinery demonstrated an unexpected tissue or organelle specificity, suggesting that the physiological relevance of the Fe-S cluster assembly is more complex than thought. Furthermore, our results may reveal potential candidate genes for molecular breeding of rice. PMID:24028141

  2. A recently transferred cluster of bacterial genes in Trichomonas vaginalis - lateral gene transfer and the fate of acquired genes

    PubMed Central

    2014-01-01

    Background Lateral Gene Transfer (LGT) has recently gained recognition as an important contributor to some eukaryote proteomes, but the mechanisms of acquisition and fixation in eukaryotic genomes are still uncertain. A previously defined norm for LGTs in microbial eukaryotes states that the majority are genes involved in metabolism, the LGTs are typically localized one by one, surrounded by vertically inherited genes on the chromosome, and phylogenetics shows that a broad collection of bacterial lineages have contributed to the transferome. Results A unique 34 kbp long fragment with 27 clustered genes (TvLF) of prokaryote origin was identified in the sequenced genome of the protozoan parasite Trichomonas vaginalis. Using a PCR based approach we confirmed the presence of the orthologous fragment in four additional T. vaginalis strains. Detailed sequence analyses unambiguously suggest that TvLF is the result of one single, recent LGT event. The proposed donor is a close relative to the firmicute bacterium Peptoniphilus harei. High nucleotide sequence similarity between T. vaginalis strains, as well as to P. harei, and the absence of homologs in other Trichomonas species, suggests that the transfer event took place after the radiation of the genus Trichomonas. Some genes have undergone pseudogenization and degradation, indicating that they may not be retained in the future. Functional annotations reveal that genes involved in informational processes are particularly prone to degradation. Conclusions We conclude that, although the majority of eukaryote LGTs are single gene occurrences, they may be acquired in clusters of several genes that are subsequently cleansed of evolutionarily less advantageous genes. PMID:24898731

  3. A network-assisted co-clustering algorithm to discover cancer subtypes based on gene expression

    PubMed Central

    2014-01-01

    Background Cancer subtype information is critically important for understanding tumor heterogeneity. Existing methods to identify cancer subtypes have primarily focused on utilizing generic clustering algorithms (such as hierarchical clustering) to identify subtypes based on gene expression data. The network-level interaction among genes, which is key to understanding the molecular perturbations in cancer, has been rarely considered during the clustering process. The motivation of our work is to develop a method that effectively incorporates molecular interaction networks into the clustering process to improve cancer subtype identification. Results We have developed a new clustering algorithm for cancer subtype identification, called “network-assisted co-clustering for the identification of cancer subtypes” (NCIS). NCIS combines gene network information to simultaneously group samples and genes into biologically meaningful clusters. Prior to clustering, we assign weights to genes based on their impact in the network. Then a new weighted co-clustering algorithm based on a semi-nonnegative matrix tri-factorization is applied. We evaluated the effectiveness of NCIS on simulated datasets as well as large-scale Breast Cancer and Glioblastoma Multiforme patient samples from The Cancer Genome Atlas (TCGA) project. NCIS was shown to better separate the patient samples into clinically distinct subtypes and achieve higher accuracy on the simulated datasets to tolerate noise, as compared to consensus hierarchical clustering. Conclusions The weighted co-clustering approach in NCIS provides a unique solution to incorporate gene network information into the clustering process. Our tool will be useful to comprehensively identify cancer subtypes that would otherwise be obscured by cancer heterogeneity, using high-throughput and high-dimensional gene expression data. PMID:24491042

  4. Arrangement of the Clostridium baratii F7 Toxin Gene Cluster with Identification of a σ Factor That Recognizes the Botulinum Toxin Gene Cluster Promoters

    DOE PAGESBeta

    Dover, Nir; Barash, Jason R.; Burke, Julianne N.; Hill, Karen K.; Detter, John C.; Arnon, Stephen S.

    2014-05-22

    Botulinum neurotoxin (BoNT) is the most poisonous substance known and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. In this paper, we sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. Finally, this TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii.

  5. A Putative Gene Cluster from a Lyngbya wollei Bloom that Encodes Paralytic Shellfish Toxin Biosynthesis

    PubMed Central

    Mihali, Troco K.; Carmichael, Wayne W.; Neilan, Brett A.

    2011-01-01

    Saxitoxin and its analogs cause the paralytic shellfish-poisoning syndrome, adversely affecting human health and coastal shellfish industries worldwide. Here we report the isolation, sequencing, annotation, and predicted pathway of the saxitoxin biosynthetic gene cluster in the cyanobacterium Lyngbya wollei. The gene cluster spans 36 kb and encodes enzymes for the biosynthesis and export of the toxins. The Lyngbya wollei saxitoxin gene cluster differs from previously identified saxitoxin clusters as it contains genes that are unique to this cluster, whereby the carbamoyltransferase is truncated and replaced by an acyltransferase, explaining the unique toxin profile presented by Lyngbya wollei. These findings will enable the creation of toxin probes, for water monitoring purposes, as well as proof-of-concept for the combinatorial biosynthesis of these naturally occurring alkaloids for the production of novel, biologically active compounds. PMID:21347365

  6. High-throughput platform for the discovery of elicitors of silent bacterial gene clusters

    PubMed Central

    Seyedsayamdost, Mohammad R.

    2014-01-01

    Over the past decade, bacterial genome sequences have revealed an immense reservoir of biosynthetic gene clusters, sets of contiguous genes that have the potential to produce drugs or drug-like molecules. However, the majority of these gene clusters appear to be inactive for unknown reasons prompting terms such as “cryptic” or “silent” to describe them. Because natural products have been a major source of therapeutic molecules, methods that rationally activate these silent clusters would have a profound impact on drug discovery. Herein, a new strategy is outlined for awakening silent gene clusters using small molecule elicitors. In this method, a genetic reporter construct affords a facile read-out for activation of the silent cluster of interest, while high-throughput screening of small molecule libraries provides potential inducers. This approach was applied to two cryptic gene clusters in the pathogenic model Burkholderia thailandensis. The results not only demonstrate a prominent activation of these two clusters, but also reveal that the majority of elicitors are themselves antibiotics, most in common clinical use. Antibiotics, which kill B. thailandensis at high concentrations, act as inducers of secondary metabolism at low concentrations. One of these antibiotics, trimethoprim, served as a global activator of secondary metabolism by inducing at least five biosynthetic pathways. Further application of this strategy promises to uncover the regulatory networks that activate silent gene clusters while at the same time providing access to the vast array of cryptic molecules found in bacteria. PMID:24808135

  7. Identification of gene-gene and gene-environment interactions within the fibrinogen gene cluster for fibrinogen levels in three ethnically diverse populations.

    PubMed

    Jeff, Janina M; Brown-Gentry, Kristin; Crawford, Dana C

    2015-01-01

    Elevated levels of plasma fibrinogen are associated with clot formation in the absence of inflammation or injury and are a biomarker for arterial clotting, the leading cause of cardiovascular disease. Fibrinogen levels are heritable, with >50% attributed to genetic factors; however, little is known about possible genetic modifiers that might explain the missing heritability. The fibrinogen gene cluster is comprised of three genes (FGA, FGB, and FGG) that make up the fibrinogen polypeptide essential for fibrinogen production in the blood. Given the known interaction with these genes, we tested 25 variants in the fibrinogen gene cluster for gene x gene and gene x environment interactions in 620 non-Hispanic blacks, 1,385 non-Hispanic whites, and 664 Mexican Americans from a cross-sectional dataset enriched with environmental data, the Third National Health and Nutrition Examination Survey (NHANES III). Using a multiplicative approach, we added cross product terms (gene x gene or gene x environment) to a linear regression model and declared significance at p < 0.05. We identified 19 unique gene x gene and 13 unique gene x environment interactions that impact fibrinogen levels in at least one population at p < 0.05. Over 90% of the gene x gene interactions identified include a variant in the rate-limiting gene, FGB, which is essential for the formation of the fibrinogen polypeptide. We also detected gene x environment interactions with fibrinogen variants and sex, smoking, and body mass index. These findings highlight the potential for the discovery of genetic modifiers for complex phenotypes in multiple populations and give a better understanding of the interaction between genes and/or the environment for fibrinogen levels. The need for more powerful and robust methods to identify genetic modifiers is still warranted. PMID:25592583
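    The multiplicative approach described in this abstract (adding a cross-product term to a linear regression) can be sketched as follows. This is a minimal illustration and not the authors' analysis: the additive 0/1/2 genotype coding, the binary exposure, the helper names and the noise-free toy data are all assumptions, and the sketch only estimates coefficients; it does not compute the p-values used to declare significance.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_interaction_model(genotype, exposure, outcome):
    """Least-squares fit of outcome = b0 + b1*g + b2*e + b3*(g*e).

    b3 is the coefficient of the cross-product (interaction) term.
    """
    rows = [[1.0, g, e, g * e] for g, e in zip(genotype, exposure)]
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Hypothetical data: genotype = minor-allele count, exposure = 0/1 flag.
    genotype = [0, 1, 2, 0, 1, 2]
    exposure = [0, 0, 0, 1, 1, 1]
    outcome = [3.0 + 0.2 * g + 0.5 * e + 0.1 * g * e
               for g, e in zip(genotype, exposure)]
    print([round(b, 6) for b in fit_interaction_model(genotype, exposure, outcome)])
```

    In a real analysis the interaction coefficient would be tested against its standard error, typically with a statistics package rather than hand-written linear algebra.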

  8. Identification of gene-gene and gene-environment interactions within the fibrinogen gene cluster for fibrinogen levels in three ethnically diverse populations

    PubMed Central

    Jeff, Janina M.; Brown-Gentry, Kristin; Crawford, Dana C.

    2014-01-01

    Elevated levels of plasma fibrinogen are associated with clot formation in the absence of inflammation or injury and are a biomarker for arterial clotting, the leading cause of cardiovascular disease. Fibrinogen levels are heritable, with >50% attributed to genetic factors; however, little is known about possible genetic modifiers that might explain the missing heritability. The fibrinogen gene cluster is comprised of three genes (FGA, FGB, and FGG) that make up the fibrinogen polypeptide essential for fibrinogen production in the blood. Given the known interaction with these genes, we tested 25 variants in the fibrinogen gene cluster for gene × gene and gene × environment interactions in 620 non-Hispanic blacks, 1,385 non-Hispanic whites, and 664 Mexican Americans from a cross-sectional dataset enriched with environmental data, the Third National Health and Nutrition Examination Survey (NHANES III). Using a multiplicative approach, we added cross product terms (gene × gene or gene × environment) to a linear regression model and declared significance at p < 0.05. We identified 19 unique gene × gene and 13 unique gene × environment interactions that impact fibrinogen levels in at least one population at p < 0.05. Over 90% of the gene × gene interactions identified include a variant in the rate-limiting gene, FGB, which is essential for the formation of the fibrinogen polypeptide. We also detected gene × environment interactions with fibrinogen variants and sex, smoking, and body mass index. These findings highlight the potential for the discovery of genetic modifiers for complex phenotypes in multiple populations and give a better understanding of the interaction between genes and/or the environment for fibrinogen levels. The need for more powerful and robust methods to identify genetic modifiers is still warranted. PMID:25592583

  9. Assembly of iron-sulfur clusters. Identification of an iscSUA-hscBA-fdx gene cluster from Azotobacter vinelandii.

    PubMed

    Zheng, L; Cash, V L; Flint, D H; Dean, D R

    1998-05-22

    An enzyme having the same L-cysteine desulfurization activity previously described for the NifS protein was purified from a strain of Azotobacter vinelandii deleted for the nifS gene. This protein was designated IscS to indicate its proposed role in iron-sulfur cluster assembly. Like NifS, IscS is a pyridoxal-phosphate containing homodimer. Information gained from microsequencing of oligopeptides obtained by tryptic digestion of purified IscS was used to design a strategy for isolation and DNA sequence analysis of a 7,886-base pair A. vinelandii genomic segment that includes the iscS gene. The iscS gene is contained within a gene cluster that includes homologs to nifU and another gene contained within the major nif cluster of A. vinelandii previously designated orf6. These genes have been designated iscU and iscA, respectively. Information available from complete genome sequences of Escherichia coli and Hemophilus influenzae reveals that they also encode iscSUA gene clusters. A wide conservation of iscSUA genes in nature and evidence that NifU and NifS participate in the mobilization of iron and sulfur for nitrogenase-specific iron-sulfur cluster formation suggest that the products of the iscSUA genes could play a general role in the formation or repair of iron-sulfur clusters. The proposal that IscS is involved in mobilization of sulfur for iron-sulfur cluster formation in A. vinelandii is supported by the presence of a cysE-like homolog in another gene cluster located immediately upstream from the one containing the iscSUA genes. O-Acetylserine synthase is the product of the cysE gene, and it catalyzes the rate-limiting step in cysteine biosynthesis. A similar cysE-like gene is also located within the nif gene cluster of A. vinelandii. The likely role of such cysE-like gene products is to increase the cysteine pool needed for iron-sulfur cluster formation. Another feature of the iscSUA gene cluster region from A. vinelandii is that E. coli genes previously

  10. Birth of Four Chimeric Plastid Gene Clusters in Japanese Umbrella Pine

    PubMed Central

    Hsu, Chih-Yao; Wu, Chung-Shien; Chaw, Shu-Miaw

    2016-01-01

    Many genes in the plastid genomes (plastomes) of plants are organized as gene clusters, in which genes are co-transcribed, resembling bacterial operons. These plastid operons are highly conserved, even among conifers, whose plastomes are highly rearranged relative to other seed plants. We have determined the complete plastome sequence of Sciadopitys verticillata (Japanese umbrella pine), the sole member of Sciadopityaceae. The Sciadopitys plastome is characterized by extensive inversions, pseudogenization of four tRNA genes after tandem duplications, and a unique pair of 370-bp inverted repeats involved in the formation of isomeric plastomes. We showed that plastomic inversions in Sciadopitys have led to shuffling of the remote conserved operons, resulting in the birth of four chimeric gene clusters. Our data also demonstrated that the relocated genes can be co-transcribed in these chimeric gene clusters. The plastome of Sciadopitys advances our current understanding of how the conifer plastomes have evolved toward increased diversity and complexity. PMID:27269365

  11. Improved efficiency in amplification of Escherichia coli O-antigen gene clusters using genome-wide sequence comparison

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...

  12. Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data

    PubMed Central

    Zeng, Beiyan; Chen, Yiping P.; Smith, Oscar H.

    2003-01-01

    Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely-used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and within-cluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r2 test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise on clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed. PMID:18629292
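
    The Adjusted Rand Index named in this abstract measures agreement between two partitions of the same items. The sketch below follows the standard pair-counting definition; the function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items.

    Standard pair-counting formula; illustrative helper, not the
    authors' code.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs of items placed together in both partitions.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each partition separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (together_both - expected) / (maximum - expected)


# Cluster names do not matter, only which items are grouped together.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

    A value of 1 indicates identical groupings, values near 0 indicate chance-level agreement, and negative values indicate less agreement than expected by chance.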

  13. Isolation and Characterization of the Gibberellin Biosynthetic Gene Cluster in Sphaceloma manihoticola

    PubMed Central

    Bömke, Christiane; Rojas, Maria Cecilia; Gong, Fan; Hedden, Peter; Tudzynski, Bettina

    2008-01-01

    Gibberellins (GAs) are tetracyclic diterpenoid phytohormones that were first identified as secondary metabolites of the fungus Fusarium fujikuroi (teleomorph, Gibberella fujikuroi). GAs were also found in the cassava pathogen Sphaceloma manihoticola, but the spectrum of GAs differed from that in F. fujikuroi. In contrast to F. fujikuroi, the GA biosynthetic pathway has not been studied in detail in S. manihoticola, and none of the GA biosynthetic genes have been cloned from the species. Here, we present the identification of the GA biosynthetic gene cluster from S. manihoticola consisting of five genes encoding a bifunctional ent-copalyl/ent-kaurene synthase (CPS/KS), a pathway-specific geranylgeranyl diphosphate synthase (GGS2), and three cytochrome P450 monooxygenases. The functions of all of the genes were analyzed either by a gene replacement approach or by complementing the corresponding F. fujikuroi mutants. The cluster organization and gene functions are similar to those in F. fujikuroi. However, the two border genes in the Fusarium cluster encoding the GA4 desaturase (DES) and the 13-hydroxylase (P450-3) are absent in the S. manihoticola GA gene cluster, consistent with the spectrum of GAs produced by this fungus. The close similarity between the two GA gene clusters, the identical gene functions, and the conserved intron positions suggest a common evolutionary origin despite the distant relatedness of the two fungi. PMID:18567680

  14. The clustering of functionally related genes contributes to CNV-mediated disease

    PubMed Central

    Andrews, Tallulah; Honti, Frantisek; Pfundt, Rolph; de Leeuw, Nicole; Hehir-Kwa, Jayne; Vulto-van Silfhout, Anneke; de Vries, Bert; Webber, Caleb

    2015-01-01

    Clusters of functionally related genes can be disrupted by a single copy number variant (CNV). We demonstrate that the simultaneous disruption of multiple functionally related genes is a frequent and significant characteristic of de novo CNVs in patients with developmental disorders (P = 1 × 10−3). Using three different functional networks, we identified unexpectedly large numbers of functionally related genes within de novo CNVs from two large independent cohorts of individuals with developmental disorders. The presence of multiple functionally related genes was a significant predictor of a CNV's pathogenicity when compared to CNVs from apparently healthy individuals and a better predictor than the presence of known disease or haploinsufficient genes for larger CNVs. The functionally related genes found in the de novo CNVs belonged to 70% of all clusters of functionally related genes found across the genome. De novo CNVs were more likely to affect functional clusters and affect them to a greater extent than benign CNVs (P = 6 × 10−4). Furthermore, such clusters of functionally related genes are phenotypically informative: Different patients possessing CNVs that affect the same cluster of functionally related genes exhibit more similar phenotypes than expected (P < 0.05). The spanning of multiple functionally similar genes by single CNVs contributes substantially to how these variants exert their pathogenic effects. PMID:25887030

  15. An Effective Tri-Clustering Algorithm Combining Expression Data with Gene Regulation Information

    PubMed Central

    Li, Ao; Tuck, David

    2009-01-01

    Motivation Bi-clustering algorithms aim to identify sets of genes sharing similar expression patterns across a subset of conditions. However direct interpretation or prediction of gene regulatory mechanisms may be difficult as only gene expression data is used. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region and thus control the expression level of a gene. Thus a method to integrate gene expression and gene regulation information is desirable for clustering and analyzing. Methods By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. Existing bi-clustering methods are extended to a three dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach named Automatic Boundary Searching algorithm (ABS) is introduced to automatically determine the boundary threshold. Results Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows genes in this cluster exhibited significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network demonstrated evidence of significant changes of TF activities during the different stages of yeast sporulation, and suggests this approach might be a general way to study regulatory networks undergoing transformations. PMID:19838334

  16. Transcriptome Analysis of Aspergillus flavus Reveals veA-Dependent Regulation of Secondary Metabolite Gene Clusters, Including the Novel Aflavarin Cluster

    PubMed Central

    Cary, J. W.; Han, Z.; Yin, Y.; Lohmar, J. M.; Shantappa, S.; Harris-Coward, P. Y.; Mack, B.; Ehrlich, K. C.; Wei, Q.; Arroyo-Manzanares, N.; Uka, V.; Vanhaecke, L.; Bhatnagar, D.; Yu, J.; Nierman, W. C.; Johns, M. A.; Sorensen, D.; Shen, H.; De Saeger, S.; Diana Di Mavungu, J.

    2015-01-01

    The global regulatory veA gene governs development and secondary metabolism in numerous fungal species, including Aspergillus flavus. This is especially relevant since A. flavus infects crops of agricultural importance worldwide, contaminating them with potent mycotoxins. The most well-known are aflatoxins, which are cytotoxic and carcinogenic polyketide compounds. The production of aflatoxins and the expression of genes implicated in the production of these mycotoxins are veA dependent. The genes responsible for the synthesis of aflatoxins are clustered, a signature common for genes involved in fungal secondary metabolism. Studies of the A. flavus genome revealed many gene clusters possibly connected to the synthesis of secondary metabolites. Many of these metabolites are still unknown, or the association between a known metabolite and a particular gene cluster has not yet been established. In the present transcriptome study, we show that veA is necessary for the expression of a large number of genes. Twenty-eight out of the predicted 56 secondary metabolite gene clusters include at least one gene that is differentially expressed depending on presence or absence of veA. One of the clusters under the influence of veA is cluster 39. The absence of veA results in a downregulation of the five genes found within this cluster. Interestingly, our results indicate that the cluster is expressed mainly in sclerotia. Chemical analysis of sclerotial extracts revealed that cluster 39 is responsible for the production of aflavarin. PMID:26209694

  17. Leveraging long sequencing reads to investigate R-gene clustering and variation in sugar beet

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Host-pathogen interactions are of prime importance to modern agriculture. Plants utilize various types of resistance genes to mitigate pathogen damage. Identification of the specific gene responsible for a specific resistance can be difficult due to duplication and clustering within R-gene families....

  18. Cloning and Heterologous Expression of the Thioviridamide Biosynthesis Gene Cluster from Streptomyces olivoviridis

    PubMed Central

    Izawa, Masumi; Kawasaki, Takashi

    2013-01-01

    Thioviridamide is a unique peptide antibiotic containing five thioamide bonds from Streptomyces olivoviridis. Draft genome sequencing revealed a gene (the tvaA gene) encoding the thioviridamide precursor peptide. The thioviridamide biosynthesis gene cluster was identified by heterologous production of thioviridamide in Streptomyces lividans. PMID:23995943

  19. CLONING AND EXPRESSION OF THE CATA AND CATBC GENE CLUSTERS FROM PSEUDOMONAS AERUGINOSA PAO

    EPA Science Inventory

    A 9.9-kilobase (kb) BamHI restriction endonuclease fragment encoding the catA and catBC gene clusters was selected from a gene bank of the Pseudomonas aeruginosa PAO1c chromosome. The catA, catB, and catC genes encode enzymes that catalyze consecutive reactions in the catechol bra...

  20. Characterization of a plasmid-encoded urease gene cluster found in members of the family Enterobacteriaceae.

    PubMed

    D'Orazio, S E; Collins, C M

    1993-03-01

    Plasmid-encoded urease gene clusters found in uropathogenic isolates of Escherichia coli, Providencia stuartii, and Salmonella cubana demonstrated DNA homology, similar positions of restriction endonuclease cleavage sites, and manners of urease expression and therefore represent the same locus. DNA sequence analysis indicated that the plasmid-encoded urease genes are closely related to the Proteus mirabilis urease genes. PMID:8449894

  1. A rough set based rational clustering framework for determining correlated genes.

    PubMed

    Jeyaswamidoss, Jeba Emilyn; Thangaraj, Kesavan; Ramar, Kadarkarai; Chitra, Muthusamy

    2016-06-01

    Cluster analysis plays a foremost role in identifying groups of genes that show similar behavior under a set of experimental conditions. Several clustering algorithms have been proposed for identifying gene behaviors and to understand their significance. The principal aim of this work is to develop an intelligent rough clustering technique, which will efficiently remove the irrelevant dimensions in a high-dimensional space and obtain appropriate meaningful clusters. This paper proposes a novel biclustering technique that is based on rough set theory. The proposed algorithm uses correlation coefficient as a similarity measure to simultaneously cluster both the rows and columns of a gene expression data matrix and mean squared residue to generate the initial biclusters. Furthermore, the biclusters are refined to form the lower and upper boundaries by determining the membership of the genes in the clusters using mean squared residue. The algorithm is illustrated with yeast gene expression data and the experiment proves the effectiveness of the method. The main advantage is that it overcomes the problem of selection of initial clusters and also the restriction of one object belonging to only one cluster by allowing overlapping of biclusters. PMID:27352972
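
    The mean squared residue mentioned in this abstract scores how well a bicluster fits an additive row-plus-column pattern. The sketch below uses the usual definition from the biclustering literature (Cheng and Church); how the paper combines it with rough sets is not described in the abstract, so the function name and test matrices are assumptions for illustration only.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by row and column indices.

    residue(i, j) = a_ij - rowmean_i - colmean_j + overall_mean
    A score of 0 means the submatrix is perfectly additive.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Rows that differ only by a constant shift form a coherent bicluster.
coherent = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
score_coherent = mean_squared_residue(coherent, rows=[0, 1, 2], cols=[0, 1, 2])

# An anti-correlated pattern scores higher.
incoherent = [[1.0, 0.0], [0.0, 1.0]]
score_incoherent = mean_squared_residue(incoherent, rows=[0, 1], cols=[0, 1])
```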

  2. Hox gene clusters of early vertebrates: do they serve as reliable markers for genome evolution?

    PubMed

    Kuraku, Shigehiro

    2011-06-01

    Hox genes, responsible for regional specification along the anteroposterior axis in embryogenesis, are found as clusters in most eumetazoan genomes sequenced to date. Invertebrates possess a single Hox gene cluster with some exceptions of secondary cluster breakages, while osteichthyans (bony vertebrates) have multiple Hox clusters. In tetrapods, four Hox clusters, derived from the so-called two-round whole genome duplications (2R-WGDs), are observed. Overall, the number of Hox gene clusters has been regarded as a reliable marker of ploidy levels in animal genomes. In fact, this scheme also fits the situations in teleost fishes that experienced an additional WGD. In this review, I focus on cyclostomes and cartilaginous fishes as lineages that would fill the gap between invertebrates and osteichthyans. A recent study highlighted a possible loss of the HoxC cluster in the galeomorph shark lineage, while other aspects of cartilaginous fish Hox clusters usually mark their conserved nature. In contrast, existing resources suggest that the cyclostomes exhibit a different mode of Hox cluster organization. For this group of species, whose genomes could have responded to the 2R-WGDs differently from those of jawed vertebrates, the number of Hox clusters may therefore not serve as a good indicator of their ploidy level. PMID:21802046

  3. Horizontal Transfer of a Nitrate Assimilation Gene Cluster and Ecological Transitions in Fungi: A Phylogenetic Study

    PubMed Central

    Slot, Jason C.; Hibbett, David S.

    2007-01-01

    High affinity nitrate assimilation genes in fungi occur in a cluster (fHANT-AC) that can be coordinately regulated. The clustered genes include nrt2, which codes for a high affinity nitrate transporter; euknr, which codes for nitrate reductase; and NAD(P)H-nir, which codes for nitrite reductase. Homologs of genes in the fHANT-AC occur in other eukaryotes and prokaryotes, but they have only been found clustered in the oomycete Phytophthora (heterokonts). We performed independent and concatenated phylogenetic analyses of homologs of all three genes in the fHANT-AC. Phylogenetic analyses limited to fungal sequences suggest that the fHANT-AC has been transferred horizontally from a basidiomycete (mushrooms and smuts) to an ancestor of the ascomycetous mold Trichoderma reesei. Phylogenetic analyses of sequences from diverse eukaryotes and eubacteria, and cluster structure, are consistent with a hypothesis that the fHANT-AC was assembled in a lineage leading to the oomycetes and was subsequently transferred to the Dikarya (Ascomycota+Basidiomycota), which is a derived fungal clade that includes the vast majority of terrestrial fungi. We propose that the acquisition of high affinity nitrate assimilation contributed to the success of Dikarya on land by allowing exploitation of nitrate in aerobic soils, and the subsequent transfer of a complete assimilation cluster improved the fitness of T. reesei in a new niche. Horizontal transmission of this cluster of functionally integrated genes supports the “selfish operon” hypothesis for maintenance of gene clusters. PMID:17971860

  4. Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering

    PubMed Central

    de Brevern, Alexandre G; Hazout, Serge; Malpertuy, Alain

    2004-01-01

    Background Microarray technologies produce large amounts of data. Hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray datasets often contain missing values (MVs), a major drawback for the use of clustering methods. Usually the MVs are left untreated, replaced by zero, or estimated by the k-Nearest Neighbor (kNN) approach. The topic of the paper is the stability of gene clusters, defined by various hierarchical clustering algorithms, for microarray experiments with or without MVs. Results In this study, we show that the MVs have important effects on the stability of the gene clusters. Moreover, the magnitude of the gene misallocations depends on the aggregation algorithm. The most appropriate aggregation methods (e.g. complete-linkage and Ward) are highly sensitive to MVs, surprisingly even at a very small proportion of MVs (e.g. 1%). In most cases, the MVs must be replaced by expected values. Replacing the MVs by the kNN approach clearly improves the identification of co-expressed gene clusters. Nevertheless, we observe that the kNN approach is less suitable for the extreme values of gene expression. Conclusion The presence of MVs (even at a low rate) is a major factor of gene cluster instability. In addition, the impact depends on the hierarchical clustering algorithm used. Some methods should be used carefully. Nevertheless, the kNN approach constitutes one efficient method for restoring the missing gene expression values, with a low error level. Our study highlights the need for statistical treatment of microarray data to avoid misinterpretation. PMID:15324460
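
    The kNN replacement of missing values described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: missing values are encoded as None, distance is the root mean squared difference over conditions observed in both genes, and the imputed value is the unweighted mean of the k nearest genes. The abstract does not specify the distance, weighting, or k used in the study, and knn_impute is a hypothetical name.

```python
import math


def knn_impute(profiles, k=2):
    """Fill None entries with the mean of the k most similar genes.

    profiles: list of per-gene expression lists, None marking a missing value.
    Returns a new list; the input is left unchanged.
    """
    def distance(a, b):
        # Compare only the conditions measured in both genes.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(p) for p in profiles]
    for g, profile in enumerate(profiles):
        for c, value in enumerate(profile):
            if value is not None:
                continue
            # Candidate donors: other genes with a measured value in condition c.
            candidates = sorted(
                (distance(profile, other), other[c])
                for h, other in enumerate(profiles)
                if h != g and other[c] is not None
            )
            donors = [v for d, v in candidates[:k] if d != math.inf]
            if donors:
                imputed[g][c] = sum(donors) / len(donors)
    return imputed


expression = [
    [1.0, 2.0, None],  # gene with one missing value
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],   # dissimilar gene, ignored when k=2
]
filled = knn_impute(expression, k=2)
```

    Averaging over neighbours pulls imputed values toward the centre of the neighbourhood, which is consistent with the abstract's remark that kNN is less suitable for extreme expression values.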

  5. Epigenetic regulation of the RHOX homeobox gene cluster and its association with human male infertility

    PubMed Central

    Richardson, Marcy E.; Bleiziffer, Andreas; Tüttelmann, Frank; Gromoll, Jörg; Wilkinson, Miles F.

    2014-01-01

    The X-linked RHOX cluster encodes a set of homeobox genes that are selectively expressed in the reproductive tract. Members of the RHOX cluster regulate target genes important for spermatogenesis and promote male fertility in mice. Studies show that demethylating agents strongly upregulate the expression of mouse Rhox genes, suggesting that they are regulated by DNA methylation. However, whether this extends to human RHOX genes, whether DNA methylation directly regulates RHOX gene transcription and how this relates to human male infertility are unknown. To address these issues, we first defined the promoter regions of human RHOX genes and performed gain- and loss-of-function experiments to determine whether human RHOX gene transcription is regulated by DNA methylation. Our results indicated that DNA methylation is necessary and sufficient to silence human RHOX gene expression. To determine whether RHOX cluster methylation associates with male infertility, we evaluated the methylation status of RHOX genes in sperm from a large cohort of infertility patients. Linear regression analysis revealed a strong association between RHOX gene cluster hypermethylation and three independent types of semen abnormalities. Hypermethylation was restricted specifically to the RHOX cluster; we did not observe it in genes immediately adjacent to it on the X chromosome. Our results strongly suggest that human RHOX homeobox genes are under an epigenetic control mechanism that is aberrantly regulated in infertility patients. We propose that hypermethylation of the RHOX gene cluster serves as a marker for idiopathic infertility and that it is a candidate to exert a causal role in male infertility. PMID:23943794

  6. A cross-species bi-clustering approach to identifying conserved co-regulated genes

    PubMed Central

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-01-01

    Motivation: A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective in dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. Results: We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on

  7. Picocyanobacteria containing a novel pigment gene cluster dominate the brackish water Baltic Sea

    PubMed Central

    Larsson, John; Celepli, Narin; Ininbergs, Karolina; Dupont, Christopher L; Yooseph, Shibu; Bergman, Birgitta; Ekman, Martin

    2014-01-01

    Photoautotrophic picocyanobacteria harvest light via phycobilisomes (PBS) consisting of the pigments phycocyanin (PC) and phycoerythrin (PE), encoded by genes in conserved gene clusters. The presence and arrangement of these gene clusters give picocyanobacteria characteristic light absorption properties and allow the colonization of specific ecological niches. To date, a full understanding of the evolution and distribution of the PBS gene cluster in picocyanobacteria has been hampered by the scarcity of genome sequences from fresh- and brackish water-adapted strains. To remediate this, we analysed genomes assembled from metagenomic samples collected along a natural salinity gradient, and over the course of a growth season, in the Baltic Sea. We found that while PBS gene clusters in picocyanobacteria sampled in marine habitats were highly similar to known references, brackish-adapted genotypes harboured a novel type not seen in previously sequenced genomes. Phylogenetic analyses showed that the novel gene cluster belonged to a clade of uncultivated picocyanobacteria that dominate the brackish Baltic Sea throughout the summer season, but are uncommon in other examined aquatic ecosystems. Further, our data suggest that the PE genes were lost in the ancestor of PC-containing coastal picocyanobacteria and that multiple horizontal gene transfer events have re-introduced PE genes into brackish-adapted strains, including the novel clade discovered here. PMID:24621524

  8. Mapping the chromosome 16 cadherin gene cluster to a minimal deleted region in ductal breast cancer.

    PubMed

    Chalmers, I J; Aubele, M; Hartmann, E; Braungart, E; Werner, M; Höfler, H; Atkinson, M J

    2001-04-01

    The cadherin family of cell adhesion molecules has been implicated in tumor metastasis and progression. Eight family members have been mapped to the long arm of chromosome 16. Using radiation hybrid mapping, we have located six of these genes within a cluster at 16q21-q22.1. In invasive lobular carcinoma of the breast frequent LOH and accompanying mutation affect the CDH1 gene, which is a member of this chromosome 16 gene cluster. CDH1 LOH also occurs in invasive ductal carcinoma, but in the absence of gene mutation. The proximity of other cadherin genes to 16q22.1 suggests that they may be affected by LOH in invasive ductal carcinomas. Using the mapping data, microsatellite markers were selected which span regions of chromosome 16 containing the cadherin genes. In breast cancer tissues, a high rate of allelic loss was found over the gene cluster region, with CDH1 being the most frequently lost marker. In invasive ductal carcinoma a minimal deleted region was identified within part of the chromosome 16 cadherin gene cluster. This provides strong evidence for the existence of a second 16q22 suppressor gene locus within the cadherin cluster. PMID:11343777

  9. Picocyanobacteria containing a novel pigment gene cluster dominate the brackish water Baltic Sea.

    PubMed

    Larsson, John; Celepli, Narin; Ininbergs, Karolina; Dupont, Christopher L; Yooseph, Shibu; Bergman, Birgitta; Ekman, Martin

    2014-09-01

    Photoautotrophic picocyanobacteria harvest light via phycobilisomes (PBS) consisting of the pigments phycocyanin (PC) and phycoerythrin (PE), encoded by genes in conserved gene clusters. The presence and arrangement of these gene clusters give picocyanobacteria characteristic light absorption properties and allow the colonization of specific ecological niches. To date, a full understanding of the evolution and distribution of the PBS gene cluster in picocyanobacteria has been hampered by the scarcity of genome sequences from fresh- and brackish water-adapted strains. To remediate this, we analysed genomes assembled from metagenomic samples collected along a natural salinity gradient, and over the course of a growth season, in the Baltic Sea. We found that while PBS gene clusters in picocyanobacteria sampled in marine habitats were highly similar to known references, brackish-adapted genotypes harboured a novel type not seen in previously sequenced genomes. Phylogenetic analyses showed that the novel gene cluster belonged to a clade of uncultivated picocyanobacteria that dominate the brackish Baltic Sea throughout the summer season, but are uncommon in other examined aquatic ecosystems. Further, our data suggest that the PE genes were lost in the ancestor of PC-containing coastal picocyanobacteria and that multiple horizontal gene transfer events have re-introduced PE genes into brackish-adapted strains, including the novel clade discovered here. PMID:24621524

  10. Sequence analysis of a cluster of twenty-one tRNA genes in Bacillus subtilis.

    PubMed Central

    Green, C J; Vold, B S

    1983-01-01

    The DNA sequence of a cluster of twenty-one tRNA genes distal to a rRNA gene set in B. subtilis was determined. None of the tRNA genes are repeated in the sequence. The only classes of tRNAs that are not represented are those for cysteine, glutamine, tryptophan, and tyrosine. Three of the tRNA genes in this cluster do not have the 3'-CCA sequence encoded in the gene. There is no RNA polymerase terminator sequence in the region between the 5S gene and the first tRNA gene or within the tRNA gene cluster. A terminator sequence was found directly after the last tRNA gene. This rRNA and tRNA gene cluster probably represents one transcriptional unit. However, there may be an RNA polymerase promoter site within this sequence, which raises some interesting questions concerning the regulation of transcription for these tRNA genes. PMID:6310512

  11. The nonribosomal peptide and polyketide synthetic gene clusters in two strains of entomopathogenic fungi in Cordyceps.

    PubMed

    Wang, Wen-Jing; Vogel, Heiko; Yao, Yi-Jian; Ping, Liyan

    2012-11-01

    Species of Cordyceps Fr. are entomopathogenic fungi that parasitize the larvae or pupae of lepidopteran insects. The secondary metabolites, nonribosomal peptides and polyketides are well-known mediators of pathogenesis. The biosynthetic gene clusters of these compounds in two fungal strains (1630 and DSM 1153) formerly known as Cordyceps militaris were screened using polymerase chain reaction with degenerate primers. Two nonribosomal peptide synthetase genes, one polyketide synthetase gene and one hybrid gene cluster were identified, and certain characteristics of the structures of their potential products were predicted. All four genes were actively expressed under laboratory conditions but at markedly different levels. The gene clusters from the two fungal strains were structurally and functionally unrelated, suggesting different evolutionary origins and physiological functions. Phylogenetic and biochemical analyses confirmed that the two fungal strains are not conspecific as currently assigned. PMID:22889355

  12. Identification of a 12-gene fusaric acid biosynthetic gene cluster in Fusarium species through comparative and functional genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In fungi, genes involved in biosynthesis of a secondary metabolite (SM) are often located adjacent to one another in the genome and are coordinately regulated. These SM biosynthetic gene clusters typically encode enzymes, one or more transcription factors, and a transport protein. Fusaric acid is a ...

  13. Variability in mycotoxin biosynthetic genes and gene clusters in Fusarium and its implications for mycotoxin contamination of crops

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Fusarium metabolites fumonisins and trichothecenes are among the mycotoxins of greatest concern to food and feed safety worldwide. As with other fungal secondary metabolites, mycotoxin biosynthetic genes are often located adjacent to one another in gene clusters. Thus, fumonisin biosynthetic gen...

  14. The Biosynthetic Gene Cluster of Zorbamycin, a Member of the Bleomycin Family of Antitumor Antibiotics, from Streptomyces flavoviridis ATCC 21892

    PubMed Central

    Galm, Ute; Wendt-Pienkowski, Evelyn; Wang, Liyan; George, Nicholas P.; Oh, Tae-Jin; Yi, Fan; Tao, Meifeng; Coughlin, Jane M.; Shen, Ben

    2011-01-01

    The biosynthetic gene cluster for the glycopeptide-derived antitumor antibiotic zorbamycin (ZBM) was cloned by screening a cosmid library of Streptomyces flavoviridis ATCC 21892. Sequence analysis revealed 40 ORFs belonging to the ZBM biosynthetic gene cluster. However, only 23 and 22 ORFs showed striking similarities to the biosynthetic gene clusters for the bleomycins (BLMs) and tallysomycins (TLMs), respectively; the remaining ORFs do not show significant homology to ORFs from the related BLM and TLM clusters. The ZBM gene cluster consists of 16 nonribosomal peptide synthetase (NRPS) genes encoding eight complete NRPS modules, three incomplete didomain NRPS modules, and eight freestanding single NRPS domains or associated enzymes, a polyketide synthase (PKS) gene encoding one PKS module, six sugar biosynthesis genes, as well as genes encoding other biosynthesis and resistance proteins. A genetic system using Escherichia coli-Streptomyces flavoviridis intergeneric conjugation was developed to enable ZBM gene cluster boundary determinations and biosynthetic pathway manipulations. PMID:19081934

  15. Classification of Arabidopsis thaliana gene sequences: clustering of coding sequences into two groups according to codon usage improves gene prediction.

    PubMed

    Mathé, C; Peresetsky, A; Déhais, P; Van Montagu, M; Rouzé, P

    1999-02-01

    While genomic sequences are accumulating, finding the location of the genes remains a major issue that can be solved only for about a half of them by homology searches. Prediction methods are thus required, but unfortunately are not fully satisfying. Most prediction methods implicitly assume a unique model for genes. This is an oversimplification as demonstrated by the possibility to group coding sequences into several classes in Escherichia coli and other genomes. As no classification existed for Arabidopsis thaliana, we classified genes according to the statistical features of their coding sequences. A clustering algorithm using a codon usage model was developed and applied to coding sequences from A. thaliana, E. coli, and a mixture of both. By using it, Arabidopsis sequences were clustered into two classes. The CU1 and CU2 classes differed essentially by the choice of pyrimidine bases at the codon silent sites: CU2 genes often use C whereas CU1 genes prefer T. This classification discriminated the Arabidopsis genes according to their expressiveness, highly expressed genes being clustered in CU2 and genes expected to have a lower expression, such as the regulatory genes, in CU1. The algorithm separated the sequences of the Escherichia-Arabidopsis mixed data set into five classes according to the species, except for one class. This mixed class contained 89 % Arabidopsis genes from CU1 and 11 % E. coli genes, mostly horizontally transferred. Interestingly, most genes encoding organelle-targeted proteins, except the photosynthetic and photoassimilatory ones, were clustered in CU1. By tailoring the GeneMark CDS prediction algorithm to the observed coding sequence classes, its quality of prediction was greatly improved. Similar improvement can be expected with other prediction systems. PMID:9925779
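
    The CU1/CU2 distinction described above (C versus T at silent codon sites) can be illustrated with a short sketch. It is not the clustering algorithm of the paper: it treats every third codon position as a rough proxy for a silent site, and the function names and the 0.5 threshold are hypothetical choices made only for illustration.

```python
from collections import Counter


def silent_site_c_fraction(cds: str) -> float:
    """Fraction of C among pyrimidines (C or T) at third codon positions.

    Assumption: third positions stand in for "silent sites"; this is only
    approximately true, since some third-position changes alter the amino acid.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third = Counter(cds[i + 2] for i in range(0, usable, 3))
    pyrimidines = third["C"] + third["T"]
    if pyrimidines == 0:
        raise ValueError("no C or T found at third codon positions")
    return third["C"] / pyrimidines


def assign_class(cds: str, threshold: float = 0.5) -> str:
    """Label a coding sequence by its pyrimidine preference.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    if silent_site_c_fraction(cds) >= threshold:
        return "CU2-like"  # C-preferring
    return "CU1-like"  # T-preferring


print(assign_class("GCCGCCGCT"))  # third positions: C, C, T
print(assign_class("GCTGCTGCC"))  # third positions: T, T, C
```

    A real classifier would work from a full codon usage model rather than a single ratio; the sketch only shows the kind of signal that separates the two classes.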

  16. Characterization of the fumonisin B2 biosynthetic gene cluster in Aspergillus niger and A. awamori.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Aspergillus niger and A. awamori strains isolated from grapes cultivated in Mediterranean basin were examined for fumonisin B2 (FB2) production and presence/absence of sequences within the fumonisin biosynthetic gene (fum) cluster. Presence of 13 regions in the fum cluster was evaluated by PCR assay...

  17. Clusters of antibiotic resistance genes enriched together stay together in swine agriculture

    DOE PAGESBeta

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; Cole, James R.; Hashsham, Syed A.; Looft, Torey; Zhu, Yong -Guan; Tiedje, James M.

    2016-04-12

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. The use of a single antibiotic could select for an entire suite of resistance

  18. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    SciTech Connect

    Data Analysis and Visualization and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  19. A Stationary Wavelet Entropy-Based Clustering Approach Accurately Predicts Gene Expression

    PubMed Central

    Nguyen, Nha; Vo, An; Choi, Inchan

    2015-01-01

    Studying epigenetic landscapes is important to understand the condition for gene regulation. Clustering is a useful approach to study epigenetic landscapes by grouping genes based on their epigenetic conditions. However, classical clustering approaches that often use a representative value of the signals in a fixed-sized window do not fully use the information written in the epigenetic landscapes. Clustering approaches to maximize the information of the epigenetic signals are necessary for better understanding gene regulatory environments. For effective clustering of multidimensional epigenetic signals, we developed a method called Dewer, which uses the entropy of stationary wavelet of epigenetic signals inside enriched regions for gene clustering. Interestingly, the gene expression levels were highly correlated with the entropy levels of epigenetic signals. Dewer separates genes better than a window-based approach in the assessment using gene expression and achieved a correlation coefficient above 0.9 without using any training procedure. Our results show that the changes of the epigenetic signals are useful to study gene regulation. PMID:25383910
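
    The idea of scoring a signal by the entropy of its stationary wavelet coefficients can be sketched as follows. This is not the Dewer implementation, whose exact definition the abstract does not give. The sketch assumes an undecimated Haar transform with circular boundaries and uses one common definition of wavelet entropy: the Shannon entropy of the relative detail energy per level. The function names are hypothetical.

```python
import math


def haar_swt_details(signal, levels):
    """Detail coefficients of an undecimated (stationary) Haar transform.

    Assumption: circular boundary handling; level j compares samples
    that are 2**j positions apart.
    """
    approx = [float(x) for x in signal]
    n = len(approx)
    details = []
    for j in range(levels):
        step = 2 ** j
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([(a - b) / 2.0 for a, b in zip(approx, shifted)])
        approx = [(a + b) / 2.0 for a, b in zip(approx, shifted)]
    return details


def wavelet_entropy(signal, levels=3):
    """Shannon entropy (natural log) of relative detail energy across levels."""
    energies = [sum(d * d for d in level)
                for level in haar_swt_details(signal, levels)]
    total = sum(energies)
    if total == 0:
        return 0.0  # a flat signal carries no detail energy
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0)


flat = [5, 5, 5, 5, 5, 5, 5, 5]
stepped = [0, 0, 4, 4, 0, 0, 4, 4]
print(wavelet_entropy(flat), wavelet_entropy(stepped))
```

    Under this definition a signal whose variation is spread over several scales scores higher than a flat signal or one that varies at a single scale only.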

  20. Molecular Cloning and Physical Mapping of the Daptomycin Gene Cluster from Streptomyces roseosporus

    PubMed Central

    Mchenney, Margaret A.; Hosted, Thomas J.; Dehoff, Bradley S.; Rosteck, Paul R.; Baltz, Richard H.

    1998-01-01

    The daptomycin biosynthetic gene cluster of Streptomyces roseosporus was analyzed by Tn5099 mutagenesis, molecular cloning, partial DNA sequencing, and insertional mutagenesis with cloned segments of DNA. The daptomycin biosynthetic gene cluster spans at least 50 kb and is located about 400 to 500 kb from one end of the ∼7,100-kb linear chromosome. We identified two peptide synthetase coding regions interrupted by a 10- to 20-kb region that may encode other functions in lipopeptide biosynthesis. PMID:9422604

  1. Regulation of transcription of cell division genes in the Escherichia coli dcw cluster.

    PubMed

    Vicente, M; Gomez, M J; Ayala, J A

    1998-04-01

    The Escherichia coli dcw cluster contains cell division genes, such as the phylogenetically ubiquitous ftsZ, and genes involved in peptidoglycan synthesis. Transcription in the cluster proceeds in the same direction as the progress of the replication fork along the chromosome. Regulation is exerted at the transcriptional and post-transcriptional levels. The absence of transcriptional termination signals may, in principle, allow extension of the transcripts initiated at the up-stream promoter (mraZ1p) even to the furthest down-stream gene (envA). Complementation tests suggest that they extend into ftsW in the central part of the cluster. In addition, the cluster contains other promoters individually regulated by cis- and trans-acting signals. Dissociation of the expression of the ftsZ gene, located after ftsQ and A near the 3' end of the cluster, from its natural regulatory signals leads to an alteration in the physiology of cell division. The complexities observed in the regulation of gene expression in the cluster may then have an important biological role. Among them, LexA-binding SOS boxes have been found at the 5' end of the cluster, preceding promoters which direct the expression of ftsI (coding for PBP3, the penicillin-binding protein involved in septum formation). A gearbox promoter, ftsQ1p, forms part of the signals regulating the transcription of ftsQ, A and Z. It is an inversely growth-dependent mechanism driven by RNA polymerase containing sigma s, the factor involved in the expression of stationary phase-specific genes. Although the dcw cluster is conserved to a different extent in a variety of bacteria, the regulation of gene expression, the presence or absence of individual genes, and even the essentiality of some of them, show variations in the phylogenetic scale which may reflect adaptation to specific life cycles. PMID:9614967

  2. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    SciTech Connect

    Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew; Rowen, Lee; Nesbitt, Ryan; Bloom, Scott; Rast, Jonathan P.; Berney, Kevin; Arenas-Mena, Cesar; Martinez, Pedro; Davidson, Eric H.; Peterson, Kevin J.; Hood, Leroy

    2005-05-10

    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  3. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    SciTech Connect

    Cameron, R A; Rowen, L; Nesbitt, R; Bloom, S; Rast, J P; Berney, K; Arenas-Mena, C; Martinez, P; Lucas, S; Richardson, P M; Davidson, E H; Peterson, K J; Hood, L

    2005-10-11

    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  4. Fine genetic mapping localizes cucumber scab resistance gene Ccu into an R gene cluster.

    PubMed

    Kang, Houxiang; Weng, Yiqun; Yang, Yuhong; Zhang, Zhonghua; Zhang, Shengping; Mao, Zhenchuan; Cheng, Guohua; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan

    2011-03-01

    Scab, caused by Cladosporium cucumerinum, is an important disease of cucumber, Cucumis sativus. In this study, we conducted fine genetic mapping of the single dominant scab resistance gene, Ccu, with 148 F(9) recombinant inbred lines (RILs) and 1,944 F(2) plants derived from the resistant cucumber inbred line 9110Gt and the susceptible line 9930, whose draft genome sequence is now available. A framework linkage map was first constructed with simple sequence repeat markers placing Ccu into the terminal 670 kb region of cucumber Chromosome 2. The 9110Gt genome was sequenced at 5× genome coverage with the Solexa next-generation sequencing technology. Sequence analysis of the assembled 9110Gt contigs and the Ccu region of the 9930 genome identified three insertion/deletion (Indel) markers, Indel01, Indel02, and Indel03 that were closely linked with the Ccu locus. On the high-resolution map developed with the F(2) population, the two closest flanking markers, Indel01 and Indel02, were 0.14 and 0.15 cM away from the target gene Ccu, respectively, and the physical distance between the two markers was approximately 140 kb. Detailed annotation of the 180 kb region harboring the Ccu locus identified a cluster of six resistance gene analogs (RGAs) that belong to the nucleotide binding site (NBS) type R genes. Four RGAs were in the region delimited by markers Indel01 and Indel02, and thus were possible candidates of Ccu. Comparative DNA analysis of this cucumber Ccu gene region with a melon (C. melo) bacterial artificial chromosome (BAC) clone revealed a high degree of micro-synteny and conservation of the RGA tandem repeats in this region. PMID:21104067
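
    The mapping figures above imply a local ratio of physical to genetic distance, which a short calculation makes explicit. The sketch assumes that Indel01 and Indel02 lie on opposite sides of Ccu (as "flanking" indicates) and that map distances add over so small an interval; the function name is hypothetical.

```python
def kb_per_cm(physical_kb: float, left_cm: float, right_cm: float) -> float:
    """Physical distance per unit of genetic distance across a flanked interval.

    Assumption: the two markers sit on opposite sides of the target gene,
    so the genetic interval is the sum of the two marker-to-gene distances.
    """
    interval_cm = left_cm + right_cm
    if interval_cm <= 0:
        raise ValueError("genetic interval must be positive")
    return physical_kb / interval_cm


# Values from the abstract: 0.14 cM and 0.15 cM to Ccu, about 140 kb apart.
ratio = kb_per_cm(140.0, 0.14, 0.15)
print(f"{ratio:.0f} kb/cM")  # 140 / 0.29, about 483 kb/cM
```

    The result describes this interval only; recombination rates vary along a chromosome, so the ratio should not be extrapolated to other regions.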

  5. Evidence that a secondary metabolic biosynthetic gene cluster has grown by gene relocation during evolution of the filamentous fungus Fusarium.

    PubMed

    Proctor, Robert H; McCormick, Susan P; Alexander, Nancy J; Desjardins, Anne E

    2009-12-01

    Trichothecenes are terpene-derived secondary metabolites produced by multiple genera of filamentous fungi, including many plant pathogenic species of Fusarium. These metabolites are of interest because they are toxic to animals and plants and can contribute to pathogenesis of Fusarium on some crop species. Fusarium graminearum and F. sporotrichioides have trichothecene biosynthetic genes (TRI) at three loci: a 12-gene TRI cluster and two smaller TRI loci that consist of one or two genes. Here, comparisons of additional Fusarium species have provided evidence that TRI loci have a complex evolutionary history that has included loss, non-functionalization and rearrangement of genes as well as trans-species polymorphism. The results also indicate that the TRI cluster has expanded in some species by relocation of two genes into it from the smaller loci. Thus, evolutionary forces have driven consolidation of TRI genes into fewer loci in some fusaria but have maintained three distinct TRI loci in others. PMID:19843228

  6. Characterization of two acetyltransferase genes in the pyripyropene biosynthetic gene cluster from Penicillium coprobium

    PubMed Central

    Hu, Jie; Furutani, Ayako; Yamamoto, Kentaro; Oyama, Kazuhiko; Mitomi, Masaaki; Anzai, Hiroyuki

    2014-01-01

    Pyripyropenes potently and selectively inhibit acyl-CoA:cholesterol acyltransferase 2 (ACAT-2). Among multiple isomers of pyripyropene (A to R), pyripyropene A (PyA) has insecticidal properties in addition to its growth inhibition properties against human umbilical vein endothelial cells. Based on the predicted biosynthetic gene cluster of pyripyropene A, two genes (ppb8 and ppb9) encoding two acetyltransferases (ATs) were separately isolated and introduced into the model fungus Aspergillus oryzae, using the protoplast–polyethylene glycol method. The bioconversion of certain predicted intermediates in the transformants revealed the manner by which acetylation occurred in the biosynthetic pathway by the products expressed by these two genes (AT-1 and AT-2). The acetylated products detected by high-performance liquid chromatography (HPLC) in the extracts from AT-1 and AT-2 transformant clones were not present in the extract from the transformant clone with an empty vector. The HPLC charts of each bioconversion study exhibited high peaks at 12, 10.5 and 9 min, respectively. Further ultraviolet absorption and mass spectrometry analyses identified the products as PyE, PyO and PyA, respectively. AT-1 acetylated the C-1 of deacetyl-pyripyropene E (deAc-PyE), while AT-2 played an active role in acetylating the C-11 of 11-deAc-PyO and C-7 of deAc-PyA at two different steps of the biosynthetic pathway. PMID:26019565

  7. The gene cluster of aureocyclicin 4185: the first cyclic bacteriocin of Staphylococcus aureus.

    PubMed

    Potter, Amina; Ceotto, Hilana; Coelho, Marcus Lívio Varella; Guimarães, Allan J; Bastos, Maria do Carmo de Freire

    2014-05-01

    Staphylococcus aureus 4185 was previously shown to produce at least two bacteriocins. One of them is encoded by pRJ101. To detect the bacteriocin-encoding gene cluster, an ~9160 kb region of pRJ101 was sequenced. In silico analyses identified 10 genes (aclX, aclB, aclI, aclT, aclC, aclD, aclA, aclF, aclG and aclH) that might be involved in the production of a novel cyclic bacteriocin named aureocyclicin 4185. The organization of these genes was quite similar to that of the gene cluster responsible for carnocyclin A production and immunity. Four putative proteins encoded by these genes (AclT, AclC, AclD and AclA) also exhibited similarity to proteins encoded by cyclic bacteriocin gene clusters. Mutants derived from insertion of Tn917-lac into aclC, aclF, aclH and aclX were affected in bacteriocin production and growth. AclX is a 205 aa putative protein not encoded by the gene clusters of other cyclic bacteriocins. AclX exhibits 50 % similarity to a permease and has five putative membrane-spanning domains. Transcription analyses suggested that aclX is part of the aureocyclicin 4185 gene cluster, encoding a protein required for bacteriocin production. The aclA gene is the structural gene of aureocyclicin 4185, which shows 65 % similarity to garvicin ML. AclA is proposed to be cleaved off, generating a mature peptide with a predicted Mr of 5607 Da (60 aa). By homology modelling, AclA presents four α-helices, like carnocyclin A. AclA could not be found at detectable levels in the culture supernatant of a strain carrying only pRJ101. To our knowledge, this is the first report of a cyclic bacteriocin gene cluster in the genus Staphylococcus. PMID:24574434

  8. Nanoscale spatial organization of the HoxD gene cluster in distinct transcriptional states.

    PubMed

    Fabre, Pierre J; Benke, Alexander; Joye, Elisabeth; Nguyen Huynh, Thi Hanh; Manley, Suliana; Duboule, Denis

    2015-11-10

    Chromatin condensation plays an important role in the regulation of gene expression. Recently, it was shown that the transcriptional activation of Hoxd genes during vertebrate digit development involves modifications in 3D interactions within and around the HoxD gene cluster. This reorganization follows a global transition from one set of regulatory contacts to another, between two topologically associating domains (TADs) located on either side of the HoxD locus. Here, we use 3D DNA FISH to assess the spatial organization of chromatin at and around the HoxD gene cluster and report that although the two TADs are tightly associated, they appear as spatially distinct units. We measured the relative position of genes within the cluster and found that they segregate over long distances, suggesting that a physical elongation of the HoxD cluster can occur. We analyzed this possibility by super-resolution imaging (STORM) and found that tissues with distinct transcriptional activity exhibit differing degrees of elongation. We also observed that the morphological change of the HoxD cluster in developing digits is associated with its position at the boundary between the two TADs. Such variations in the fine-scale architecture of the gene cluster suggest causal links among its spatial configuration, transcriptional activation, and the flanking chromatin context. PMID:26504220

  9. Isolation and characterization of the gene cluster for biosynthesis of the thiopeptide antibiotic TP-1161.

    PubMed

    Engelhardt, Kerstin; Degnes, Kristin F; Zotchev, Sergey B

    2010-11-01

    Recently, we isolated a new thiopeptide antibiotic, TP-1161, from the fermentation broth of a marine actinomycete typed as a member of the genus Nocardiopsis. Here we report the identification, isolation, and analysis of the TP-1161 biosynthetic gene cluster from this species. The gene cluster was identified by mining a draft genome sequence using the predicted structural peptide sequence of TP-1161. Functional assignment of a ∼16-kb genomic region revealed 13 open reading frames proposed to constitute the TP-1161 biosynthetic locus. While the typical core set of thiopeptide modification enzymes contains one cyclodehydratase/dehydrogenase pair, paralogous genes predicted to encode additional cyclodehydratases and dehydrogenases were identified. Although attempts at heterologous expression of the TP-1161 gene cluster in Streptomyces coelicolor failed, its identity was confirmed through the targeted gene inactivation in the original host. PMID:20851988

  10. Chromosomal clustering and GATA transcriptional regulation of intestine-expressed genes in C. elegans.

    PubMed

    Pauli, Florencia; Liu, Yueyi; Kim, Yoona A; Chen, Pei-Jiun; Kim, Stuart K

    2006-01-01

    We used mRNA tagging to identify genes expressed in the intestine of C. elegans. Animals expressing an epitope-tagged protein that binds the poly-A tail of mRNAs (FLAG::PAB-1) from an intestine-specific promoter (ges-1) were used to immunoprecipitate FLAG::PAB-1/mRNA complexes from the intestine. A total of 1938 intestine-expressed genes (P<0.001) were identified using DNA microarrays. First, we compared the intestine-expressed genes with those expressed in the muscle and germline, and identified 510 genes enriched in all three tissues and 624 intestine-, 230 muscle- and 1135 germ line-enriched genes. Second, we showed that the 1938 intestine-expressed genes were physically clustered on the chromosomes, suggesting that the order of genes in the genome is influenced by the effect of chromatin domains on gene expression. Furthermore, the commonly expressed genes showed more chromosomal clustering than the tissue-enriched genes, suggesting that chromatin domains may influence housekeeping genes more than tissue-specific genes. Third, in order to gain further insight into the regulation of intestinal gene expression, we searched for regulatory motifs. This analysis found that the promoters of the intestine genes were enriched for the GATA transcription factor consensus binding sequence. We experimentally verified these results by showing that the GATA motif is required in cis and that GATA transcription factors are required in trans for expression of these intestinal genes. PMID:16354718

  11. Identification of a 12-gene Fusaric Acid Biosynthetic Gene Cluster in Fusarium Species Through Comparative and Functional Genomics.

    PubMed

    Brown, Daren W; Lee, Seung-Ho; Kim, Lee-Han; Ryu, Jae-Gee; Lee, Soohyung; Seo, Yunhee; Kim, Young Ho; Busman, Mark; Yun, Sung-Hwan; Proctor, Robert H; Lee, Theresa

    2015-03-01

    In fungi, genes involved in biosynthesis of a secondary metabolite (SM) are often located adjacent to one another in the genome and are coordinately regulated. These SM biosynthetic gene clusters typically encode enzymes, one or more transcription factors, and a transport protein. Fusaric acid is a polyketide-derived SM produced by multiple species of the fungal genus Fusarium. This SM is of concern because it is toxic to animals and, therefore, is considered a mycotoxin and may contribute to plant pathogenesis. Preliminary descriptions of the fusaric acid (FA) biosynthetic gene (FUB) cluster have been reported in two Fusarium species, the maize pathogen F. verticillioides and the rice pathogen F. fujikuroi. The cluster consisted of five genes and did not include a transcription factor or transporter gene. Here, analysis of the FUB region in F. verticillioides, F. fujikuroi, and F. oxysporum, a plant pathogen with multiple hosts, indicates the FUB cluster consists of at least 12 genes (FUB1 to FUB12). Deletion analysis confirmed that nine FUB genes, including two Zn(II)2Cys6 transcription factor genes, are required for production of wild-type levels of FA. Comparisons of FUB cluster homologs across multiple Fusarium isolates and species revealed insertion of non-FUB genes at one or two locations in some homologs. Although the ability to produce FA contributed to the phytotoxicity of F. oxysporum culture extracts, lack of production did not affect virulence of F. oxysporum on cactus or F. verticillioides on maize seedlings. These findings provide new insights into the genetic and biochemical processes required for FA production. PMID:25372119

  12. The Fumagillin Gene Cluster, an Example of Hundreds of Genes under veA Control in Aspergillus fumigatus

    PubMed Central

    Dhingra, Sourabh; Lind, Abigail L.; Lin, Hsiao-Ching; Tang, Yi; Rokas, Antonis; Calvo, Ana M.

    2013-01-01

    Aspergillus fumigatus is the causative agent of invasive aspergillosis, leading to infection-related mortality in immunocompromised patients. We previously showed that the conserved and unique-to-fungi veA gene affects different cell processes such as morphological development, gliotoxin biosynthesis and protease activity, suggesting a global regulatory effect on the genome of this medically relevant fungus. In this study, RNA sequencing analysis revealed that veA controls the expression of hundreds of genes in A. fumigatus, including those comprising more than a dozen known secondary metabolite gene clusters. Chemical analysis confirmed that veA controls the synthesis of other secondary metabolites in this organism in addition to gliotoxin. Among the secondary metabolite gene clusters regulated by veA is the elusive but recently identified gene cluster responsible for the biosynthesis of fumagillin, a meroterpenoid known for its anti-angiogenic activity by binding to human methionine aminopeptidase 2. The fumagillin gene cluster contains a veA-dependent regulatory gene, fumR (Afu8g00420), encoding a putative C6 type transcription factor. Deletion of fumR results in silencing of the gene cluster and elimination of fumagillin biosynthesis. We found expression of fumR to also be dependent on laeA, a gene encoding another component of the fungal velvet complex. The results in this study argue that veA is a global regulator of secondary metabolism in A. fumigatus, and that veA may be a conduit via which chemical development is coupled to morphological development and other cellular processes. PMID:24116213

  13. Two Horizontally Transferred Xenobiotic Resistance Gene Clusters Associated with Detoxification of Benzoxazolinones by Fusarium Species

    PubMed Central

    Glenn, Anthony E.; Davis, C. Britton; Gao, Minglu; Gold, Scott E.; Mitchell, Trevor R.; Proctor, Robert H.; Stewart, Jane E.; Snook, Maurice E.

    2016-01-01

    Microbes encounter a broad spectrum of antimicrobial compounds in their environments and often possess metabolic strategies to detoxify such xenobiotics. We have previously shown that Fusarium verticillioides, a fungal pathogen of maize known for its production of fumonisin mycotoxins, possesses two unlinked loci, FDB1 and FDB2, necessary for detoxification of antimicrobial compounds produced by maize, including the γ-lactam 2-benzoxazolinone (BOA). In support of these earlier studies, microarray analysis of F. verticillioides exposed to BOA identified the induction of multiple genes at FDB1 and FDB2, indicating the loci consist of gene clusters. One of the FDB1 cluster genes encoded a protein having domain homology to the metallo-β-lactamase (MBL) superfamily. Deletion of this gene (MBL1) rendered F. verticillioides incapable of metabolizing BOA and thus unable to grow on BOA-amended media. Deletion of other FDB1 cluster genes, in particular AMD1 and DLH1, did not affect BOA degradation. Phylogenetic analyses and topology testing of the FDB1 and FDB2 cluster genes suggested two horizontal transfer events among fungi, one being transfer of FDB1 from Fusarium to Colletotrichum, and the second being transfer of the FDB2 cluster from Fusarium to Aspergillus. Together, the results suggest that plant-derived xenobiotics have exerted evolutionary pressure on these fungi, leading to horizontal transfer of genes that enhance fitness or virulence. PMID:26808652

  14. Two Horizontally Transferred Xenobiotic Resistance Gene Clusters Associated with Detoxification of Benzoxazolinones by Fusarium Species.

    PubMed

    Glenn, Anthony E; Davis, C Britton; Gao, Minglu; Gold, Scott E; Mitchell, Trevor R; Proctor, Robert H; Stewart, Jane E; Snook, Maurice E

    2016-01-01

    Microbes encounter a broad spectrum of antimicrobial compounds in their environments and often possess metabolic strategies to detoxify such xenobiotics. We have previously shown that Fusarium verticillioides, a fungal pathogen of maize known for its production of fumonisin mycotoxins, possesses two unlinked loci, FDB1 and FDB2, necessary for detoxification of antimicrobial compounds produced by maize, including the γ-lactam 2-benzoxazolinone (BOA). In support of these earlier studies, microarray analysis of F. verticillioides exposed to BOA identified the induction of multiple genes at FDB1 and FDB2, indicating the loci consist of gene clusters. One of the FDB1 cluster genes encoded a protein having domain homology to the metallo-β-lactamase (MBL) superfamily. Deletion of this gene (MBL1) rendered F. verticillioides incapable of metabolizing BOA and thus unable to grow on BOA-amended media. Deletion of other FDB1 cluster genes, in particular AMD1 and DLH1, did not affect BOA degradation. Phylogenetic analyses and topology testing of the FDB1 and FDB2 cluster genes suggested two horizontal transfer events among fungi, one being transfer of FDB1 from Fusarium to Colletotrichum, and the second being transfer of the FDB2 cluster from Fusarium to Aspergillus. Together, the results suggest that plant-derived xenobiotics have exerted evolutionary pressure on these fungi, leading to horizontal transfer of genes that enhance fitness or virulence. PMID:26808652

  15. The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster

    PubMed Central

    Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.

    2013-01-01

    The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role

  16. Shared Gene Structures and Clusters of Mutually Exclusive Spliced Exons within the Metazoan Muscle Myosin Heavy Chain Genes

    PubMed Central

    Kollmar, Martin; Hatje, Klas

    2014-01-01

    Multicellular animals possess two to three different types of muscle tissues. Striated muscles have considerable ultrastructural similarity and contain a core set of proteins including the muscle myosin heavy chain (Mhc) protein. The ATPase activity of this myosin motor protein largely dictates muscle performance at the molecular level. Two different solutions to adjusting myosin properties to different muscle subtypes have been identified so far: Vertebrates and nematodes contain many independent differentially expressed Mhc genes while arthropods have single Mhc genes with clusters of mutually exclusive spliced exons (MXEs). The availability of hundreds of metazoan genomes now allowed us to study whether the ancient bilateria already contained MXEs, how MXE complexity subsequently evolved, and whether additional scenarios to control contractile properties in different muscles could be proposed. By reconstructing the Mhc genes from 116 metazoans we showed that all intron positions within the motor domain coding regions are conserved in all bilateria analysed. The last common ancestor of the bilateria already contained a cluster of MXEs coding for part of the loop-2 actin-binding sequence. Subsequently the protostomes and later the arthropods gained many further clusters while MXEs got completely lost independently in several branches (vertebrates and nematodes) and species (for example the annelid Helobdella robusta and the salmon louse Lepeophtheirus salmonis). Several bilateria have been found to encode multiple Mhc genes that might all or in part contain clusters of MXEs. Notable examples are a cluster of six tandemly arrayed Mhc genes, of which two contain MXEs, in the owl limpet Lottia gigantea and four Mhc genes with three encoding MXEs in the predatory mite Metaseiulus occidentalis. Our analysis showed that similar solutions to provide different myosin isoforms (multiple genes or clusters of MXEs or both) have independently been developed several times

  17. CASSIS and SMIPS: promoter-based prediction of secondary metabolite gene clusters in eukaryotic genomes

    PubMed Central

    Wolf, Thomas; Shelest, Vladimir; Nath, Neetika; Shelest, Ekaterina

    2016-01-01

    Motivation: Secondary metabolites (SM) are structurally diverse natural products of high pharmaceutical importance. Genes involved in their biosynthesis are often organized in clusters, i.e., are co-localized and co-expressed. In silico cluster prediction in eukaryotic genomes remains problematic mainly due to the high variability of the clusters’ content and lack of other distinguishing sequence features. Results: We present Cluster Assignment by Islands of Sites (CASSIS), a method for SM cluster prediction in eukaryotic genomes, and Secondary Metabolites by InterProScan (SMIPS), a tool for genome-wide detection of SM key enzymes (‘anchor’ genes): polyketide synthases, non-ribosomal peptide synthetases and dimethylallyl tryptophan synthases. Unlike other tools based on protein similarity, CASSIS exploits the idea of co-regulation of the cluster genes, which assumes the existence of common regulatory patterns in the cluster promoters. The method searches for ‘islands’ of enriched cluster-specific motifs in the vicinity of anchor genes. It was validated in a series of cross-validation experiments and showed high sensitivity and specificity. Availability and implementation: CASSIS and SMIPS are freely available at https://sbi.hki-jena.de/cassis. Contact: thomas.wolf@leibniz-hki.de or ekaterina.shelest@leibniz-hki.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26656005
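
    The idea of searching for an 'island' of motif-bearing promoters around an anchor gene can be illustrated with a minimal sketch. This is not the CASSIS implementation: the function name, the boolean per-promoter input, and the `max_gap` tolerance are all assumptions made for illustration, and the real method scores motif enrichment rather than simple presence or absence.

```python
def find_motif_island(has_motif, anchor, max_gap=1):
    """Return (start, end) gene indices of the island around the anchor gene.

    has_motif: list of bools, one per gene in chromosomal order; True if the
               gene's promoter contains the candidate motif (hypothetical input).
    anchor:    index of the anchor gene, e.g. a PKS or NRPS gene.
    max_gap:   number of consecutive motif-free promoters tolerated inside
               the island (hypothetical parameter).
    """
    def extend(step):
        # Walk away from the anchor, remembering the last promoter with a motif.
        pos, last_hit, gap = anchor, anchor, 0
        while 0 <= pos + step < len(has_motif):
            pos += step
            if has_motif[pos]:
                last_hit, gap = pos, 0
            else:
                gap += 1
                if gap > max_gap:
                    break  # too many motif-free promoters: the island ends
        return last_hit

    return extend(-1), extend(+1)


promoters = [False, False, True, True, False, True, True, False, False, True]
print(find_motif_island(promoters, anchor=5))             # (2, 6)
print(find_motif_island(promoters, anchor=5, max_gap=2))  # (2, 9)
```

    Raising `max_gap` lets the island bridge longer motif-free stretches, which in this toy example pulls in the gene at index 9.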

  18. Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

    PubMed Central

    2012-01-01

    Background Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. Results We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. Conclusions Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enables improved modelling and consequent clustering of time-course data. PMID:23151154
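
    A first-order autoregressive structure can be made concrete with a small sketch. Assuming equally spaced time points, the correlation between measurements at times i and j under a stationary AR(1) process is rho^|i-j|. The function below builds that correlation matrix; its name and parameters are illustrative, and the exact parameterization used in the EMMIX-WIRE extension is not given in the abstract.

```python
def ar1_correlation(n_timepoints, rho):
    """Correlation matrix of a stationary AR(1) process at equally spaced times.

    Entry (i, j) equals rho ** |i - j|, so correlation decays geometrically
    with the lag between two measurements.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[rho ** abs(i - j) for j in range(n_timepoints)]
            for i in range(n_timepoints)]


for row in ar1_correlation(3, 0.5):
    print(row)
# [1.0, 0.5, 0.25]
# [0.5, 1.0, 0.5]
# [0.25, 0.5, 1.0]
```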

  19. Parsing a multifunctional biosynthetic gene cluster from rice: Biochemical characterization of CYP71Z6 & 7.

    PubMed

    Wu, Yisheng; Hillwig, Matthew L; Wang, Qiang; Peters, Reuben J

    2011-11-01

    Rice (Oryza sativa) contains a biosynthetic gene cluster associated with production of at least two groups of diterpenoid phytoalexins, the antifungal phytocassanes and antibacterial oryzalides. While cytochromes P450 (CYP) from this cluster are known to be involved in phytocassane production, such mono-oxygenase activity relevant to oryzalide biosynthesis was unknown. Here we report biochemical characterization demonstrating that CYP71Z6 from this cluster acts as an ent-isokaurene C2-hydroxylase that is presumably involved in the biosynthesis of oryzalides. Our results further suggest that the closely related and co-clustered CYP71Z7 likely acts as a C2-hydroxylase involved in a later step of phytocassane biosynthesis. Thus, CYP71Z6 & 7 appear to have evolved distinct roles in rice diterpenoid metabolism, offering insight into plant biosynthetic gene cluster evolution. PMID:21985968

  20. Parsing a multifunctional biosynthetic gene cluster from rice: Biochemical characterization of CYP71Z6 & 7

    PubMed Central

    Wu, Yisheng; Hillwig, Matthew L.; Wang, Qiang; Peters, Reuben J.

    2011-01-01

    Rice (Oryza sativa) contains a biosynthetic gene cluster associated with production of at least two groups of diterpenoid phytoalexins, the antifungal phytocassanes and antibacterial oryzalides. While cytochromes P450 (CYP) from this cluster are known to be involved in phytocassane production, such mono-oxygenase activity relevant to oryzalide biosynthesis was unknown. Here we report biochemical characterization demonstrating that CYP71Z6 from this cluster acts as an ent-isokaurene C2-hydroxylase that is presumably involved in the biosynthesis of oryzalides. Our results further suggest that the closely related and co-clustered CYP71Z7 likely acts as a C2-hydroxylase involved in a later step of phytocassane biosynthesis. Thus, CYP71Z6 & 7 appear to have evolved distinct roles in rice diterpenoid metabolism, offering insight into plant biosynthetic gene cluster evolution. PMID:21985968

  1. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters

    PubMed Central

    Cimermancic, Peter; Medema, Marnix H.; Claesen, Jan; Kurita, Kenji; Wieland Brown, Laura C.; Mavrommatis, Konstantinos; Pati, Amrita; Godfrey, Paul A.; Koehrsen, Michael; Clardy, Jon; Birren, Bruce W.; Takano, Eriko; Sali, Andrej; Linington, Roger G.; Fischbach, Michael A.

    2014-01-01

    Summary Although biosynthetic gene clusters (BGCs) have been discovered for hundreds of bacterial metabolites, our knowledge of their diversity remains limited. Here, we used a novel algorithm to systematically identify BGCs in the extensive extant microbial sequencing data. Network analysis of the predicted BGCs revealed large gene cluster families, the vast majority uncharacterized. We experimentally characterized the most prominent family, consisting of two subfamilies of hundreds of BGCs distributed throughout the Proteobacteria; their products are aryl polyenes, lipids with an aryl head group conjugated to a polyene tail. We identified a distant relationship to a third subfamily of aryl polyene BGCs, and together the three subfamilies represent the largest known family of biosynthetic gene clusters, with more than 1,000 members. Although these clusters are widely divergent in sequence, their small molecule products are remarkably conserved, indicating for the first time the important roles these compounds play in Gram-negative cell biology. PMID:25036635

  2. Mass distributed clustering: a new algorithm for repeated measurements in gene expression data.

    PubMed

    Matsumoto, Shinya; Aisaki, Ken-ichi; Kanno, Jun

    2005-01-01

    The availability of whole-genome sequence data and high-throughput techniques such as DNA microarray enable researchers to monitor the alteration of gene expression by a certain organ or tissue in a comprehensive manner. The quantity of gene expression data can be greater than 30,000 genes per one measurement, making data clustering methods for analysis essential. Biologists usually design experimental protocols so that statistical significance can be evaluated; often, they conduct experiments in triplicate to generate a mean and standard deviation. Existing clustering methods usually use these mean or median values, rather than the original data, and take significance into account by omitting data showing large standard deviations, which eliminates potentially useful information. We propose a clustering method that uses each of the triplicate data sets as a probability distribution function instead of pooling data points into a median or mean. This method permits truly unsupervised clustering of the data from DNA microarrays. PMID:16901101
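
    The cost of collapsing replicates into a mean can be shown with a short sketch. It is not the authors' mass distributed clustering algorithm: it only contrasts a distance computed from means with one that treats each set of triplicates as an empirical distribution and averages the squared difference over all replicate pairs. The function names and the example values are made up for illustration.

```python
from itertools import product
from statistics import fmean


def mean_distance(a, b):
    """Squared difference between replicate means (replicate spread is discarded)."""
    return (fmean(a) - fmean(b)) ** 2


def replicate_distance(a, b):
    """Mean squared difference over every pairing of a replicate from a with one from b."""
    return fmean((x - y) ** 2 for x, y in product(a, b))


tight = [1.0, 1.0, 1.0]   # reproducible triplicate
noisy = [0.0, 1.0, 2.0]   # same mean, large spread

print(mean_distance(tight, noisy))       # 0.0: the two genes look identical
print(replicate_distance(tight, noisy))  # about 0.667: the spread is retained
```

    The pairwise form equals the squared difference of the means plus the two population variances, so noisy measurements are kept in the analysis and contribute their uncertainty instead of being filtered out.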

  3. Genomics-driven discovery of the pneumocandin biosynthetic gene cluster in the fungus Glarea lozoyensis

    PubMed Central

    2013-01-01

    Background The antifungal therapy caspofungin is a semi-synthetic derivative of pneumocandin B0, a lipohexapeptide produced by the fungus Glarea lozoyensis, and was the first member of the echinocandin class approved for human therapy. The nonribosomal peptide synthetase (NRPS)-polyketide synthases (PKS) gene cluster responsible for pneumocandin biosynthesis from G. lozoyensis has not been elucidated to date. In this study, we report the elucidation of the pneumocandin biosynthetic gene cluster by whole genome sequencing of the G. lozoyensis wild-type strain ATCC 20868. Results The pneumocandin biosynthetic gene cluster contains a NRPS (GLNRPS4) and a PKS (GLPKS4) arranged in tandem, two cytochrome P450 monooxygenases, seven other modifying enzymes, and genes for L-homotyrosine biosynthesis, a component of the peptide core. Thus, the pneumocandin biosynthetic gene cluster is significantly more autonomous and organized than that of the recently characterized echinocandin B gene cluster. Disruption mutants of GLNRPS4 and GLPKS4 no longer produced the pneumocandins (A0 and B0), and the Δglnrps4 and Δglpks4 mutants lost antifungal activity against the human pathogenic fungus Candida albicans. In addition to pneumocandins, the G. lozoyensis genome encodes a rich repertoire of natural product-encoding genes including 24 PKSs, six NRPSs, five PKS-NRPS hybrids, two dimethylallyl tryptophan synthases, and 14 terpene synthases. Conclusions Characterization of the gene cluster provides a blueprint for engineering new pneumocandin derivatives with improved pharmacological properties. Whole genome estimation of the secondary metabolite-encoding genes from G. lozoyensis provides yet another example of the huge potential for drug discovery from natural products from the fungal kingdom. PMID:23688303

  4. A novel gene cluster in Fusarium graminearum expressed under mycotoxin induction conditions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have identified a cluster of eight genes (gene loci fg08077 - fg08084) in Fusarium graminearum that is concomitantly up-regulated (Northern and qPCR analysis) under growth conditions that promote mycotoxin production. Proteomics experiments (iTRAQ analysis) have confirmed the up-regulation of pr...

  5. A T7 RNA polymerase-based toolkit for the concerted expression of clustered genes.

    PubMed

    Arvani, Solmaz; Markert, Annette; Loeschcke, Anita; Jaeger, Karl-Erich; Drepper, Thomas

    2012-06-15

    Bacterial genes whose enzymes are either assembled into complex multi-domain proteins or form biosynthetic pathways are frequently organized within large chromosomal clusters. The functional expression of clustered genes, however, remains challenging since it generally requires an expression system that facilitates the coordinated transcription of numerous genes irrespective of their natural promoters and terminators. Here, we report on the development of a novel expression system that is particularly suitable for the homologous expression of multiple genes organized in a contiguous cluster. The new expression toolkit consists of an Ω interposon cassette carrying a T7 RNA polymerase specific promoter which is designed for promoter tagging of clustered genes and a small set of broad-host-range plasmids providing the respective polymerase in different bacteria. The uptake hydrogenase gene locus of the photosynthetic non-sulfur purple bacterium Rhodobacter capsulatus which consists of 16 genes was used as an example to demonstrate functional expression only by T7 RNA polymerase but not by bacterial RNA polymerase. Our findings clearly indicate that due to its unique properties T7 RNA polymerase can be applied for overexpression of large and complex bacterial gene regions. PMID:22285639

  6. Variation in the fumonisin biosynthetic gene cluster in fumonisin-producing and nonproducing black aspergilli

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The ability to produce fumonisin mycotoxins varies among members of the black aspergilli. Previously, analyses of selected genes in the fumonisin biosynthetic gene (fum) cluster in black aspergilli from California grapes indicated that fumonisin-nonproducing isolates of Aspergillus welwitschiae lack...

  7. Degeneration of aflatoxin gene clusters in Aspergillus flavus from Africa and North America.

    PubMed

    Adhikari, Bishwo N; Bandyopadhyay, Ranajit; Cotty, Peter J

    2016-12-01

    Aspergillus flavus is the most common causal agent of aflatoxin contamination of food and feed. However, aflatoxin-producing potential varies widely among A. flavus genotypes with many producing no aflatoxins. Some non-aflatoxigenic genotypes are used as biocontrol agents to prevent contamination. Aflatoxin biosynthesis genes are tightly clustered in a highly conserved order. Gene deletions and presence of single nucleotide polymorphisms (SNPs) in aflatoxin biosynthesis genes are often associated with A. flavus inability to produce aflatoxins. In order to identify mechanisms of non-aflatoxigenicity in non-aflatoxigenic genotypes of value in aflatoxin biocontrol, complete cluster sequences of 35 A. flavus genotypes from Africa and North America were analyzed. Inability of some genotypes to produce aflatoxin resulted from deletion of biosynthesis genes. In other genotypes, non-aflatoxigenicity originated from SNP formation. The process of degeneration differed across the gene cluster; genes involved in early biosynthesis stages were more likely to be deleted while genes involved in later stages displayed high frequencies of SNPs. Comparative analyses of aflatoxin gene clusters provides insight into the diversity of mechanisms of non-aflatoxigenicity in A. flavus genotypes used as biological control agents. The sequences provide resources for both diagnosis of non-aflatoxigenicity and monitoring of biocontrol genotypes during biopesticide manufacture and in the environment. PMID:27576895

  8. Identification of the Herboxidiene Biosynthetic Gene Cluster in Streptomyces chromofuscus ATCC 49982

    PubMed Central

    Shao, Lei; Zi, Jiachen; Zeng, Jia

    2012-01-01

    The 53-kb biosynthetic gene cluster for the novel anticholesterol natural product herboxidiene was identified in Streptomyces chromofuscus ATCC 49982 by genome sequencing and gene inactivation. In addition to herboxidiene, a biosynthetic intermediate, 18-deoxy-herboxidiene, was also isolated from the fermentation broth of S. chromofuscus ATCC 49982 as a minor metabolite. PMID:22247174

  9. Cloning and identification of the lobophorin biosynthetic gene cluster from marine Streptomyces olivaceus strain FXJ7.023.

    PubMed

    Yue, Changwu; Niu, Jing; Liu, Ning; Lü, Yuhong; Liu, Minghao; Li, Yuanyuan

    2016-01-01

    An approximately 105-kb gene cluster containing 35 open reading frames involved in the biosynthesis of lobophorins was cloned and sequenced from a fosmid genomic library of Streptomyces olivaceus strain FXJ7.023. The cluster was identified by genome-wide annotation, analysis of secondary metabolite biosynthesis gene clusters with antiSMASH, and knockout of the region containing the loading module of the polyketide skeleton synthesis gene (the starter of lobS1). Comparative analysis suggested that the cluster encodes the complete set of genes for lobophorin polyketide assembly, modification, substrate catalysis, regulation, transport and resistance, and that it shares high identity with the most recently reported lobophorin biosynthetic gene cluster, from Streptomyces sp. SCSIO 01127, but with a significant gene rearrangement in the PKS modules. PMID:27005505

  10. Organization of a large gene cluster encoding ribosomal proteins in the cyanobacterium Synechococcus sp. strain PCC 6301: comparison of gene clusters among cyanobacteria, eubacteria and chloroplast genomes.

    PubMed

    Sugita, M; Sugishita, H; Fujishiro, T; Tsuboi, M; Sugita, C; Endo, T; Sugiura, M

    1997-08-11

    The structure of a large gene cluster containing 22 ribosomal protein (r-protein) genes of the cyanobacterium Synechococcus sp. strain PCC6301 is presented. Based on DNA and protein sequence analyses, genes encoding r-proteins L3, L4, L23, L2, S19, L22, S3, L16, L29, S17, L14, L24, L5, S8, L6, L18, S5, L15, L36, S13, S11, L17, SecY, adenylate kinase (AK) and the alpha subunit of RNA polymerase were identified. The gene order is similar to that of the E. coli S10, spc and alpha operons. Unlike the corresponding E. coli operons, the genes for r-proteins S4, S10, S14 and L30 are not present in this cluster. The organization of Synechococcus r-protein genes also resembles that of chloroplast (cp) r-protein genes of red and brown algal species. This strongly supports the endosymbiotic theory that the cp genome evolved from an ancient photosynthetic bacterium. PMID:9300823

  11. The Interleukin 1 Gene Cluster Contains a Major Susceptibility Locus for Ankylosing Spondylitis

    PubMed Central

    Timms, Andrew E.; Crane, Alison M.; Sims, Anne-Marie; Cordell, Heather J.; Bradbury, Linda A.; Abbott, Aaron; Coyne, Mark R. E.; Beynon, Owen; Herzberg, Ibi; Duff, Gordon W.; Calin, Andrei; Cardon, Lon R.; Wordsworth, B. Paul; Brown, Matthew A.

    2004-01-01

    Ankylosing spondylitis (AS) is a common and highly heritable inflammatory arthropathy. Although the gene HLA-B27 is almost essential for the inheritance of the condition, it alone is not sufficient to explain the pattern of familial recurrence of the disease. We have previously demonstrated suggestive linkage of AS to chromosome 2q13, a region containing the interleukin 1 (IL-1) family gene cluster, which includes several strong candidates for involvement in the disease. In the current study, we describe strong association and transmission of IL-1 family gene cluster single-nucleotide polymorphisms and haplotypes with AS. PMID:15309690

  12. A putative greigite-type magnetosome gene cluster from the candidate phylum Latescibacteria.

    PubMed

    Lin, Wei; Pan, Yongxin

    2015-04-01

    The intracellular biomineralization of magnetite and/or greigite magnetosomes in magnetotactic bacteria (MTB) is strictly controlled by a group of conserved genes, termed magnetosome genes, which are organized as clusters (or islands) in MTB genomes. So far, all reported MTB are affiliated within the Proteobacteria phylum, the Nitrospirae phylum and the candidate division OP3. Here, we report the discovery of a putative magnetosome gene cluster structure from the draft genome of an uncultivated bacterium belonging to the candidate phylum Latescibacteria (formerly candidate division WS3) recently recovered by Rinke and colleagues, which contains 10 genes with homology to magnetosome mam genes of magnetotactic Proteobacteria and Nitrospirae. Moreover, these genes are phylogenetically closely related to greigite-type magnetosome genes that were only found from the Deltaproteobacteria MTB before, suggesting that the greigite genes may originate earlier than previously imagined. These findings indicate that some members of Latescibacteria may be capable of forming greigite magnetosomes, and thus may play previously unrecognized roles in environmental iron and sulfur cycles. The conserved genomic structure of magnetosome gene cluster in Latescibacteria phylum supports the hypothesis of horizontal transfer of these genes among distantly related bacterial groups in nature. PMID:25382584

  13. Co-clustering phenome–genome for phenotype classification and disease gene discovery

    PubMed Central

    Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui

    2012-01-01

    Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways. PMID:22735708
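
    For orientation, a generic network-regularized non-negative matrix tri-factorization objective can be written as below. This is a sketch of the general technique, not the exact R-NMTF objective: the symbols, the graph-Laplacian form of the regularizers, and the weights alpha and beta are assumptions, and the supervision terms from known disease classes and pathways mentioned in the abstract are omitted. Here R is the phenotype–gene association matrix, F and G hold phenotype and gene cluster memberships, S holds the associations between phenotype clusters and gene clusters, and L_P and L_G are the Laplacians of the phenotype similarity network and the protein–protein interaction network.

```latex
\min_{F \ge 0,\; S \ge 0,\; G \ge 0}\;
  \left\lVert R - F S G^{\top} \right\rVert_F^{2}
  \;+\; \alpha \,\operatorname{tr}\!\left(F^{\top} L_P F\right)
  \;+\; \beta \,\operatorname{tr}\!\left(G^{\top} L_G G\right)
```

    The first term asks the three factors to reconstruct the observed associations; the two trace terms penalize assigning strongly connected phenotypes, or interacting proteins, to different clusters.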

  14. Human histone gene organization: Nonregular arrangement within a large cluster

    SciTech Connect

    Albig, W.; Meergans, K.; Doenecke, D.

    1997-03-01

    We have previously located the five human main type H1 genes and the gene encoding the testicular subtype H1t to the region 21.1 to 22.2 on the short arm of chromosome 6. To investigate the organization of the histone genes in this region, we isolated two YACs from a human YAC library by PCR screening with primers specific for histone H1.1. This screen revealed two YAC clones. YAC Y23 (corresponding to ICRFy901D1223) contains an insert of about 480 kb, whereas the smaller YAC 4A (corresponding to ICRFy900C104) spans about 340 kb and is completely covered by YAC Y23. We have subcloned the YAC inserts in cosmids, determined the linear orientation of the cosmids by cosmid walking, and constructed a restriction map of the entire region by mapping the individual cosmids using partial digests and hybridization with labeled oligonucleotides complementary to the cos site of the vector. Hybridization analysis, subcloning, restriction mapping, and sequencing revealed that most of the previously isolated phage and cosmid clones containing histone genes are part of this YAC including the clones containing the four human main type H1 histone genes H1.1 to H1.4, the H1t gene, and core histone genes. Thirty-five histone genes map within 260 kb of the YAC Y23 insert. All newly identified histone genes were sequenced, and the sequences were deposited with the EMBL nucleotide sequence database. The histone H1.5 gene is not part of this region, and we therefore conclude that the H1.5 gene and the associated core histone genes form a separate subcluster within this chromosomal region. 53 refs., 4 figs., 1 tab.

  15. Microbisporicin gene cluster reveals unusual features of lantibiotic biosynthesis in actinomycetes

    PubMed Central

    Foulston, Lucy C.; Bibb, Mervyn J.

    2010-01-01

    Lantibiotics are ribosomally synthesized, posttranslationally modified peptide antibiotics. The biosynthetic gene cluster for microbisporicin, a potent lantibiotic produced by the actinomycete Microbispora corallina containing chlorinated tryptophan and dihydroxyproline residues, was identified by genome scanning and isolated from an M. corallina cosmid library. Heterologous expression in Nonomuraea sp. ATCC 39727 confirmed that all of the genes required for microbisporicin biosynthesis were present in the cluster. Deletion, in M. corallina, of the gene (mibA) predicted to encode the prepropeptide abolished microbisporicin production. Further deletion analysis revealed insights into the biosynthesis of this unusual and potentially clinically useful lantibiotic, shedding light on mechanisms of regulation and self-resistance. In particular, we report an example of the involvement of a tryptophan halogenase in the modification of a ribosomally synthesized peptide and the pathway-specific regulation of an antibiotic biosynthetic gene cluster by an extracytoplasmic function σ factor–anti-σ factor complex. PMID:20628010

  16. Identification and manipulation of the pleuromutilin gene cluster from Clitopilus passeckerianus for increased rapid antibiotic production

    PubMed Central

    Bailey, Andy M.; Alberti, Fabrizio; Kilaru, Sreedhar; Collins, Catherine M.; de Mattos-Shipley, Kate; Hartley, Amanda J.; Hayes, Patrick; Griffin, Alison; Lazarus, Colin M.; Cox, Russell J.; Willis, Christine L.; O’Dwyer, Karen; Spence, David W.; Foster, Gary D.

    2016-01-01

    Semi-synthetic derivatives of the tricyclic diterpene antibiotic pleuromutilin from the basidiomycete Clitopilus passeckerianus are important in combatting bacterial infections in human and veterinary medicine. These compounds belong to the only new class of antibiotics for human applications, with novel mode of action and lack of cross-resistance, representing a class with great potential. Basidiomycete fungi, being dikaryotic, are not generally amenable to strain improvement. We report identification of the seven-gene pleuromutilin gene cluster and verify that using various targeted approaches aimed at increasing antibiotic production in C. passeckerianus, no improvement in yield was achieved. The seven-gene pleuromutilin cluster was reconstructed within Aspergillus oryzae giving production of pleuromutilin in an ascomycete, with a significant increase (2106%) in production. This is the first gene cluster from a basidiomycete to be successfully expressed in an ascomycete, and paves the way for the exploitation of a metabolically rich but traditionally overlooked group of fungi. PMID:27143514

  17. Identification and manipulation of the pleuromutilin gene cluster from Clitopilus passeckerianus for increased rapid antibiotic production

    NASA Astrophysics Data System (ADS)

    Bailey, Andy M.; Alberti, Fabrizio; Kilaru, Sreedhar; Collins, Catherine M.; de Mattos-Shipley, Kate; Hartley, Amanda J.; Hayes, Patrick; Griffin, Alison; Lazarus, Colin M.; Cox, Russell J.; Willis, Christine L.; O’Dwyer, Karen; Spence, David W.; Foster, Gary D.

    2016-05-01

    Semi-synthetic derivatives of the tricyclic diterpene antibiotic pleuromutilin from the basidiomycete Clitopilus passeckerianus are important in combatting bacterial infections in human and veterinary medicine. These compounds belong to the only new class of antibiotics for human applications, with novel mode of action and lack of cross-resistance, representing a class with great potential. Basidiomycete fungi, being dikaryotic, are not generally amenable to strain improvement. We report identification of the seven-gene pleuromutilin gene cluster and verify that using various targeted approaches aimed at increasing antibiotic production in C. passeckerianus, no improvement in yield was achieved. The seven-gene pleuromutilin cluster was reconstructed within Aspergillus oryzae giving production of pleuromutilin in an ascomycete, with a significant increase (2106%) in production. This is the first gene cluster from a basidiomycete to be successfully expressed in an ascomycete, and paves the way for the exploitation of a metabolically rich but traditionally overlooked group of fungi.

  18. Structure and gene cluster of the O-antigen of Escherichia coli O137.

    PubMed

    Perepelov, Andrei V; Guo, Xi; Senchenkova, Sof'ya N; Li, Yayue; Shashkov, Alexander S; Liu, Bin; Knirel, Yuriy A

    2016-03-01

    The O-polysaccharide (O-antigen) was isolated from the lipopolysaccharide of Escherichia coli O137 and studied by sugar analysis and NMR spectroscopy. The following structure of the branched tetrasaccharide repeating unit was established: [Formula: see text] Both structure and gene cluster of the E. coli O137 polysaccharide are related to those of the E. coli K40 polysaccharide (Amor et al., 1999), which lacks the side-chain glucosylation but contains serine that is amide-linked to GlcA. Functions of genes in the O137-antigen gene cluster were assigned by a comparison with those in K40 and sequences in the available databases. Particularly, predicted glycosyltransferases encoded in the gene cluster were assigned to the formation of three glycosidic linkages in the O-polysaccharide repeating unit. PMID:26845703

  19. Localization and mapping of CO2 fixation genes within two gene clusters in Rhodobacter sphaeroides

    SciTech Connect

    Gibson, J.L.; Tabita, F.R.

    1988-05-01

    Two fructose 1,6-bisphosphatase structural genes (fbpA and fbpB) have been identified within two unlinked gene clusters that were previously shown to contain the Rhodobacter sphaeroides sequences that code for form I and form II ribulose 1,5-bisphosphate carboxylase-oxygenase and phosphoribulokinase. The fbpA and fbpB genes were localized to a region immediately upstream from the corresponding prkA and prkB sequences and were found to be transcribed in the same direction as the phosphoribulokinase and ribulose 1,5-bisphosphate carboxylase-oxygenase genes based on inducible expression of fructose 1,6-bisphosphatase activity directed by the lac promoter. A recombinant plasmid was constructed that contained the tandem fbpA and prkA genes inserted downstream from the lac promoter in plasmid pUC18. Both gene products were expressed in Escherichia coli upon induction of transcription with isopropyl β-D-thiogalactoside, demonstrating that the two genes can be cotranscribed. A Zymomonas mobilis glyceraldehyde 3-phosphate-dehydrogenase gene (gap) hybridized to a DNA sequence located approximately 1 kilobase upstream from the form II ribulose 1,5-bisphosphate carboxylase-oxygenase gene. Although no corresponding gap sequence was found within the form I gene cluster, an additional region of homology was detected immediately upstream from the sequences that encode the form I and form II ribulose 1,5-bisphosphate carboxylase-oxygenases.

  20. Cloning and characterization of a Pseudomonas mendocina KR1 gene cluster encoding toluene-4-monooxygenase

    SciTech Connect

    Kwangmu Yen; Karl, M.R.; Blatt, L.M.; Simon, M.J.; Winter, R.B.; Fausset, P.R.; Lu, H.S.; Harcourt, A.A.; Chen, K.K.

    1991-09-01

    Pseudomonas mendocina KR1 metabolizes toluene as a carbon source by a previously unknown pathway. The initial step of the pathway is hydroxylation of toluene to form p-cresol by a multicomponent toluene-4-monooxygenase (T4MO) system. The authors have cloned and characterized a gene cluster from KR1 that determines the T4MO activity. To clone the T4MO genes, KR1 DNA libraries were constructed in Escherichia coli HB101 by using a broad-host-range vector and transferred to a KR1 mutant able to grow on p-cresol but not on toluene. An insert consisting of two SacI fragments of identical size was shown to complement the mutant for growth on toluene. One of the SacI fragments, when cloned into the E. coli vector pUC19, was found to direct the synthesis of indigo dye. The indigo-forming property was correlated with the presence of T4MO activity. The T4MO genes were mapped to a 3.6-kb region, and the direction of transcription was determined. DNA sequencing and N-terminal amino acid determination identified a five-gene cluster, tmoABCDE, within this region. Expression of this cluster carrying a single mutation in each gene demonstrated that each of the five genes is essential for T4MO activity. Other evidence presented indicated that none of the tmo genes was involved in the regulation of the tmo gene cluster, in the control of substrate transport of the T4MO system, or in major processing of the products of the tmo genes. It was tentatively concluded that the tmoABCDE genes encode structural polypeptides of the T4MO enzyme system. One of the tmo genes was tentatively identified as a ferredoxin gene.

  1. Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture

    PubMed Central

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; Cole, James R.; Hashsham, Syed A.; Looft, Torey; Zhu, Yong-Guan

    2016-01-01

    ABSTRACT   Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. PMID:27073098

  2. The Genetic and Molecular Organization of the Dopa Decarboxylase Gene Cluster of Drosophila Melanogaster

    PubMed Central

    Stathakis, D. G.; Pentz, E. S.; Freeman, M. E.; Kullman, J.; Hankins, G. R.; Pearlson, N. J.; Wright, TRF.

    1995-01-01

    We report the complete molecular organization of the Dopa decarboxylase gene cluster. Mutagenesis screens recovered 77 new Df(2L)TW130 recessive lethal mutations. These new alleles combined with 263 previously isolated mutations in the cluster to define 18 essential genes. In addition, seven new deficiencies were isolated and characterized. Deficiency mapping, restriction fragment length polymorphism (RFLP) analysis and P-element-mediated germline transformation experiments determined the gene order for all 18 loci. Genomic and cDNA restriction endonuclease mapping, Northern blot analysis and DNA sequencing provided information on exact gene location, mRNA size and transcriptional direction for most of these loci. In addition, this analysis identified two transcription units that had not previously been identified by extensive mutagenesis screening. Most of the loci are contained within two dense subclusters. We discuss the effectiveness of mutagens and strategies used in our screens, the variable mutability of loci within the genome of Drosophila melanogaster, the cytological and molecular organization of the Ddc gene cluster, the validity of the one band-one gene hypothesis and a possible purpose for the clustering of genes in the Ddc region. PMID:8647399

  3. Enteropathogenic Escherichia coli: identification of a gene cluster coding for bundle-forming pilus morphogenesis.

    PubMed Central

    Sohel, I; Puente, J L; Ramer, S W; Bieber, D; Wu, C Y; Schoolnik, G K

    1996-01-01

    Sequence flanking the bfpA locus on the enteroadherent factor plasmid of the enteropathogenic Escherichia coli (EPEC) strain B171-8 (O111:NM) was obtained to identify genes that might be required for bundle-forming pilus (BFP) biosynthesis. Deletion experiments led to the identification of a contiguous cluster of at least 12 open reading frames, including bfpA, that could direct the synthesis of a morphologically normal BFP filament. Within the bfp gene cluster, we identified open reading frames that share homology with other type IV pilus accessory genes and with genes required for transformation competence and protein secretion. Immediately upstream of the bfp gene cluster, we identified a potential replication origin including genes that are predicted to encode proteins homologous with replicase and resolvase. Restriction fragment length polymorphism analysis of DNA from six additional EPEC serotypes showed that the organization of the bfp gene cluster and its juxtaposition with a potential plasmid origin of replication are highly conserved features of the EPEC biotype. PMID:8626330

  4. An effective hybrid approach of gene selection and classification for microarray data based on clustering and particle swarm optimization.

    PubMed

    Han, Fei; Yang, Shanxiu; Guan, Jian

    2015-01-01

    In this paper, a hybrid approach based on clustering and Particle Swarm Optimisation (PSO) is proposed to perform gene selection and classification for microarray data. In the new method, firstly, genes are partitioned into a predetermined number of clusters by K-means method. Since the genes in each cluster have much redundancy, Max-Relevance Min-Redundancy (mRMR) strategy is used to reduce redundancy of the clustered genes. Then, PSO is used to perform further gene selection from the remaining clustered genes. Because of its better generalisation performance with much faster convergence rate than other learning algorithms for neural networks, Extreme Learning Machine (ELM) is chosen to evaluate candidate gene subsets selected by PSO and perform samples classification in this study. The proposed method selects less redundant genes as well as increases prediction accuracy and its efficiency and effectiveness are verified by extensive comparisons with other classical methods on three open microarray data. PMID:26547970
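
    The first two stages described in this abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: genes are partitioned with plain K-means, and a simple "most label-correlated gene per cluster" pick stands in for the mRMR step. The PSO search and the ELM classifier are omitted, and all function names and parameters here are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def kmeans(points, centroids, iterations=10):
    """Lloyd's K-means; `centroids` are caller-supplied starting centres.

    Returns the cluster index assigned to each point.
    """
    centroids = [list(c) for c in centroids]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each gene goes to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(len(centroids)), key=lambda k: math.dist(p, centroids[k])
            )
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def select_genes(expression, labels, initial_centroids):
    """Keep one gene per cluster: the one most correlated with the class labels.

    `expression` has one row per gene and one column per sample. This relevance
    pick is a simplified stand-in for mRMR, not mRMR itself.
    """
    assignment = kmeans(expression, initial_centroids)
    best = {}
    for gene, cluster in enumerate(assignment):
        score = abs(pearson(expression[gene], labels))
        if cluster not in best or score > best[cluster][0]:
            best[cluster] = (score, gene)
    return sorted(gene for _, gene in best.values())
```

    Because genes in the same cluster have similar expression profiles, keeping a single representative per cluster removes much of the redundancy before any further search over gene subsets.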

  5. Epigenetic characterization of the growth hormone gene identifies SmcHD1 as a regulator of autosomal gene clusters.

    PubMed

    Massah, Shabnam; Hollebakken, Robert; Labrecque, Mark P; Kolybaba, Addie M; Beischlag, Timothy V; Prefontaine, Gratien G

    2014-01-01

    Regulatory elements for the mouse growth hormone (GH) gene are located distally in a putative locus control region (LCR) in addition to key elements in the promoter proximal region. The role of promoter DNA methylation for GH gene regulation is not well understood. Pit-1 is a POU transcription factor required for normal pituitary development and obligatory for GH gene expression. In mammals, Pit-1 mutations eliminate GH production resulting in a dwarf phenotype. In this study, dwarf mice illustrated that Pit-1 function was obligatory for GH promoter hypomethylation. By monitoring promoter methylation levels during developmental GH expression we found that the GH promoter became hypomethylated coincident with gene expression. We identified a promoter differentially methylated region (DMR) that was used to characterize a methylation-dependent DNA binding activity. Upon DNA affinity purification using the DMR and nuclear extracts, we identified structural maintenance of chromosomes hinge domain containing -1 (SmcHD1). To better understand the role of SmcHD1 in genome-wide gene expression, we performed microarray analysis and compared changes in gene expression upon reduced levels of SmcHD1 in human cells. Knock-down of SmcHD1 in human embryonic kidney (HEK293) cells revealed a disproportionate number of up-regulated genes were located on the X-chromosome, but also suggested regulation of genes on non-sex chromosomes. Among those, we identified several genes located in the protocadherin β cluster. In addition, we found that imprinted genes in the H19/Igf2 cluster associated with Beckwith-Wiedemann and Silver-Russell syndromes (BWS & SRS) were dysregulated. For the first time using human cells, we showed that SmcHD1 is an important regulator of imprinted and clustered genes. PMID:24818964

  6. Epigenetic Characterization of the Growth Hormone Gene Identifies SmcHD1 as a Regulator of Autosomal Gene Clusters

    PubMed Central

    Massah, Shabnam; Hollebakken, Robert; Labrecque, Mark P.; Kolybaba, Addie M.; Beischlag, Timothy V.; Prefontaine, Gratien G.

    2014-01-01

    Regulatory elements for the mouse growth hormone (GH) gene are located distally in a putative locus control region (LCR) in addition to key elements in the promoter proximal region. The role of promoter DNA methylation for GH gene regulation is not well understood. Pit-1 is a POU transcription factor required for normal pituitary development and obligatory for GH gene expression. In mammals, Pit-1 mutations eliminate GH production resulting in a dwarf phenotype. In this study, dwarf mice illustrated that Pit-1 function was obligatory for GH promoter hypomethylation. By monitoring promoter methylation levels during developmental GH expression we found that the GH promoter became hypomethylated coincident with gene expression. We identified a promoter differentially methylated region (DMR) that was used to characterize a methylation-dependent DNA binding activity. Upon DNA affinity purification using the DMR and nuclear extracts, we identified structural maintenance of chromosomes hinge domain containing -1 (SmcHD1). To better understand the role of SmcHD1 in genome-wide gene expression, we performed microarray analysis and compared changes in gene expression upon reduced levels of SmcHD1 in human cells. Knock-down of SmcHD1 in human embryonic kidney (HEK293) cells revealed a disproportionate number of up-regulated genes were located on the X-chromosome, but also suggested regulation of genes on non-sex chromosomes. Among those, we identified several genes located in the protocadherin β cluster. In addition, we found that imprinted genes in the H19/Igf2 cluster associated with Beckwith-Wiedemann and Silver-Russell syndromes (BWS & SRS) were dysregulated. For the first time using human cells, we showed that SmcHD1 is an important regulator of imprinted and clustered genes. PMID:24818964

  7. Variation in the Trichothecene Mycotoxin Biosynthetic Gene Cluster in Fusarium

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Trichothecene mycotoxins are produced by some plant pathogenic species of the fungus Fusarium and can contribute to its virulence on some plants. In Fusarium graminearum and F. sporotrichioides trichothecene biosynthetic enzymes are encoded at three loci: the single-gene TRI101 locus; the two-gene ...

  8. Bacterial Biosynthetic Gene Clusters Encoding the Anti-cancer Haterumalide Class of Molecules

    PubMed Central

    Matilla, Miguel A.; Stöckmann, Henning; Leeper, Finian J.; Salmond, George P. C.

    2012-01-01

    Haterumalides are halogenated macrolides with strong antitumor properties, making them attractive targets for chemical synthesis. Unfortunately, current synthetic routes to these molecules are inefficient. The potent haterumalide, oocydin A, was previously identified from two plant-associated bacteria through its high bioactivity against plant pathogenic fungi and oomycetes. In this study, we describe oocydin A (ooc) biosynthetic gene clusters identified by genome sequencing, comparative genomics, and chemical analysis in four plant-associated enterobacteria of the Serratia and Dickeya genera. Disruption of the ooc gene cluster abolished oocydin A production and bioactivity against fungi and oomycetes. The ooc gene clusters span between 77 and 80 kb and encode five multimodular polyketide synthase (PKS) proteins, a hydroxymethylglutaryl-CoA synthase cassette and three flavin-dependent tailoring enzymes. The presence of two free-standing acyltransferase proteins classifies the oocydin A gene cluster within the growing family of trans-AT PKSs. The amino acid sequences and organization of the PKS domains are consistent with the chemical predictions and functional peculiarities associated with trans-acyltransferase PKS. Based on extensive in silico analysis of the gene cluster, we propose a biosynthetic model for the production of oocydin A and, by extension, for other members of the haterumalide family of halogenated macrolides exhibiting anti-cancer, anti-fungal, and other interesting biological properties. PMID:23012376

  9. Functional Gene Networks: R/Bioc package to generate and analyse gene networks derived from functional enrichment and clustering

    PubMed Central

    Aibar, Sara; Fontanillo, Celia; Droste, Conrad; De Las Rivas, Javier

    2015-01-01

    Summary: Functional Gene Networks (FGNet) is an R/Bioconductor package that generates gene networks derived from the results of functional enrichment analysis (FEA) and annotation clustering. The sets of genes enriched with specific biological terms (obtained from a FEA platform) are transformed into a network by establishing links between genes based on common functional annotations and common clusters. The network provides a new view of FEA results revealing gene modules with similar functions and genes that are related to multiple functions. In addition to building the functional network, FGNet analyses the similarity between the groups of genes and provides a distance heatmap and a bipartite network of functionally overlapping genes. The application includes an interface to directly perform FEA queries using different external tools: DAVID, GeneTerm Linker, TopGO or GAGE; and a graphical interface to facilitate the use. Availability and implementation: FGNet is available in Bioconductor, including a tutorial. URL: http://bioconductor.org/packages/release/bioc/html/FGNet.html Contact: jrivas@usal.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25600944
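
    The idea of linking genes through shared functional annotations can be illustrated with a short sketch. This is not the FGNet API (FGNet is an R/Bioconductor package); it is a hypothetical Python illustration of the general principle, and the function names, gene names and term names are all made up.

```python
from collections import defaultdict
from itertools import combinations


def build_functional_network(term_to_genes):
    """Link every pair of genes that share an enriched term.

    `term_to_genes` maps a term name to the set of genes annotated with it.
    Returns {(gene_a, gene_b): [shared terms]} with gene_a < gene_b.
    """
    edges = defaultdict(list)
    for term in sorted(term_to_genes):
        for a, b in combinations(sorted(term_to_genes[term]), 2):
            edges[(a, b)].append(term)
    return dict(edges)


def multi_function_genes(term_to_genes):
    """Genes annotated with more than one term, i.e. related to multiple functions."""
    counts = defaultdict(int)
    for genes in term_to_genes.values():
        for gene in set(genes):
            counts[gene] += 1
    return sorted(gene for gene, n in counts.items() if n > 1)
```

    With a toy input such as {"term1": {"geneA", "geneB"}, "term2": {"geneA", "geneC"}}, geneA is linked to both geneB and geneC and is the only gene reported as multi-functional.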

  10. The Eucalyptus grandis NBS-LRR Gene Family: Physical Clustering and Expression Hotspots

    PubMed Central

    Christie, Nanette; Tobias, Peri A.; Naidoo, Sanushka; Külheim, Carsten

    2016-01-01

    Eucalyptus grandis is a commercially important hardwood species and is known to be susceptible to a number of pests and pathogens. Determining mechanisms of defense is therefore a research priority. The published genome for E. grandis has aided the identification of one important class of resistance (R) genes that incorporate nucleotide binding sites and leucine-rich repeat domains (NBS-LRR). Using an iterative search process we identified NBS-LRR gene models within the E. grandis genome. We characterized the gene models and identified their genomic arrangement. The gene expression patterns were examined in E. grandis clones, challenged with a fungal pathogen (Chrysoporthe austroafricana) and insect pest (Leptocybe invasa). One thousand two hundred and fifteen putative NBS-LRR coding sequences were located which aligned into two large classes, Toll or interleukin-1 receptor (TIR) and coiled-coil (CC) based on NB-ARC domains. NBS-LRR gene-rich regions were identified with 76% organized in clusters of three or more genes. A further 272 putative incomplete resistance genes were also identified. We determined that E. grandis has a higher ratio of TIR to CC classed genes compared to other woody plant species as well as a smaller percentage of single NBS-LRR genes. Transcriptome profiles indicated expression hotspots, within physical clusters, including expression of many incomplete genes. The clustering of putative NBS-LRR genes correlates with differential expression responses in resistant and susceptible plants indicating functional relevance for the physical arrangement of this gene family. This analysis of the repertoire and expression of E. grandis putative NBS-LRR genes provides an important resource for the identification of novel and functional R-genes; a key objective for strategies to enhance resilience. PMID:26793216

  11. vanI: a novel D-Ala-D-Lac vancomycin resistance gene cluster found in Desulfitobacterium hafniense.

    PubMed

    Kruse, Thomas; Levisson, Mark; de Vos, Willem M; Smidt, Hauke

    2014-09-01

    The glycopeptide vancomycin was until recently considered a drug of last resort against Gram-positive bacteria. Increasing numbers of bacteria, however, are found to carry genes that confer resistance to this antibiotic. So far, 10 different vancomycin resistance clusters have been described. A chromosomal vancomycin resistance gene cluster was previously described for the anaerobic Desulfitobacterium hafniense Y51. We demonstrate that this gene cluster, characterized by its d-Ala-d-Lac ligase-encoding vanI gene, is present in all strains of D. hafniense, D. chlororespirans and some strains of Desulfosporosinus spp. This gene cluster was not found in vancomycin-sensitive Desulfitobacterium or Desulfosporosinus spp., and we show that this antibiotic resistance can be exploited as an intrinsic selection marker for Desulfitobacterium hafniense and D. chlororespirans. The gene cluster containing vanI is phylogenetically only distantly related with those described from soil and gut bacteria, but clusters instead with vancomycin resistance genes found within the phylum Actinobacteria that include several vancomycin-producing bacteria. It lacks a vanH homologue, encoding a D-lactate dehydrogenase, previously thought to always be present within vancomycin resistance gene clusters. The location of vanH outside the resistance gene cluster likely hinders horizontal gene transfer. Hence, the vancomycin resistance cluster in D. hafniense should be regarded a novel one that we here designated vanI after its unique d-Ala-d-Lac ligase. PMID:25042042

  12. Regulation of Three Nitrogenase Gene Clusters in the Cyanobacterium Anabaena variabilis ATCC 29413

    PubMed Central

    Thiel, Teresa; Pratte, Brenda S.

    2014-01-01

    The filamentous cyanobacterium Anabaena variabilis ATCC 29413 fixes nitrogen under aerobic conditions in specialized cells called heterocysts that form in response to an environmental deficiency in combined nitrogen. Nitrogen fixation is mediated by the enzyme nitrogenase, which is very sensitive to oxygen. Heterocysts are microxic cells that allow nitrogenase to function in a filament comprised primarily of vegetative cells that produce oxygen by photosynthesis. A. variabilis is unique among well-characterized cyanobacteria in that it has three nitrogenase gene clusters that encode different nitrogenases, which function under different environmental conditions. The nif1 genes encode a Mo-nitrogenase that functions only in heterocysts, even in filaments grown anaerobically. The nif2 genes encode a different Mo-nitrogenase that functions in vegetative cells, but only in filaments grown under anoxic conditions. An alternative V-nitrogenase is encoded by vnf genes that are expressed only in heterocysts in an environment that is deficient in Mo. Thus, these three nitrogenases are expressed differentially in response to environmental conditions. The entire nif1 gene cluster, comprising at least 15 genes, is primarily under the control of the promoter for the first gene, nifB1. Transcriptional control of many of the downstream nif1 genes occurs by a combination of weak promoters within the coding regions of some downstream genes and by RNA processing, which is associated with increased transcript stability. The vnf genes show a similar pattern of transcriptional and post-transcriptional control of expression suggesting that the complex pattern of regulation of the nif1 cluster is conserved in other cyanobacterial nitrogenase gene clusters. PMID:25513762

  13. Sequencing and mapping hemoglobin gene clusters in the Australian model dasyurid marsupial Sminthopsis macroura

    SciTech Connect

    De Leo, A.A.; Wheeler, D.; Lefevre, C.; Cheng, Jan-Fang; Hope, R.; Kuliwaba, J.; Nicholas, K.R.; Westerman, M.; Graves, J.A.M.

    2004-07-26

    Comparing globin genes and their flanking sequences across many species has allowed globin gene evolution to be reconstructed in great detail. Marsupial globin sequences have proved to be of exceptional significance. A previous finding of a beta-like omega gene in the alpha cluster in the tammar wallaby suggested that the alpha and beta cluster evolved via genome duplication and loss rather than tandem duplication. To confirm and extend this important finding we isolated and sequenced BACs containing the alpha and beta loci from the distantly related Australian marsupial Sminthopsis macroura. We report that the alpha gene lies in the same BAC as the beta-like omega gene, implying that the alpha-omega juxtaposition is likely to be conserved in all marsupials. The LUC7L gene was found 3' of the S. macroura alpha locus, a gene order shared with humans but not mouse, chicken or fugu. Sequencing a BAC contig that contained the S. macroura beta globin and epsilon globin loci showed that the globin cluster is flanked by olfactory genes, demonstrating a gene arrangement conserved for over 180 MY. Analysis of the region 5' to the S. macroura epsilon globin gene revealed a region similar to the eutherian LCR, containing sequences and potential transcription factor binding sites with homology to eutherian hypersensitive sites 1 to 5. FISH mapping of BACs containing S. macroura alpha and beta globin genes located the beta globin cluster on chromosome 3q and the alpha locus close to the centromere on 1q, resolving contradictory map locations obtained by previous radioactive in situ hybridization.

  14. Regulation of Three Nitrogenase Gene Clusters in the Cyanobacterium Anabaena variabilis ATCC 29413.

    PubMed

    Thiel, Teresa; Pratte, Brenda S

    2014-01-01

    The filamentous cyanobacterium Anabaena variabilis ATCC 29413 fixes nitrogen under aerobic conditions in specialized cells called heterocysts that form in response to an environmental deficiency in combined nitrogen. Nitrogen fixation is mediated by the enzyme nitrogenase, which is very sensitive to oxygen. Heterocysts are microxic cells that allow nitrogenase to function in a filament comprised primarily of vegetative cells that produce oxygen by photosynthesis. A. variabilis is unique among well-characterized cyanobacteria in that it has three nitrogenase gene clusters that encode different nitrogenases, which function under different environmental conditions. The nif1 genes encode a Mo-nitrogenase that functions only in heterocysts, even in filaments grown anaerobically. The nif2 genes encode a different Mo-nitrogenase that functions in vegetative cells, but only in filaments grown under anoxic conditions. An alternative V-nitrogenase is encoded by vnf genes that are expressed only in heterocysts in an environment that is deficient in Mo. Thus, these three nitrogenases are expressed differentially in response to environmental conditions. The entire nif1 gene cluster, comprising at least 15 genes, is primarily under the control of the promoter for the first gene, nifB1. Transcriptional control of many of the downstream nif1 genes occurs by a combination of weak promoters within the coding regions of some downstream genes and by RNA processing, which is associated with increased transcript stability. The vnf genes show a similar pattern of transcriptional and post-transcriptional control of expression suggesting that the complex pattern of regulation of the nif1 cluster is conserved in other cyanobacterial nitrogenase gene clusters. PMID:25513762

  15. Breaking the Silence: Protein Stabilization Uncovers Silenced Biosynthetic Gene Clusters in the Fungus Aspergillus nidulans

    PubMed Central

    Gerke, Jennifer; Bayram, Özgür; Feussner, Kirstin; Landesfeind, Manuel; Shelest, Ekaterina; Feussner, Ivo

    2012-01-01

    The genomes of filamentous fungi comprise numerous putative gene clusters coding for the biosynthesis of chemically and structurally diverse secondary metabolites (SMs), which are rarely expressed under laboratory conditions. Previous approaches to activate these genes were based primarily on artificially targeting the cellular protein synthesis apparatus. Here, we applied an alternative approach of genetically impairing the protein degradation apparatus of the model fungus Aspergillus nidulans by deleting the conserved eukaryotic csnE/CSN5 deneddylase subunit of the COP9 signalosome. This defect in protein degradation results in the activation of a previously silenced gene cluster comprising a polyketide synthase gene producing the antibiotic 2,4-dihydroxy-3-methyl-6-(2-oxopropyl)benzaldehyde (DHMBA). The csnE/CSN5 gene is highly conserved in fungi, and therefore, the deletion is a feasible approach for the identification of new SMs. PMID:23001671

  16. Cloning and characterization of the goadsporin biosynthetic gene cluster from Streptomyces sp. TP-A0584.

    PubMed

    Onaka, Hiroyasu; Nakaho, Mizuho; Hayashi, Keiko; Igarashi, Yasuhiro; Furumai, Tamotsu

    2005-12-01

    The biosynthetic gene cluster of goadsporin, a polypeptide antibiotic containing thiazole and oxazole rings, was cloned from Streptomyces sp. TP-A0584. The cluster contains a structural gene, godA, and nine god (goadsporin) genes involved in post-translational modification, immunity and transcriptional regulation. Although the gene organization is similar to typical bacteriocin biosynthetic gene clusters, each goadsporin biosynthetic gene shows low homology to these genes. Goadsporin biosynthesis is initiated by the translation of godA, and the subsequent cyclization, dehydration and acetylation are probably catalysed by godD, godE, godF, godG and godH gene products. godI shows high similarity to the 54 kDa subunit of the signal recognition particle and plays an important role in goadsporin immunity. Furthermore, four goadsporin analogues were produced by site-directed mutagenesis of godA, suggesting that this biosynthesis machinery is used for the heterocyclization of peptides. PMID:16339937

  17. A remarkably stable TipE gene cluster: evolution of insect Para sodium channel auxiliary subunits

    PubMed Central

    2011-01-01

    Background First identified in fruit flies with temperature-sensitive paralysis phenotypes, the Drosophila melanogaster TipE locus encodes four voltage-gated sodium (NaV) channel auxiliary subunits. This cluster of TipE-like genes on chromosome 3L, and a fifth family member on chromosome 3R, are important for the optimal expression and functionality of the Para NaV channel but appear quite distinct from auxiliary subunits in vertebrates. Here, we exploited available arthropod genomic resources to trace the origin of TipE-like genes by mapping their evolutionary histories and examining their genomic architectures. Results We identified a remarkably conserved synteny block of TipE-like orthologues with well-maintained local gene arrangements from 21 insect species. Homologues in the water flea, Daphnia pulex, suggest an ancestral pancrustacean repertoire of four TipE-like genes; a subsequent gene duplication may have generated functional redundancy allowing gene losses in the silk moth and mosquitoes. Intronic nesting of the insect TipE gene cluster probably occurred following the divergence from crustaceans, but in the flour beetle and silk moth genomes the clusters apparently escaped from nesting. Across Pancrustacea, TipE gene family members have experienced intronic nesting, escape from nesting, retrotransposition, translocation, and gene loss events while generally maintaining their local gene neighbourhoods. D. melanogaster TipE-like genes exhibit coordinated spatial and temporal regulation of expression distinct from their host gene but well-correlated with their regulatory target, the Para NaV channel, suggesting that functional constraints may preserve the TipE gene cluster. We identified homology between TipE-like NaV channel regulators and vertebrate Slo-beta auxiliary subunits of big-conductance calcium-activated potassium (BKCa) channels, which suggests that ion channel regulatory partners have evolved distinct lineage-specific characteristics

  18. Steroid degradation gene cluster of Comamonas testosteroni consisting of 18 putative genes from meta-cleavage enzyme gene tesB to regulator gene tesR.

    PubMed

    Horinouchi, Masae; Kurita, Tomokazu; Yamamoto, Takako; Hatori, Emi; Hayashi, Toshiaki; Kudo, Toshiaki

    2004-11-12

    Steroid degradation genes of Comamonas testosteroni TA441 are encoded in at least two gene clusters: one containing the meta-cleavage enzyme gene tesB and ORF1, 2, 3; and another consisting of ORF18, 17, tesI, H, A2, and tesA1, D, E, F, G (tesA2 to ORF18 and tesA1 to tesG are encoded in opposite directions). Analysis of transposon mutants with low steroid degradation revealed 13 ORFs and a gene (ORF4, 5, 21, 22, 23, 25, 26, 27, 28, 30, 31, 32, 33, and tesR) involved in steroid degradation in the downstream region of ORF3. TesR, which is almost identical to TeiR, a positive regulator of Delta1-dehydrogenase (corresponding to TesH in TA441) and 3alpha-dehydrogenase (currently not identified in TA441) in C. testosteroni ATCC11996 (Pruneda-Paz, 2004), was shown to be necessary for induction of the steroid degradation gene clusters identified in TA441, tesB to tesR, tesA1 to tesG, and tesA2 to ORF18. At least some of the ORFs from ORF3 to ORF33 were suggested to be involved in 9,17-dioxo-1,2,3,4,10,19-hexanorandrostan-5-oic acid degradation. PMID:15474469

  19. Isolation and characterization of meridamycin biosynthetic gene cluster from Streptomyces sp. NRRL 30748.

    PubMed

    He, Min; Haltli, Bradley; Summers, Mia; Feng, Xidong; Hucul, John

    2006-08-01

    Meridamycin is a non-immunosuppressive, FKBP12-binding natural macrolide with potential therapeutic applications in a variety of medical conditions. To set the stage for structural modification of meridamycin by genetic engineering, we have cloned and completely sequenced approximately 117 kb of DNA encompassing the meridamycin biosynthetic gene cluster from the producing strain, Streptomyces sp. NRRL 30748. Clustered in the center of the cloned DNA stretch are six genes responsible for the construction of the core structure of meridamycin, including merP encoding a non-ribosomal peptide synthase for pipecolate-incorporation, four PKS genes (merA-D) together encoding 1 loading module and 14 extension modules, and merE encoding a cytochrome P450 monooxygenase. A number of genes with potential pathway-specific regulatory or resistance functions have also been identified. The absence of the gene encoding lysine cyclodeaminase in the sequenced gene cluster and the rest of the genome of NRRL 30748 indicated the synthesis of pipecolate in this strain is not through the common lysine cyclodeamination route previously described for rapamycin and FK506/FK520 biosynthesis. An efficient conjugation method has been developed for Streptomyces sp. NRRL 30748 to facilitate the genetic manipulation of meridamycin biosynthetic gene cluster. Disruption of merP resulted in the complete abolition of meridamycin production, proving the identity of the gene cluster. A novel meridamycin analogue, C36-keto-meridamycin, has been successfully generated through deletion of a DNA fragment encoding KR1 domain of MerA from the chromosomal DNA. PMID:16806745

  20. Physical and genetic map of the major nif gene cluster from Azotobacter vinelandii.

    PubMed

    Jacobson, M R; Brigle, K E; Bennett, L T; Setterquist, R A; Wilson, M S; Cash, V L; Beynon, J; Newton, W E; Dean, D R

    1989-02-01

    Determination of a 28,793-base-pair DNA sequence of a region from the Azotobacter vinelandii genome that includes and flanks the nitrogenase structural gene region was completed. This information was used to revise the previously proposed organization of the major nif cluster. The major nif cluster from A. vinelandii encodes 15 nif-specific genes whose products bear significant structural identity to the corresponding nif-specific gene products from Klebsiella pneumoniae. These genes include nifH, nifD, nifK, nifT, nifY, nifE, nifN, nifX, nifU, nifS, nifV, nifW, nifZ, nifM, and nifF. Although there are significant spatial differences, the identified A. vinelandii nif-specific genes have the same sequential arrangement as the corresponding nif-specific genes from K. pneumoniae. Twelve other potential genes whose expression could be subject to nif-specific regulation were also found interspersed among the identified nif-specific genes. These potential genes do not encode products that are structurally related to the identified nif-specific gene products. Eleven potential nif-specific promoters were identified within the major nif cluster, and nine of these are preceded by an appropriate upstream activator sequence. A + T-rich regions were identified between 8 of the 11 proposed nif promoter sequences and their upstream activator sequences. Site-directed deletion-and-insertion mutagenesis was used to establish a genetic map of the major nif cluster. PMID:2644218

  1. Physical and genetic map of the major nif gene cluster from Azotobacter vinelandii.

    PubMed Central

    Jacobson, M R; Brigle, K E; Bennett, L T; Setterquist, R A; Wilson, M S; Cash, V L; Beynon, J; Newton, W E; Dean, D R

    1989-01-01

    Determination of a 28,793-base-pair DNA sequence of a region from the Azotobacter vinelandii genome that includes and flanks the nitrogenase structural gene region was completed. This information was used to revise the previously proposed organization of the major nif cluster. The major nif cluster from A. vinelandii encodes 15 nif-specific genes whose products bear significant structural identity to the corresponding nif-specific gene products from Klebsiella pneumoniae. These genes include nifH, nifD, nifK, nifT, nifY, nifE, nifN, nifX, nifU, nifS, nifV, nifW, nifZ, nifM, and nifF. Although there are significant spatial differences, the identified A. vinelandii nif-specific genes have the same sequential arrangement as the corresponding nif-specific genes from K. pneumoniae. Twelve other potential genes whose expression could be subject to nif-specific regulation were also found interspersed among the identified nif-specific genes. These potential genes do not encode products that are structurally related to the identified nif-specific gene products. Eleven potential nif-specific promoters were identified within the major nif cluster, and nine of these are preceded by an appropriate upstream activator sequence. A + T-rich regions were identified between 8 of the 11 proposed nif promoter sequences and their upstream activator sequences. Site-directed deletion-and-insertion mutagenesis was used to establish a genetic map of the major nif cluster. PMID:2644218

  2. K-Boost: a scalable algorithm for high-quality clustering of microarray gene expression data.

    PubMed

    Geraci, Filippo; Leoncini, Mauro; Montangero, Manuela; Pellegrini, Marco; Renda, M Elena

    2009-06-01

    Microarray technology for profiling gene expression levels is a popular tool in modern biological research. Applications range from tissue classification to the detection of metabolic networks, from drug discovery to time-critical personalized medicine. Given the increase in size and complexity of the data sets produced, their analysis is becoming problematic in terms of time/quality trade-offs. Clustering genes with similar expression profiles is a key initial step for subsequent manipulations, and the increasing volumes of data to be analyzed require methods that are at the same time efficient (completing an analysis in minutes rather than hours) and effective (identifying significant clusters with high biological correlations). In this paper, we propose K-Boost, a clustering algorithm based on a combination of the furthest-point-first (FPF) heuristic for solving the metric k-center problem, a stability-based method for determining the number of clusters, and a k-means-like cluster refinement. K-Boost runs in O(|N| x k) time, where N is the input matrix and k is the number of proposed clusters. Experiments show that this low complexity is usually coupled with a very good quality of the computed clusterings, which we measure using both internal and external criteria. Supporting data can be found as online Supplementary Material at www.liebertonline.com. PMID:19522668
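
    The furthest-point-first heuristic named in this abstract can be illustrated with a minimal sketch. This is not the K-Boost implementation: it shows only the generic FPF (Gonzalez) step for the metric k-center problem, and it assumes Euclidean distance, an arbitrary first center (index 0), and a hypothetical function name `furthest_point_first`. The stability-based choice of k and the k-means-like refinement are not shown.

```python
import math

def furthest_point_first(points, k):
    """Choose k centers by the generic furthest-point-first heuristic.

    Illustrative sketch only; Euclidean distance and the choice of
    index 0 as the first center are assumptions, not details from the paper.
    Returns (center indices, cluster index assigned to each point).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [0]
    # Distance from every point to its nearest center chosen so far.
    nearest = [math.dist(p, points[0]) for p in points]
    assignment = [0] * len(points)
    while len(centers) < k:
        # The next center is the point furthest from all current centers.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < nearest[i]:
                nearest[i] = d
                assignment[i] = len(centers) - 1
    return centers, assignment

profiles = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
centers, assignment = furthest_point_first(profiles, 3)
```

    Each of the k rounds makes one pass over the points, so the number of distance evaluations grows with the number of points times k.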

  3. Comprehensive annotation of secondary metabolite biosynthetic genes and gene clusters of Aspergillus nidulans, A. fumigatus, A. niger and A. oryzae

    PubMed Central

    2013-01-01

    Background Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research. Results We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation. Conclusions This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites. PMID:23617571

  4. Delineation of metabolic gene clusters in plant genomes by chromatin signatures

    PubMed Central

    Yu, Nan; Nützmann, Hans-Wilhelm; MacDonald, James T.; Moore, Ben; Field, Ben; Berriri, Souha; Trick, Martin; Rosser, Susan J.; Kumar, S. Vinod; Freemont, Paul S.; Osbourn, Anne

    2016-01-01

    Plants are a tremendous source of diverse chemicals, including many natural product-derived drugs. It has recently become apparent that the genes for the biosynthesis of numerous different types of plant natural products are organized as metabolic gene clusters, thereby unveiling a highly unusual form of plant genome architecture and offering novel avenues for discovery and exploitation of plant specialized metabolism. Here we show that these clustered pathways are characterized by distinct chromatin signatures of histone 3 lysine trimethylation (H3K27me3) and histone 2 variant H2A.Z, associated with cluster repression and activation, respectively, and represent discrete windows of co-regulation in the genome. We further demonstrate that knowledge of these chromatin signatures along with chromatin mutants can be used to mine genomes for cluster discovery. The roles of H3K27me3 and H2A.Z in repression and activation of single genes in plants are well known. However, our discovery of highly localized operon-like co-regulated regions of chromatin modification is unprecedented in plants. Our findings raise intriguing parallels with groups of physically linked multi-gene complexes in animals and with clustered pathways for specialized metabolism in filamentous fungi. PMID:26895889

  5. Delineation of metabolic gene clusters in plant genomes by chromatin signatures.

    PubMed

    Yu, Nan; Nützmann, Hans-Wilhelm; MacDonald, James T; Moore, Ben; Field, Ben; Berriri, Souha; Trick, Martin; Rosser, Susan J; Kumar, S Vinod; Freemont, Paul S; Osbourn, Anne

    2016-03-18

    Plants are a tremendous source of diverse chemicals, including many natural product-derived drugs. It has recently become apparent that the genes for the biosynthesis of numerous different types of plant natural products are organized as metabolic gene clusters, thereby unveiling a highly unusual form of plant genome architecture and offering novel avenues for discovery and exploitation of plant specialized metabolism. Here we show that these clustered pathways are characterized by distinct chromatin signatures of histone 3 lysine trimethylation (H3K27me3) and histone 2 variant H2A.Z, associated with cluster repression and activation, respectively, and represent discrete windows of co-regulation in the genome. We further demonstrate that knowledge of these chromatin signatures along with chromatin mutants can be used to mine genomes for cluster discovery. The roles of H3K27me3 and H2A.Z in repression and activation of single genes in plants are well known. However, our discovery of highly localized operon-like co-regulated regions of chromatin modification is unprecedented in plants. Our findings raise intriguing parallels with groups of physically linked multi-gene complexes in animals and with clustered pathways for specialized metabolism in filamentous fungi. PMID:26895889

  6. Mapping of the {alpha}{sub 4} subunit gene (GABRA4) to human chromosome 4 defines an {alpha}{sub 2}-{alpha}{sub 4}-{beta}{sub 1}-{gamma}{sub 1} gene cluster: Further evidence that modern GABA{sub A} receptor gene clusters are derived from an ancestral cluster

    SciTech Connect

    McLean, P.J.; Farb, D.H.; Russek, S.J.

    1995-04-10

    We demonstrated previously that an {alpha}{sub 1}-{beta}{sub 2}-{gamma}{sub 2} gene cluster of the {gamma}-aminobutyric acid (GABA{sub A}) receptor is located on human chromosome 5q34-q35 and that an ancestral {alpha}-{beta}-{gamma} gene cluster probably spawned clusters on chromosomes 4, 5, and 15. Here, we report that the {alpha}{sub 4} gene (GABRA4) maps to human chromosome 4p14-q12, defining a cluster comprising the {alpha}{sub 2}, {alpha}{sub 4}, {beta}{sub 1}, and {gamma}{sub 1} genes. The existence of an {alpha}{sub 2}-{alpha}{sub 4}-{beta}{sub 1}-{gamma}{sub 1} cluster on chromosome 4 and an {alpha}{sub 1}-{alpha}{sub 6}-{beta}{sub 2}-{gamma}{sub 2} cluster on chromosome 5 provides further evidence that the number of ancestral GABA{sub A} receptor subunit genes has been expanded by duplication within an ancestral gene cluster. Moreover, if duplication of the {alpha} gene occurred before duplication of the ancestral gene cluster, then a heretofore undiscovered subtype of {alpha} subunit should be located on human chromosome 15q11-q13 within an {alpha}{sub 5}-{alpha}{sub x}-{beta}{sub 3}-{gamma}{sub 3} gene cluster at the locus for Angelman and Prader-Willi syndromes. 34 refs., 6 figs., 1 tab.

  7. DNase I hypersensitive sites within the inducible qa gene cluster of Neurospora crassa.

    PubMed Central

    Baum, J A; Giles, N H

    1986-01-01

    DNase I hypersensitive regions were mapped within the 17.3-kilobase qa (quinic acid) gene cluster of Neurospora crassa. The 5'-flanking regions of the five qa structural genes and the two qa regulatory genes each contain DNase I hypersensitive sites under noninducing conditions and generally exhibit increases in DNase I cleavage upon induction of transcription with quinic acid. The two large intergenic regions of the qa gene cluster appear to be similarly organized with respect to the positions of constitutive and inducible DNase I hypersensitive sites. Inducible hypersensitive sites on the 5' side of one qa gene, qa-x, appear to be differentially regulated. Employing these and previously published data, we have identified a conserved sequence element that may mediate the activator function of the qa-1F regulatory gene. Variants of the 16-base-pair consensus sequence are consistently found within DNase I-protected regions adjacent to inducible DNase I hypersensitive sites within the gene cluster. PMID:2944110

  8. Organization of the human keratin type II gene cluster at 12q13

    SciTech Connect

    Yoon, S.J.; LeBlanc-Straceski, J.; Krauter, K.

    1994-12-01

    Keratin proteins constitute intermediate filaments and are the major differentiation products of mammalian epithelial cells. The epithelial keratins are classified into two groups, type I and type II, and one member of each group is expressed in a given epithelial cell differentiation stage. Mutations in type I and type II keratin genes have now been implicated in three different human genetic disorders, epidermolysis bullosa simplex, epidermolytic hyperkeratosis, and epidermolytic palmoplantar keratoderma. Members of the type I keratins are mapped to human chromosome 17, and the type II keratin genes are mapped to chromosome 12. To understand the organization of the type II keratin genes on chromosome 12, we isolated several yeast artificial chromosomes carrying these keratin genes and examined them in detail. We show that eight already known type II keratin genes are located in a cluster at 12q13, and their relative organization reflects their evolutionary relationship. We also determined that a type I keratin gene, KRT18, is located next to its partner, KRT8, in this cluster. Careful examination of the cluster also revealed that there may be a number of additional keratin genes at this locus that have not been described previously. 41 refs., 3 figs., 1 tab.

  9. Identification of certain cancer-mediating genes using Gaussian fuzzy cluster validity index.

    PubMed

    Ghosh, Anupam; De, Rajat K

    2015-10-01

    In this article, we have used an index, called Gaussian fuzzy index (GFI), recently developed by the authors, based on the notion of fuzzy set theory, for validating the clusters obtained by a clustering algorithm applied on cancer gene expression data. GFI is then used for the identification of genes that have altered quite significantly from normal state to carcinogenic state with respect to their mRNA expression patterns. The effectiveness of the methodology has been demonstrated on three gene expression cancer datasets dealing with human lung, colon and leukemia. The performance of GFI is compared with 19 existing cluster validity indices. The results are appropriately validated biologically and statistically. In this context, we have used biochemical pathways, p-value statistics of GO attributes, t-test and z-score for the validation of the results. It has been reported that GFI is capable of identifying high-quality enriched clusters of genes, and thereby is able to select more cancer-mediating genes. PMID:26564976

  10. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    PubMed

    Hadrys, Heike; Simon, Sabrina; Kaune, Barbara; Schmitt, Oliver; Schöner, Anja; Jakob, Wolfgang; Schierwater, Bernd

    2012-01-01

    Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution. PMID:22685537

  11. microRNAs in the Same Clusters Evolve to Coordinately Regulate Functionally Related Genes.

    PubMed

    Wang, Yirong; Luo, Junjie; Zhang, Hong; Lu, Jian

    2016-09-01

    MicroRNAs (miRNAs) are endogenously expressed small noncoding RNAs. The genomic locations of animal miRNAs are significantly clustered in discrete loci. We found duplication and de novo formation were important mechanisms to create miRNA clusters and the clustered miRNAs tend to be evolutionarily conserved. We proposed a "functional co-adaptation" model to explain how clustering helps newly emerged miRNAs survive and develop functions. We presented evidence that the abundances of miRNAs in the same clusters were highly correlated and those miRNAs exerted cooperative repressive effects on target genes in human tissues. By transfecting miRNAs into human and fly cells and extensively profiling the transcriptome alteration with deep-sequencing, we further demonstrated the functional co-adaptation between new and old miRNAs in the miR-17-92 cluster. Our population genomic analysis suggests that positive Darwinian selection might be the driving force underlying the formation and evolution of miRNA clustering. Our model provided novel insights into mechanisms and evolutionary significance of miRNA clustering. PMID:27189568

  12. microRNAs in the Same Clusters Evolve to Coordinately Regulate Functionally Related Genes

    PubMed Central

    Wang, Yirong; Luo, Junjie; Zhang, Hong; Lu, Jian

    2016-01-01

    MicroRNAs (miRNAs) are endogenously expressed small noncoding RNAs. The genomic locations of animal miRNAs are significantly clustered in discrete loci. We found duplication and de novo formation were important mechanisms to create miRNA clusters and the clustered miRNAs tend to be evolutionarily conserved. We proposed a “functional co-adaptation” model to explain how clustering helps newly emerged miRNAs survive and develop functions. We presented evidence that the abundances of miRNAs in the same clusters were highly correlated and those miRNAs exerted cooperative repressive effects on target genes in human tissues. By transfecting miRNAs into human and fly cells and extensively profiling the transcriptome alteration with deep-sequencing, we further demonstrated the functional co-adaptation between new and old miRNAs in the miR-17–92 cluster. Our population genomic analysis suggests that positive Darwinian selection might be the driving force underlying the formation and evolution of miRNA clustering. Our model provided novel insights into mechanisms and evolutionary significance of miRNA clustering. PMID:27189568

  13. IL-1 gene cluster is not linked to aggressive periodontitis.

    PubMed

    Scapoli, C; Borzani, I; Guarnelli, M E; Mamolini, E; Annunziata, M; Guida, L; Trombelli, L

    2010-05-01

    The interleukin-1 (IL-1) gene family has been associated with susceptibility to periodontal diseases, including aggressive periodontitis (AgP); however, the results are still conflicting. The present study investigated the association between IL-1 genes and AgP using 70 markers spanning the 1.1-Mb region, where the IL-1 gene family maps, and exploring both the linkage disequilibrium (LD) and the haplotype structure in a case-control study including 95 patients and 121 control individuals. No association between AgP and IL1A, IL1B, and IL1RN genes was found in either single-point or haplotype analyses. Also, the LD map of the region 2q13-14 under the Malécot model for multiple markers showed no causal association between AgP and polymorphisms within the region (p = 0.207). In conclusion, our findings failed to support the existence of a causative variant for generalized AgP within the 2q13-14 region in an Italian Caucasian population. PMID:20335539

  14. Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets

    PubMed Central

    2014-01-01

    Background Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624
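
    The edge-weighting idea in this abstract can be sketched as follows. The abstract does not define the two similarity measures, so this sketch substitutes simple stand-ins that are purely assumptions: topological similarity is taken as the Jaccard overlap of the two links' gene endpoints, and co-appearance similarity as the Jaccard overlap of the sets of datasets in which each link occurs. The mixing parameter `alpha` and the function names are likewise hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hybrid_link_graph(link_datasets, alpha=0.5):
    """Build a weighted graph whose nodes are coexpression links.

    link_datasets maps a link (a tuple of two gene names) to the set of
    dataset ids in which that link appears. Both similarity measures
    below are illustrative assumptions, not the paper's definitions.
    """
    edges = {}
    for l1, l2 in combinations(sorted(link_datasets), 2):
        topo = jaccard(set(l1), set(l2))                        # shared genes
        coapp = jaccard(link_datasets[l1], link_datasets[l2])   # shared datasets
        weight = alpha * topo + (1 - alpha) * coapp             # linear combination
        if weight > 0:
            edges[(l1, l2)] = weight
    return edges

links = {("A", "B"): {1, 2}, ("B", "C"): {2, 3}, ("D", "E"): {4}}
graph = hybrid_link_graph(links)
```

    The resulting weighted graph could then be passed to any graph clustering method to obtain link clusters.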

  15. Diversity and depth-specific distribution of SAR11 cluster rRNA genes from marine planktonic bacteria

    SciTech Connect

    Field, K.G.; Gordon, D.; Wright, T.

    1997-01-01

    Small-subunit (SSU) ribosomal DNA (rDNA) gene clusters are phylogenetically related sets of SSU rRNA genes, commonly encountered in genes amplified from natural populations. Genetic variability in gene clusters could result from artifacts (polymerase error or PCR chimera formation), microevolution (variation among rrn copies within strains), or macroevolution (genetic divergence correlated with long-term evolutionary divergence). To better understand gene clusters, this study assessed genetic diversity and distribution of a single environmental SSU rDNA gene cluster, the SAR11 cluster. SAR11 cluster genes, from an uncultured group of the {alpha} subclass of the class Proteobacteria, have been recovered from coastal and midoceanic waters of the North Atlantic and Pacific. We cloned and bidirectionally sequenced 23 new SAR11 cluster 16S rRNA genes, from 80 and 250 m in the Sargasso Sea and from surface coastal waters of the Atlantic and Pacific, and analyzed them with previously published sequences. Two SAR11 genes were obviously PCR chimeras, but the biological (nonchimeric) origins of most subgroups within the cluster were confirmed by independent recovery from separate gene libraries. Using group-specific oligonucleotide probes, we analyzed depth profiles of nucleic acids, targeting both amplified rDNAs and bulk RNAs. Two subgroups within the SAR11 cluster showed different highly depth-specific distributions. We conclude that some of the genetic diversity within the SAR11 gene cluster represents macroevolutionary divergence correlated with niche specialization. Furthermore, we demonstrate the utility for marine microbial ecology of oligonucleotide probes based on gene sequences amplified from natural populations and show that a detailed knowledge of sequence variability may be needed to effectively design these probes. 48 refs., 7 figs., 3 tabs.

  16. Teaching Gene Technology in an Outreach Lab: Students' Assigned Cognitive Load Clusters and the Clusters' Relationships to Learner Characteristics, Laboratory Variables, and Cognitive Achievement

    ERIC Educational Resources Information Center

    Scharfenberg, Franz-Josef; Bogner, Franz X.

    2013-01-01

    This study classified students into different cognitive load (CL) groups by means of cluster analysis based on their experienced CL in a gene technology outreach lab which has instructionally been designed with regard to CL theory. The relationships of the identified student CL clusters to learner characteristics, laboratory variables, and…

  17. The Magea gene cluster regulates male germ cell apoptosis without affecting the fertility in mice.

    PubMed

    Hou, Siyuan; Xian, Li; Shi, Peiliang; Li, Chaojun; Lin, Zhaoyu; Gao, Xiang

    2016-01-01

    While apoptosis is essential for male germ cell development, improper activation of apoptosis in the testis can affect spermatogenesis and cause reproduction defects. Members of the MAGE-A (melanoma antigen family A) gene family are frequently clustered in mammalian genomes and are exclusively expressed in the testes of normal animals but abnormally activated in a wide variety of cancers. We investigated the potential roles of these genes in spermatogenesis by generating a mouse model with a 210-kb genomic deletion encompassing six members of the Magea gene cluster (Magea1, Magea2, Magea3, Magea5, Magea6 and Magea8). Male mice carrying the deletion displayed smaller testes from 2 months old with a marked increase in apoptotic germ cells in the first wave of spermatogenesis. Furthermore, we found that Magea genes prevented stress-induced spermatogenic apoptosis after N-ethyl-N-nitrosourea (ENU) treatment during the adult stage. Mechanistically, deletion of the Magea gene cluster resulted in a dramatic increase in apoptotic germ cells, predominantly spermatocytes, with activation of p53 and induction of Bax in the testes. These observations demonstrate that the Magea genes are crucial in maintaining normal testicular size and protecting germ cells from excessive apoptosis under genotoxic stress. PMID:27226137

  18. The Magea gene cluster regulates male germ cell apoptosis without affecting the fertility in mice

    PubMed Central

    Hou, Siyuan; Xian, Li; Shi, Peiliang; Li, Chaojun; Lin, Zhaoyu; Gao, Xiang

    2016-01-01

    While apoptosis is essential for male germ cell development, improper activation of apoptosis in the testis can affect spermatogenesis and cause reproduction defects. Members of the MAGE-A (melanoma antigen family A) gene family are frequently clustered in mammalian genomes and are exclusively expressed in the testes of normal animals but abnormally activated in a wide variety of cancers. We investigated the potential roles of these genes in spermatogenesis by generating a mouse model with a 210-kb genomic deletion encompassing six members of the Magea gene cluster (Magea1, Magea2, Magea3, Magea5, Magea6 and Magea8). Male mice carrying the deletion displayed smaller testes from 2 months old with a marked increase in apoptotic germ cells in the first wave of spermatogenesis. Furthermore, we found that Magea genes prevented stress-induced spermatogenic apoptosis after N-ethyl-N-nitrosourea (ENU) treatment during the adult stage. Mechanistically, deletion of the Magea gene cluster resulted in a dramatic increase in apoptotic germ cells, predominantly spermatocytes, with activation of p53 and induction of Bax in the testes. These observations demonstrate that the Magea genes are crucial in maintaining normal testicular size and protecting germ cells from excessive apoptosis under genotoxic stress. PMID:27226137

  19. Duplication of partial spinosyn biosynthetic gene cluster in Saccharopolyspora spinosa enhances spinosyn production.

    PubMed

    Tang, Ying; Xia, Liqiu; Ding, Xuezhi; Luo, Yushuang; Huang, Fan; Jiang, Yuanwei

    2011-12-01

    Spinosyns, the secondary metabolites produced by Saccharopolyspora spinosa, are the active ingredients in a family of insect control agents. Most of the S. spinosa genes involved in spinosyn biosynthesis are found in a contiguous c. 74-kb cluster. To increase the spinosyn production through overexpression of their biosynthetic genes, part of its gene cluster (c. 18 kb) participating in the conversion of the cyclized polyketide to spinosyn was obtained by direct cloning via Red/ET recombination rather than by constructing and screening the genomic library. The resultant plasmid pUCAmT-spn was introduced into S. spinosa CCTCC M206084 from Escherichia coli S17-1 by conjugal transfer. The subsequent single-crossover homologous recombination caused a duplication of the partial gene cluster. Integration of this plasmid enhanced production of spinosyns with a total of 388 (± 25.0) mg L(-1) for spinosyns A and D in the exconjugant S. spinosa trans1 compared with 100 (± 7.7) mg L(-1) in the parental strain. Quantitative real time polymerase chain reaction analysis of three selected genes (spnH, spnI, and spnK) confirmed the positive effect of the overexpression of these genes on the spinosyn production. This study provides a simple avenue for enhancing spinosyn production. The strategies could also be used to improve the yield of other secondary metabolites. PMID:22092858

  20. Regulation of alkyl-dihydrothiazole-carboxylates (ATCs) by iron and the pyochelin gene cluster in Pseudomonas aeruginosa.

    PubMed

    Vinayavekhin, Nawaporn; Saghatelian, Alan

    2009-08-21

    Using the pyochelin (pch) gene cluster as an example, we demonstrate the utility of untargeted metabolomics in the discovery and characterization of secondary metabolites regulated by biosynthetic gene clusters. Comparison of the extracellular metabolomes of pch gene cluster mutants to the wild-type Pseudomonas aeruginosa (strain PA14) identified 198 ions regulated by the pch genes. In addition to known metabolites, we characterized the structure of a pair of novel metabolites regulated by the pch gene cluster as 2-alkyl-4,5-dihydrothiazole-4-carboxylates (ATCs), using a combination of mass spectrometry, chemical synthesis, and stable isotope labeling. Subsequent assays revealed that ATCs bind iron and are regulated by iron levels in the media in a fashion similar to other metabolites associated with the pch gene cluster. Further genetic complementation and overexpression analyses of the pch genes revealed ATC production to be dependent on the pchE gene in the pch gene cluster. Overall, these studies highlight the ability of untargeted metabolomics to reveal regulatory connections between gene clusters and secondary metabolites, including novel metabolites. PMID:19621937

  1. Organization of the qa Gene Cluster in Neurospora crassa: Direction of Transcription of the qa-3 Gene

    PubMed Central

    Strøman, Per; Reinert, William; Case, Mary E.; Giles, Norman H.

    1979-01-01

    In Neurospora crassa, the enzyme quinate (shikimate) dehydrogenase catalyzes the first reaction in the inducible quinic acid catabolic pathway and is encoded in the qa-3 gene of the qa cluster. In this cluster, the order of genes has been established as qa-1 qa-3 qa-4 qa-2. Amino-terminal sequences have been determined for purified quinate dehydrogenase from wild type and from UV-induced revertants in two different qa-3 mutants. These two mutants (M16 and M45) map at opposite ends of the qa-3 locus. In addition, mapping data (Case et al. 1978) indicate that the end of the qa-3 gene specified by M45 is closer to the adjacent qa-1 gene than is the end specified by the M16 mutant site. In one of the revertants (R45 from qa-3 mutant M45), the amino-terminal sequence for the first ten amino acids is identical to that of wild type. The other revertant (R1 from qa-3 mutant M16) differs from wild type at the amino-terminal end by a single altered residue at position three in the sequence. The observed change involves the substitution of an isoleucine in M16-R1 for a proline in wild type. This substitution requires a two-nucleotide change in the corresponding wild-type codon. The combined genetic and biochemical data indicate that the qa-3 mutants M16 and M45 carry amino acid substitutions near the amino-terminal and carboxyl-terminal ends of the quinate dehydrogenase enzyme, respectively. On this basis we conclude that transcription of the qa-3 gene proceeds from the end specified by the M16 mutant site in the direction of the qa-1 gene. It appears probable that transcription is initiated from a promoter site within the qa cluster, possibly immediately adjacent to the qa-3 gene. PMID:159203

  2. Cloning of a Vibrio cholerae vibriobactin gene cluster: identification of genes required for early steps in siderophore biosynthesis.

    PubMed Central

    Wyckoff, E E; Stoebner, J A; Reed, K E; Payne, S M

    1997-01-01

    Vibrio cholerae secretes the catechol siderophore vibriobactin in response to iron limitation. Vibriobactin is structurally similar to enterobactin, the siderophore produced by Escherichia coli, and both organisms produce 2,3-dihydroxybenzoic acid (DHBA) as an intermediate in siderophore biosynthesis. To isolate and characterize V. cholerae genes involved in vibriobactin biosynthesis, we constructed a genomic cosmid bank of V. cholerae DNA and isolated clones that complemented mutations in E. coli enterobactin biosynthesis genes. V. cholerae homologs of entA, entB, entC, entD, and entE were identified on overlapping cosmid clones. Our data indicate that the vibriobactin genes are clustered, like the E. coli enterobactin genes, but the organization of the genes within these clusters is different. In this paper, we present the organization and sequences of genes involved in the synthesis and activation of DHBA. In addition, a V. cholerae strain with a chromosomal mutation in vibA was constructed by marker exchange. This strain was unable to produce vibriobactin or DHBA, confirming that in V. cholerae VibA catalyzes an early step in vibriobactin biosynthesis. PMID:9371453

  3. TreeParser-Aided Klee Diagrams Display Taxonomic Clusters in DNA Barcode and Nuclear Gene Datasets

    PubMed Central

    Stoeckle, Mark Y.; Coffran, Cameron

    2013-01-01

    Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood. PMID:24022383
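
    The sorting step described in this abstract (ordering the rows of an alignment by their position in a phylogenetic tree) can be illustrated with a minimal Python sketch. This is not TreeParser's implementation: the function names, the toy Newick tree, and the toy sequences are all hypothetical, and the sketch assumes unquoted leaf names in a plain Newick string. It covers only the reordering, not the indicator vector analysis that produces the Klee diagram.

```python
import re


def leaf_order(newick: str) -> list[str]:
    """Return leaf names in the left-to-right order they appear in a Newick string.

    Assumes unquoted names. A leaf label directly follows "(" or ",";
    internal-node labels and branch lengths follow ")" or ":" and are skipped.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


def sort_alignment(alignment: dict[str, str], newick: str) -> list[tuple[str, str]]:
    """Reorder alignment rows so that they follow the tree's leaf order."""
    return [(name, alignment[name]) for name in leaf_order(newick) if name in alignment]


# Hypothetical toy data, for illustration only.
tree = "((sparrow:0.1,finch:0.1):0.2,(hawk:0.3,owl:0.3):0.1);"
alignment = {
    "owl": "ACGTTA",
    "sparrow": "ACGTAA",
    "hawk": "ACGTTT",
    "finch": "ACGTAC",
}

sorted_rows = sort_alignment(alignment, tree)
for name, seq in sorted_rows:
    print(f"{name:8s} {seq}")
```

    Once rows are in tree order, closely related sequences sit next to each other, which is what lets a heat map computed from the sorted alignment show taxonomic groups as contiguous blocks.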

  4. Expanded Natural Product Diversity Revealed by Analysis of Lanthipeptide-Like Gene Clusters in Actinobacteria

    PubMed Central

    Zhang, Qi; Doroghazi, James R.; Zhao, Xiling; Walker, Mark C.

    2015-01-01

    Lanthionine-containing peptides (lanthipeptides) are a rapidly growing family of polycyclic peptide natural products belonging to the large class of ribosomally synthesized and posttranslationally modified peptides (RiPPs). Lanthipeptides are widely distributed in taxonomically distant species, and their currently known biosynthetic systems and biological activities are diverse. Building on the recent natural product gene cluster family (GCF) project, we report here large-scale analysis of lanthipeptide-like biosynthetic gene clusters from Actinobacteria. Our analysis suggests that lanthipeptide biosynthetic pathways, and by extrapolation the natural products themselves, are much more diverse than currently appreciated and contain many different posttranslational modifications. Furthermore, lanthionine synthetases are much more diverse in sequence and domain topology than currently characterized systems, and they are used by the biosynthetic machineries for natural products other than lanthipeptides. The gene cluster families described here significantly expand the chemical diversity and biosynthetic repertoire of lanthionine-related natural products. Biosynthesis of these novel natural products likely involves unusual and unprecedented biochemistries, as illustrated by several examples discussed in this study. In addition, class IV lanthipeptide gene clusters are shown not to be silent, setting the stage to investigate their biological activities. PMID:25888176

  5. Identification and functional analysis of brassicicene C biosynthetic gene cluster in Alternaria brassicicola.

    PubMed

    Minami, Atsushi; Tajima, Naoto; Higuchi, Yusuke; Toyomasu, Tomonobu; Sassa, Takeshi; Kato, Nobuo; Dairi, Tohru

    2009-02-01

    The biosynthetic gene cluster of brassicicene C was identified in Alternaria brassicicola strain ATCC 96836 by a genome database search. In vivo and in vitro studies revealed that Orf8 and Orf6 function as a fusicoccadiene synthase and a methyltransferase, respectively. Understanding the biosynthetic pathway opens the way to producing diterpene compounds of this type by genetic engineering. PMID:19097780

  6. Characterization of the Tunicamycin Gene Cluster Unveiling Unique Steps Involved in its Biosynthesis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tunicamycin, a potent reversible translocase I inhibitor, is produced by several Actinomycetes species. The tunicamycin structure is highly unusual, and contains an 11-carbon dialdose sugar and an α,β-1,1-glycosidic linkage. Here we report the identification of a gene cluster essential for tunicamy...

  7. Histone and ribosomal RNA repetitive gene clusters linked in tandem array

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Histones are the major protein component of chromatin structure. The histone family is made up of a quintet of proteins, four core histones (H2A, H2B, H3 & H4) and the linker histones (H1). Spacers are found between the coding regions. Among insects this quintet of genes is usually clustered and ...

  8. Bifunctional Gene Cluster lnqBCDEF Mediates Bacteriocin Production and Immunity with Differential Genetic Requirements

    PubMed Central

    Iwatani, Shun; Horikiri, Yuko; Zendo, Takeshi; Nakayama, Jiro

    2013-01-01

    A comprehensive gene disruption of lacticin Q biosynthetic cluster lnqQBCDEF was carried out. The results demonstrated the necessity of the complete set of lnqQBCDEF for lacticin Q production, whereas immunity was flexible, with LnqEF (ABC transporter) being essential for and LnqBCD partially contributing to immunity. PMID:23335763

  9. Beta-globin gene cluster haplotype frequencies in Khalkhs and Buryats of Mongolia.

    PubMed

    Shimizu, Koji; Tokimasa, Kozue; Takeuchi, Yukiko; Gereksaikhan, Tudevdagva; Tanabe, Yuichi; Omoto, Keiichi; Imanishi, Tadashi; Harihara, Shinji; Hao, Luping; Jing, Feng

    2006-12-01

    Beta-globin gene cluster haplotype frequencies of 169 Khalkhs and 145 Buryats were estimated, and their characteristics were compared with those of Evenkis, Oroqens, Koreans, Japanese, and three Colombian Amerindian groups. The present study suggests that Colombian Amerindians diverged first from Asian populations and then Buryats diverged from other Asian populations. PMID:17564253

  10. Identification and analysis of the paulomycin biosynthetic gene cluster and titer improvement of the paulomycins in Streptomyces paulus NRRL 8115.

    PubMed

    Li, Jine; Xie, Zhoujie; Wang, Min; Ai, Guomin; Chen, Yihua

    2015-01-01

    The paulomycins are a group of glycosylated compounds featuring a unique paulic acid moiety. To locate their biosynthetic gene clusters, the genomes of two paulomycin producers, Streptomyces paulus NRRL 8115 and Streptomyces sp. YN86, were sequenced. The paulomycin biosynthetic gene clusters were defined by comparative analyses of the two genomes together with the genome of the third paulomycin producer Streptomyces albus J1074. Subsequently, the identity of the paulomycin biosynthetic gene cluster was confirmed by inactivation of two genes involved in biosynthesis of the paulomycose branched chain (pau11) and the ring A moiety (pau18) in Streptomyces paulus NRRL 8115. After determining the gene cluster boundaries, a convergent biosynthetic model was proposed for paulomycin based on the deduced functions of the pau genes. Finally, a paulomycin high-producing strain was constructed by expressing an activator-encoding gene (pau13) in S. paulus, setting the stage for future investigations. PMID:25822496

  11. Identification and Analysis of the Paulomycin Biosynthetic Gene Cluster and Titer Improvement of the Paulomycins in Streptomyces paulus NRRL 8115

    PubMed Central

    Li, Jine; Xie, Zhoujie; Wang, Min; Ai, Guomin; Chen, Yihua

    2015-01-01

    The paulomycins are a group of glycosylated compounds featuring a unique paulic acid moiety. To locate their biosynthetic gene clusters, the genomes of two paulomycin producers, Streptomyces paulus NRRL 8115 and Streptomyces sp. YN86, were sequenced. The paulomycin biosynthetic gene clusters were defined by comparative analyses of the two genomes together with the genome of the third paulomycin producer Streptomyces albus J1074. Subsequently, the identity of the paulomycin biosynthetic gene cluster was confirmed by inactivation of two genes involved in biosynthesis of the paulomycose branched chain (pau11) and the ring A moiety (pau18) in Streptomyces paulus NRRL 8115. After determining the gene cluster boundaries, a convergent biosynthetic model was proposed for paulomycin based on the deduced functions of the pau genes. Finally, a paulomycin high-producing strain was constructed by expressing an activator-encoding gene (pau13) in S. paulus, setting the stage for future investigations. PMID:25822496

  12. Evidence for birth-and-death evolution and horizontal transfer of the fumonisin mycotoxin biosynthetic gene cluster in Fusarium

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In fungi, genes required for synthesis of secondary metabolites are often clustered. The FUM gene cluster is required for synthesis of fumonisins, a family of toxic secondary metabolites produced predominantly by species in the Fusarium (Gibberella) fujikuroi species complex (FFSC). Fumonisins are a...

  13. Birth, death and horizontal transfer of the fumonisin biosynthetic gene cluster during the evolutionary diversification of Fusarium

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In fungi, genes required for synthesis of secondary metabolites are often clustered. The FUM gene cluster is required for synthesis of a family of toxic secondary metabolites, fumonisins, produced by species of Fusarium in the Gibberella fujikuroi species complex (GFSC). Fumonisins are a health and ...

  14. Clustering of two genes putatively involved in cyanate detoxification evolved recently and independently in multiple fungal lineages

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trac...

  15. The impact of polyploidy on the evolution of a complex NB-LRR resistance gene cluster in soybean

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comparative genomics approach was used to investigate the evolution of a complex NB-LRR gene cluster found in soybean (Glycine max), common bean (Phaseolus vulgaris), and other legumes. In soybean, the cluster is associated with several disease resistance (R) genes of known function including Rpg1...

  16. Clusters of genes encoding fructan biosynthesizing enzymes in wheat and barley.

    PubMed

    Huynh, Bao-Lam; Mather, Diane E; Schreiber, Andreas W; Toubia, John; Baumann, Ute; Shoaei, Zahra; Stein, Nils; Ariyadasa, Ruvini; Stangoulis, James C R; Edwards, James; Shirley, Neil; Langridge, Peter; Fleury, Delphine

    2012-10-01

    Fructans are soluble carbohydrates with health benefits and possible roles in plant adaptation. Fructan biosynthetic genes were isolated using comparative genomics and physical mapping followed by BAC sequencing in barley. Genes encoding sucrose:sucrose 1-fructosyltransferase (1-SST), fructan:fructan 1-fructosyltransferase (1-FFT) and sucrose:fructan 6-fructosyltransferase (6-SFT) were clustered together with multiple copies of vacuolar invertase genes and a transposable element on two barley BACs. Intron-exon structures of the genes were similar. Phylogenetic analysis of the fructosyltransferases and invertases in the Poaceae showed that the fructan biosynthetic genes may have evolved from vacuolar invertases. Quantitative real-time PCR was performed using leaf RNA extracted from three wheat cultivars grown under different conditions. The 1-SST, 1-FFT and 6-SFT genes had correlated expression patterns in our wheat experiment and in an existing barley transcriptome database. Single nucleotide polymorphism (SNP) markers were developed and successfully mapped to a major QTL region affecting wheat grain fructan accumulation in two independent wheat populations. The alleles controlling high and low fructan in parental lines were also found to be associated with fructan production in a diverse set of 128 wheat lines. To the authors' knowledge, this is the first report on the mapping and sequencing of a fructan biosynthetic gene cluster and, in particular, the isolation of a novel 1-FFT gene from barley. PMID:22864927

  17. From the Flavobacterium genus to the phylum Bacteroidetes: genomic analysis of dnd gene clusters.

    PubMed

    Barbier, Paul; Lunazzi, Aurélie; Fujiwara-Nagata, Erina; Avendaño-Herrera, Ruben; Bernardet, Jean-François; Touchon, Marie; Duchaud, Eric

    2013-11-01

    Phosphorothioate modification of DNA and the corresponding DNA degradation (Dnd) phenotype that occurs during gel electrophoresis are caused by dnd genes. Although widely distributed among Bacteria and Archaea, dnd genes have been found in only very few, taxonomically unrelated, bacterial species so far. Here, we report the presence of dnd genes and their associated Dnd phenotype in two Flavobacterium species. Comparison with dnd gene clusters previously described led us to report a noncanonical genetic organization and to identify a gene likely encoding a hybrid DndE protein. Hence, we showed that dnd genes are also present in members of the family Flavobacteriaceae, a bacterial group occurring in a variety of habitats with an interesting diversity of lifestyle. Two main types of genomic organization of dnd loci were uncovered probably denoting their spreading in the phylum Bacteroidetes via distinct genetic transfer events. PMID:23965156

  18. Cloning and Analysis of the Planosporicin Lantibiotic Biosynthetic Gene Cluster of Planomonospora alba

    PubMed Central

    Sherwood, Emma J.; Hesketh, Andrew R.

    2013-01-01

    The increasing prevalence of antibiotic resistance in bacterial pathogens has renewed focus on natural products with antimicrobial properties. Lantibiotics are ribosomally synthesized peptide antibiotics that are posttranslationally modified to introduce (methyl)lanthionine bridges. Actinomycetes are renowned for their ability to produce a large variety of antibiotics, many with clinical applications, but are known to make only a few lantibiotics. One such compound is planosporicin produced by Planomonospora alba, which inhibits cell wall biosynthesis in Gram-positive pathogens. Planosporicin is a type AI lantibiotic structurally similar to those which bind lipid II, the immediate precursor for cell wall biosynthesis. The gene cluster responsible for planosporicin biosynthesis was identified by genome mining and subsequently isolated from a P. alba cosmid library. A minimal cluster of 15 genes sufficient for planosporicin production was defined by heterologous expression in Nonomuraea sp. strain ATCC 39727, while deletion of the gene encoding the precursor peptide from P. alba, which abolished planosporicin production, was also used to confirm the identity of the gene cluster. Deletion of genes encoding likely biosynthetic enzymes identified through bioinformatic analysis revealed that they, too, are essential for planosporicin production in the native host. Reverse transcription-PCR (RT-PCR) analysis indicated that the planosporicin gene cluster is transcribed in three operons. Expression of one of these, pspEF, which encodes an ABC transporter, in Streptomyces coelicolor A3(2) conferred some degree of planosporicin resistance on the heterologous host. The inability to delete these genes from P. alba suggests that they play an essential role in immunity in the natural producer. PMID:23475977

  19. Identification of the phd gene cluster responsible for phenylpropanoid utilization in Corynebacterium glutamicum.

    PubMed

    Kallscheuer, Nicolai; Vogt, Michael; Kappelmann, Jannick; Krumbach, Karin; Noack, Stephan; Bott, Michael; Marienhagen, Jan

    2016-02-01

    Phenylpropanoids as abundant, lignin-derived compounds represent sustainable feedstocks for biotechnological production processes. We found that the biotechnologically important soil bacterium Corynebacterium glutamicum is able to grow on phenylpropanoids such as p-coumaric acid, ferulic acid, caffeic acid, and 3-(4-hydroxyphenyl)propionic acid as sole carbon and energy sources. Global gene expression analyses identified a gene cluster (cg0340-cg0341 and cg0344-cg0347), which showed increased transcription levels in response to phenylpropanoids. The gene cg0340 (designated phdT) encodes for a putative transporter protein, whereas cg0341 and cg0344-cg0347 (phdA-E) encode enzymes involved in the β-oxidation of phenylpropanoids. The phd gene cluster is transcriptionally controlled by a MarR-type repressor encoded by cg0343 (phdR). Cultivation experiments conducted with C. glutamicum strains carrying single-gene deletions showed that loss of phdA, phdB, phdC, or phdE abolished growth of C. glutamicum with all phenylpropanoid substrates tested. The deletion of phdD (encoding for putative acyl-CoA dehydrogenase) additionally abolished growth with the α,β-saturated phenylpropanoid 3-(4-hydroxyphenyl)propionic acid. However, the observed growth defect of all constructed single-gene deletion strains could be abolished through plasmid-borne expression of the respective genes. These results and the intracellular accumulation of pathway intermediates determined via LC-ESI-MS/MS in single-gene deletion mutants showed that the phd gene cluster encodes for a CoA-dependent, β-oxidative deacetylation pathway, which is essential for the utilization of phenylpropanoids in C. glutamicum. PMID:26610800

  20. Molecular cloning of the Escherichia coli B L-fucose-D-arabinose gene cluster.

    PubMed Central

    Elsinghorst, E A; Mortlock, R P

    1994-01-01

    To metabolize the uncommon pentose D-arabinose, enteric bacteria often recruit the enzymes of the L-fucose pathway by a regulatory mutation. However, Escherichia coli B can grow on D-arabinose without the requirement of a mutation, using some of the L-fucose enzymes and a D-ribulokinase that is distinct from the L-fuculokinase of the L-fucose pathway. To study this naturally occurring D-arabinose pathway, we cloned and partially characterized the E. coli B L-fucose-D-arabinose gene cluster and compared it with the L-fucose gene cluster of E. coli K-12. The order of the fucA, -P, -I, and -K genes was the same in the two E. coli strains. However, the E. coli B gene cluster contained a 5.2-kb segment located between the fucA and fucP genes that was not present in E. coli K-12. This segment carried the darK gene, which encodes the D-ribulokinase needed for growth on D-arabinose by E. coli B. The darK gene was not homologous with any of the L-fucose genes or with chromosomal DNA from other D-arabinose-utilizing bacteria. D-Ribulokinase and L-fuculokinase were purified to apparent homogeneity and partially characterized. The molecular weights, substrate specificities, and kinetic parameters of these two enzymes were very dissimilar, which, together with DNA hybridization analysis, suggested that these enzymes are not related. D-Arabinose metabolism by E. coli B appears to be the result of acquisitive evolution, but the source of the darK gene has not been determined. PMID:7961494

  1. Characterisation of the paralytic shellfish toxin biosynthesis gene clusters in Anabaena circinalis AWQC131C and Aphanizomenon sp. NH-5

    PubMed Central

    Mihali, Troco K; Kellmann, Ralf; Neilan, Brett A

    2009-01-01

    Background Saxitoxin and its analogues collectively known as the paralytic shellfish toxins (PSTs) are neurotoxic alkaloids and are the cause of the syndrome named paralytic shellfish poisoning. PSTs are produced by a unique biosynthetic pathway, which involves reactions that are rare in microbial metabolic pathways. Nevertheless, distantly related organisms such as dinoflagellates and cyanobacteria appear to produce these toxins using the same pathway. Hypothesised explanations for such an unusual phylogenetic distribution of this shared uncommon metabolic pathway, include a polyphyletic origin, an involvement of symbiotic bacteria, and horizontal gene transfer. Results We describe the identification, annotation and bioinformatic characterisation of the putative paralytic shellfish toxin biosynthesis clusters in an Australian isolate of Anabaena circinalis and an American isolate of Aphanizomenon sp., both members of the Nostocales. These putative PST gene clusters span approximately 28 kb and contain genes coding for the biosynthesis and export of the toxin. A putative insertion/excision site in the Australian Anabaena circinalis AWQC131C was identified, and the organization and evolution of the gene clusters are discussed. A biosynthetic pathway leading to the formation of saxitoxin and its analogues in these organisms is proposed. Conclusion The PST biosynthesis gene cluster presents a mosaic structure, whereby genes have apparently transposed in segments of varying size, resulting in different gene arrangements in all three sxt clusters sequenced so far. The gene cluster organizational structure and sequence similarity seems to reflect the phylogeny of the producer organisms, indicating that the gene clusters have an ancient origin, or that their lateral transfer was also an ancient event. The knowledge we gain from the characterisation of the PST biosynthesis gene clusters, including the identity and sequence of the genes involved in the biosynthesis, may

  2. Organization of the Escherichia coli K-12 gene cluster responsible for production of the extracellular polysaccharide colanic acid.

    PubMed

    Stevenson, G; Andrianopoulos, K; Hobbs, M; Reeves, P R

    1996-08-01

    Colanic acid (CA) is an extracellular polysaccharide produced by most Escherichia coli strains as well as by other species of the family Enterobacteriaceae. We have determined the sequence of a 23-kb segment of the E. coli K-12 chromosome which includes the cluster of genes necessary for production of CA. The CA cluster comprises 19 genes. Two other sequenced genes (orf1.3 and galF), which are situated between the CA cluster and the O-antigen cluster, were shown to be unnecessary for CA production. The CA cluster includes genes for synthesis of GDP-L-fucose, one of the precursors of CA, and the gene for one of the enzymes in this pathway (GDP-D-mannose 4,6-dehydratase) was identified by biochemical assay. Six of the inferred proteins show sequence similarity to glycosyl transferases, and two others have sequence similarity to acetyl transferases. Another gene (wzx) is predicted to encode a protein with multiple transmembrane segments and may function in export of the CA repeat unit from the cytoplasm into the periplasm in a process analogous to O-unit export. The first three genes of the cluster are predicted to encode an outer membrane lipoprotein, a phosphatase, and an inner membrane protein with an ATP-binding domain. Since homologs of these genes are found in other extracellular polysaccharide gene clusters, they may have a common function, such as export of polysaccharide from the cell. PMID:8759852

  3. Classification and Clustering on Microarray Data for Gene Functional Prediction Using R.

    PubMed

    López-Kleine, Liliana; Montaño, Rosa; Torres-Avilés, Francisco

    2016-01-01

    Gene expression data (microarrays and RNA-sequencing data) as well as other kinds of genomic data can be extracted from publicly available genomic data. Here, we explain how to apply multivariate cluster and classification methods on gene expression data. These methods have become very popular and are implemented in freely available software in order to predict the participation of gene products in a specific functional category of interest. Taking into account the availability of data and of these methods, every biological study should apply them in order to obtain knowledge on the organism studied and functional category of interest. A special emphasis is made on the nonlinear kernel classification methods. PMID:25762300
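
    As a rough illustration of the kind of cluster analysis this record refers to, the sketch below groups genes by the similarity of their expression profiles. It is an assumption-laden toy, not the chapter's method: the chapter works in R and emphasizes nonlinear kernel classification, whereas this is plain Python using average-linkage agglomerative clustering with a distance of 1 minus the Pearson correlation. The function names, gene names, and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_genes(profiles: dict[str, list[float]], n_clusters: int) -> list[list[str]]:
    """Average-linkage agglomerative clustering using distance = 1 - Pearson r."""
    clusters = [[gene] for gene in profiles]

    def dist(c1: list[str], c2: list[str]) -> float:
        # Mean pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters; i < j, so deleting j keeps i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical expression values across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.5, 2.0],
}

clusters = cluster_genes(profiles, n_clusters=2)
print(clusters)
```

    Genes that land in the same cluster as genes of known function are candidates for the same functional category, which is the prediction step the abstract describes; a correlation-based distance is used here so that profiles with the same shape but different magnitudes group together.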

  4. Direct cloning and refactoring of a silent lipopeptide biosynthetic gene cluster yields the antibiotic taromycin A.

    PubMed

    Yamanaka, Kazuya; Reynolds, Kirk A; Kersten, Roland D; Ryan, Katherine S; Gonzalez, David J; Nizet, Victor; Dorrestein, Pieter C; Moore, Bradley S

    2014-02-01

    Recent developments in next-generation sequencing technologies have brought recognition of microbial genomes as a rich resource for novel natural product discovery. However, owing to the scarcity of efficient procedures to connect genes to molecules, only a small fraction of secondary metabolomes have been investigated to date. Transformation-associated recombination (TAR) cloning takes advantage of the natural in vivo homologous recombination of Saccharomyces cerevisiae to directly capture large genomic loci. Here we report a TAR-based genetic platform that allows us to directly clone, refactor, and heterologously express a silent biosynthetic pathway to yield a new antibiotic. With this method, which involves regulatory gene remodeling, we successfully expressed a 67-kb nonribosomal peptide synthetase biosynthetic gene cluster from the marine actinomycete Saccharomonospora sp. CNQ-490 and produced the dichlorinated lipopeptide antibiotic taromycin A in the model expression host Streptomyces coelicolor. The taromycin gene cluster (tar) is highly similar to the clinically approved antibiotic daptomycin from Streptomyces roseosporus, but has notable structural differences in three amino acid residues and the lipid side chain. With the activation of the tar gene cluster and production of taromycin A, this study highlights a unique "plug-and-play" approach to efficiently gaining access to orphan pathways that may open avenues for novel natural product discoveries and drug development. PMID:24449899

  5. The Amylase gene cluster on the evolving sex chromosomes of Drosophila miranda.

    PubMed

    Steinemann, S; Steinemann, M

    1999-01-01

    On the basis of chromosomal homology, the Amylase gene cluster in Drosophila miranda must be located on the secondary sex chromosome pair, neo-X (X2) and neo-Y, but is autosomally inherited in all other Drosophila species. Genetic evidence indicates no active amylase on the neo-Y chromosome and the X2-chromosomal locus already shows dosage compensation. Several lines of evidence strongly suggest that the Amy gene cluster has been lost already from the evolving neo-Y chromosome. This finding shows that a relatively new neo-Y chromosome can start to lose genes and hence gradually lose homology with the neo-X. The X2-chromosomal Amy1 is intact and Amy2 contains a complete coding sequence, but has a deletion in the 3'-flanking region. Amy3 is structurally eroded and hampered by missing regulatory motifs. Functional analysis of the X2-chromosomal Amy1 and Amy2 regions from D. miranda in transgenic D. melanogaster flies reveals ectopic AMY1 expression. AMY1 shows the same electrophoretic mobility as the single amylase band in D. miranda, while ectopic AMY2 expression is characterized by a different mobility. Therefore, only the Amy1 gene of the resident Amy cluster remains functional and hence Amy1 is the dosage compensated gene. PMID:9872956

  6. GIP2, a Putative Transcription Factor That Regulates the Aurofusarin Biosynthetic Gene Cluster in Gibberella zeae

    PubMed Central

    Kim, Jung-Eun; Jin, Jianming; Kim, Hun; Kim, Jin-Cheol; Yun, Sung-Hwan; Lee, Yin-Won

    2006-01-01

    Gibberella zeae (anamorph: Fusarium graminearum) is an important pathogen of maize, wheat, and rice. Colonies of G. zeae produce yellow-to-tan mycelia with white-to-carmine red margins. In this study, we focused on nine putative open reading frames (ORFs) closely linked to PKS12 and GIP1, which are required for aurofusarin biosynthesis in G. zeae. Among them is an ORF designated GIP2 (for Gibberella zeae pigment gene 2), which encodes a putative protein of 398 amino acids that carries a Zn(II)2Cys6 binuclear cluster DNA-binding domain commonly found in transcription factors of yeasts and filamentous fungi. Targeted gene deletion and complementation analyses confirmed that GIP2 is required for aurofusarin biosynthesis. Expression of GIP2 in carrot medium correlated with aurofusarin production by G. zeae and was restricted to vegetative mycelia. Inactivation of the 10 contiguous genes in the ΔGIP2 strain delineates an aurofusarin biosynthetic gene cluster. Overexpression of GIP2 in both the ΔGIP2 and the wild-type strains increases aurofusarin production and reduces mycelial growth. Thus, GIP2 is a putative positive regulator of the aurofusarin biosynthetic gene cluster, and aurofusarin production is negatively correlated with vegetative growth by G. zeae. PMID:16461721

  7. Modularity of Plant Metabolic Gene Clusters: A Trio of Linked Genes That Are Collectively Required for Acylation of Triterpenes in Oat

    PubMed Central

    Mugford, Sam T.; Louveau, Thomas; Melton, Rachel; Qi, Xiaoquan; Bakht, Saleha; Hill, Lionel; Tsurushima, Tetsu; Honkanen, Suvi; Rosser, Susan J.; Lomonossoff, George P.; Osbourn, Anne

    2013-01-01

    Operon-like gene clusters are an emerging phenomenon in the field of plant natural products. The genes encoding some of the best-characterized plant secondary metabolite biosynthetic pathways are scattered across plant genomes. However, an increasing number of gene clusters encoding the synthesis of diverse natural products have recently been reported in plant genomes. These clusters have arisen through the neo-functionalization and relocation of existing genes within the genome, and not by horizontal gene transfer from microbes. The reasons for clustering are not yet clear, although this form of gene organization is likely to facilitate co-inheritance and co-regulation. Oats (Avena spp) synthesize antimicrobial triterpenoids (avenacins) that provide protection against disease. The synthesis of these compounds is encoded by a gene cluster. Here we show that a module of three adjacent genes within the wider biosynthetic gene cluster is required for avenacin acylation. Through the characterization of these genes and their encoded proteins we present a model of the subcellular organization of triterpenoid biosynthesis. PMID:23532069

  8. Modularity of plant metabolic gene clusters: a trio of linked genes that are collectively required for acylation of triterpenes in oat.

    PubMed

    Mugford, Sam T; Louveau, Thomas; Melton, Rachel; Qi, Xiaoquan; Bakht, Saleha; Hill, Lionel; Tsurushima, Tetsu; Honkanen, Suvi; Rosser, Susan J; Lomonossoff, George P; Osbourn, Anne

    2013-03-01

    Operon-like gene clusters are an emerging phenomenon in the field of plant natural products. The genes encoding some of the best-characterized plant secondary metabolite biosynthetic pathways are scattered across plant genomes. However, an increasing number of gene clusters encoding the synthesis of diverse natural products have recently been reported in plant genomes. These clusters have arisen through the neo-functionalization and relocation of existing genes within the genome, and not by horizontal gene transfer from microbes. The reasons for clustering are not yet clear, although this form of gene organization is likely to facilitate co-inheritance and co-regulation. Oats (Avena spp) synthesize antimicrobial triterpenoids (avenacins) that provide protection against disease. The synthesis of these compounds is encoded by a gene cluster. Here we show that a module of three adjacent genes within the wider biosynthetic gene cluster is required for avenacin acylation. Through the characterization of these genes and their encoded proteins we present a model of the subcellular organization of triterpenoid biosynthesis. PMID:23532069

  9. The organization and transcription of the galactose gene cluster of Kluyveromyces lactis.

    PubMed Central

    Webster, T D; Dickson, R C

    1988-01-01

    The yeast Kluyveromyces lactis grows on galactose by inducing the Leloir pathway enzymes: kinase, epimerase, and transferase. To investigate the molecular mechanism for regulating expression of this metabolic pathway, we isolated GAL1, GAL7, and GAL10, which code for kinase, transferase, and epimerase, respectively, and characterized their size, organization, and transcriptional regulation. Our results indicate that induction of the Leloir pathway in K. lactis occurs at the level of transcription and that the organization and regulation of the GAL gene cluster in K. lactis are closely related to those of the homologous gene cluster in Saccharomyces cerevisiae. Likewise, the Upstream Activator Sequences that regulate induction of the GAL genes are similar in base sequence, number, and relative location in the two yeasts. PMID:3047676

  10. Next-generation sequencing approach for connecting secondary metabolites to biosynthetic gene clusters in fungi

    PubMed Central

    Cacho, Ralph A.; Tang, Yi; Chooi, Yit-Heng

    2015-01-01

    Genomics has revolutionized research on fungal secondary metabolite (SM) biosynthesis. To elucidate the molecular and enzymatic mechanisms underlying the biosynthesis of a specific SM compound, the important first step is often to find the genes that are responsible for its synthesis. Access to fungal genome sequences allows researchers to bypass the cumbersome traditional approach of library construction and screening. Advances in next-generation sequencing (NGS) technologies have further improved the speed and reduced the cost of microbial genome sequencing in the past few years, which has accelerated research in this field. Here, we present an example workflow for identifying the gene cluster encoding the biosynthesis of SMs of interest using an NGS approach. We also review the different strategies that can be employed to pinpoint the targeted gene clusters rapidly, giving several examples from our own work. PMID:25642215