Sample records for common gene clusters

  1. Conditional clustering of temporal expression profiles

    PubMed Central

    Wang, Ling; Montano, Monty; Rarick, Matt; Sebastiani, Paola

    2008-01-01

    Background Many microarray experiments produce temporal profiles in different biological conditions but common cluster techniques are not able to analyze the data conditional on the biological conditions. Results This article presents a novel technique to cluster data from time course microarray experiments performed across several experimental conditions. Our algorithm uses polynomial models to describe the gene expression patterns over time, a full Bayesian approach with proper conjugate priors to make the algorithm invariant to linear transformations, and an iterative procedure to identify genes that have a common temporal expression profile across two or more experimental conditions, and genes that have a unique temporal profile in a specific condition. Conclusion We use simulated data to evaluate the effectiveness of this new algorithm in finding the correct number of clusters and in identifying genes with common and unique profiles. We also use the algorithm to characterize the response of human T cells to stimulations of antigen-receptor signaling gene expression temporal profiles measured in six different biological conditions and we identify common and unique genes. These studies suggest that the methodology proposed here is useful in identifying and distinguishing uniquely stimulated genes from commonly stimulated genes in response to variable stimuli. Software for using this clustering method is available from the project home page. PMID:18334028

  2. Identification of Common Differentially Expressed Genes in Urinary Bladder Cancer

    PubMed Central

    Zaravinos, Apostolos; Lambrou, George I.; Boulalas, Ioannis; Delakas, Dimitris; Spandidos, Demetrios A.

    2011-01-01

    Background Current diagnosis and treatment of urinary bladder cancer (BC) have shown great progress with the utilization of microarrays. Purpose Our goal was to identify common differentially expressed (DE) genes among clinically relevant subclasses of BC using microarrays. Methodology/Principal Findings BC samples and controls, both experimental and publicly available datasets, were analyzed by whole genome microarrays. We grouped the samples according to their histology and defined the DE genes in each sample individually, as well as in each tumor group. A dual analysis strategy was followed. First, experimental samples were analyzed and conclusions were formulated; and second, experimental sets were combined with publicly available microarray datasets and were further analyzed in search of common DE genes. The experimental dataset identified 831 genes that were DE in all tumor samples, simultaneously. Moreover, 33 genes were up-regulated and 85 genes were down-regulated in all 10 BC samples compared to the 5 normal tissues, simultaneously. Hierarchical clustering partitioned tumor groups in accordance with their histology. K-means clustering of all genes and all samples, as well as clustering of tumor groups, presented 49 clusters. K-means clustering of common DE genes in all samples revealed 24 clusters. Genes manifested various differential patterns of expression, based on PCA. YY1 and NFκB were among the most common transcription factors that regulated the expression of the identified DE genes. Chromosome 1 contained 32 DE genes, followed by chromosomes 2 and 11, which contained 25 and 23 DE genes, respectively. Chromosome 21 had the fewest DE genes. GO analysis revealed the prevalence of transport and binding genes in the common down-regulated DE genes; the prevalence of RNA metabolism and processing genes in the up-regulated DE genes; as well as the prevalence of genes responsible for cell communication and signal transduction in the DE genes that were down-regulated in T1-Grade III tumors and up-regulated in T2/T3-Grade III tumors. Combination of samples from all microarray platforms revealed 17 common DE genes (BMP4, CRYGD, DBH, GJB1, KRT83, MPZ, NHLH1, TACR3, ACTC1, MFAP4, SPARCL1, TAGLN, TPM2, CDC20, LHCGR, TM9SF1 and HCCS), 4 of which participate in numerous pathways. Conclusions/Significance The identification of the common DE genes among BC samples of different histology can provide further insight into the discovery of new putative markers. PMID:21483740

  3. Atlas of nonribosomal peptide and polyketide biosynthetic pathways reveals common occurrence of nonmodular enzymes.

    PubMed

    Wang, Hao; Fewer, David P; Holm, Liisa; Rouhiainen, Leo; Sivonen, Kaarina

    2014-06-24

    Nonribosomal peptides and polyketides are a diverse group of natural products with complex chemical structures and enormous pharmaceutical potential. They are synthesized on modular nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) enzyme complexes by a conserved thiotemplate mechanism. Here, we report the widespread occurrence of NRPS and PKS genetic machinery across the three domains of life with the discovery of 3,339 gene clusters from 991 organisms, by examining a total of 2,699 genomes. These gene clusters display extraordinarily diverse organizations, and a total of 1,147 hybrid NRPS/PKS clusters were found. Surprisingly, 10% of bacterial gene clusters lacked modular organization, and instead catalytic domains were mostly encoded as separate proteins. The finding of common occurrence of nonmodular NRPS differs substantially from the current classification. Sequence analysis indicates that the evolution of NRPS machineries was driven by a combination of common descent and horizontal gene transfer. We identified related siderophore NRPS gene clusters that encoded modular and nonmodular NRPS enzymes organized in a gradient. A higher frequency of the NRPS and PKS gene clusters was detected from bacteria compared with archaea or eukarya. They commonly occurred in the phyla of Proteobacteria, Actinobacteria, Firmicutes, and Cyanobacteria in bacteria and the phylum of Ascomycota in fungi. The majority of these NRPS and PKS gene clusters have unknown end products highlighting the power of genome mining in identifying novel genetic machinery for the biosynthesis of secondary metabolites.

  4. Ortholog-based screening and identification of genes related to intracellular survival.

    PubMed

    Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin

    2018-04-20

    Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. A total of 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of intracellular pathogen characteristics and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.

  5. From hormones to secondary metabolism: the emergence of metabolic gene clusters in plants.

    PubMed

    Chu, Hoi Yee; Wegel, Eva; Osbourn, Anne

    2011-04-01

    Gene clusters for the synthesis of secondary metabolites are a common feature of microbial genomes. Well-known examples include clusters for the synthesis of antibiotics in actinomycetes, and also for the synthesis of antibiotics and toxins in filamentous fungi. Until recently it was thought that genes for plant metabolic pathways were not clustered, and this is certainly true in many cases; however, five plant secondary metabolic gene clusters have now been discovered, all of them implicated in synthesis of defence compounds. An obvious assumption might be that these eukaryotic gene clusters have arisen by horizontal gene transfer from microbes, but there is compelling evidence to indicate that this is not the case. This raises intriguing questions about how widespread such clusters are, what the significance of clustering is, why genes for some metabolic pathways are clustered and those for others are not, and how these clusters form. In answering these questions we may hope to learn more about mechanisms of genome plasticity and adaptive evolution in plants. It is noteworthy that for the five plant secondary metabolic gene clusters reported so far, the enzymes for the first committed steps all appear to have been recruited directly or indirectly from primary metabolic pathways involved in hormone synthesis. This may or may not turn out to be a common feature of plant secondary metabolic gene clusters as new clusters emerge. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.

  6. Hox gene cluster of the ascidian, Halocynthia roretzi, reveals multiple ancient steps of cluster disintegration during ascidian evolution.

    PubMed

    Sekigami, Yuka; Kobayashi, Takuya; Omi, Ai; Nishitsuji, Koki; Ikuta, Tetsuro; Fujiyama, Asao; Satoh, Noriyuki; Saiga, Hidetoshi

    2017-01-01

    Hox gene clusters with at least 13 paralog group (PG) members are common in vertebrate genomes and in that of amphioxus. Ascidians, which belong to the subphylum Tunicata (Urochordata), are phylogenetically positioned between vertebrates and amphioxus, and traditionally divided into two groups: the Pleurogona and the Enterogona. An enterogonan ascidian, Ciona intestinalis (Ci), possesses nine Hox genes localized on two chromosomes; thus, the Hox gene cluster is disintegrated. We investigated the Hox gene cluster of a pleurogonan ascidian, Halocynthia roretzi (Hr), to determine whether Hox gene cluster disintegration is common among ascidians, and if so, how such disintegration occurred during ascidian or tunicate evolution. Our phylogenetic analysis reveals that the Hr Hox gene complement comprises nine members, including one with a relatively divergent Hox homeodomain sequence. Eight of nine Hr Hox genes were orthologous to Ci-Hox1, 2, 3, 4, 5, 10, 12 and 13. Following the phylogenetic classification into 13 PGs, we designated Hr Hox genes as Hox1, 2, 3, 4, 5, 10, 11/12/13.a, 11/12/13.b and HoxX. To address the chromosomal arrangement of the nine Hox genes, we performed two-color chromosomal fluorescent in situ hybridization, which revealed that the nine Hox genes are localized on a single chromosome in Hr, distinct from their arrangement in Ci. We further examined the order of the nine Hox genes on the chromosome by chromosome/scaffold walking. This analysis suggested a gene order of Hox1, 11/12/13.b, 11/12/13.a, 10, 5, X, followed by either Hox4, 3, 2 or Hox2, 3, 4 on the chromosome. Based on the present results and those previously reported in Ci, we discuss the establishment of the Hox gene complement and disintegration of Hox gene clusters during the course of ascidian or tunicate evolution. The Hox gene cluster and the genome must have experienced extensive reorganization during the course of evolution from the ancestral tunicate to Hr and Ci. Nevertheless, some features are shared in Hox gene components and gene arrangement on the chromosomes, suggesting that Hox gene cluster disintegration in ascidians involved early events common to tunicates as well as later ascidian lineage-specific events.

  7. The impact of polyploidy on the evolution of a complex NB-LRR resistance gene cluster in soybean

    USDA-ARS's Scientific Manuscript database

    A comparative genomics approach was used to investigate the evolution of a complex NB-LRR gene cluster found in soybean (Glycine max), common bean (Phaseolus vulgaris), and other legumes. In soybean, the cluster is associated with several disease resistance (R) genes of known function including Rpg1...

  8. Functional clustering of time series gene expression data by Granger causality

    PubMed Central

    2012-01-01

    Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
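
    The Granger-causality idea in the abstract above can be illustrated with a small sketch. This is not the authors' method: it only tests one pair of series with a lag of one time point, using an F statistic that compares an autoregressive model of the "effect" series with and without the lagged "cause" series. The function names (`solve_linear`, `rss`, `granger_f`, `simulate_pair`) and the simulated data are assumptions made for illustration.

```python
import random


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def rss(design, target):
    """Residual sum of squares of the least-squares fit of target on design."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(design, target)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, t in zip(design, target))


def granger_f(cause, effect):
    """F statistic (lag 1) for the hypothesis that `cause` helps predict `effect`."""
    target = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    full = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_restricted = rss(restricted, target)
    rss_full = rss(full, target)
    dof = len(target) - 3  # observations minus parameters of the full model
    return (rss_restricted - rss_full) / (rss_full / dof)


def simulate_pair(n=200, seed=0):
    """Hypothetical data: y follows x with a one-step delay plus small noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, n)]
    return x, y


x, y = simulate_pair()
print("F(x -> y):", granger_f(x, y))
print("F(y -> x):", granger_f(y, x))
```

    A large statistic in one direction and a small one in the other suggests a directed relationship. Clustering genes by such statistics, as the abstract proposes, would need the pairwise values assembled into a network first, which this sketch does not attempt.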

  9. Evolution of Chemical Diversity in Echinocandin Lipopeptide Antifungal Metabolites

    PubMed Central

    Yue, Qun; Chen, Li; Zhang, Xiaoling; Li, Kuan; Sun, Jingzu; Liu, Xingzhong

    2015-01-01

    The echinocandins are a class of antifungal drugs that includes caspofungin, micafungin, and anidulafungin. Gene clusters encoding most of the structural complexity of the echinocandins provided a framework for hypotheses about the evolutionary history and chemical logic of echinocandin biosynthesis. Gene orthologs among echinocandin-producing fungi were identified. Pathway genes, including the nonribosomal peptide synthetases (NRPSs), were analyzed phylogenetically to address the hypothesis that these pathways represent descent from a common ancestor. The clusters share cooperative gene contents and linkages among the different strains. Individual pathway genes analyzed in the context of similar genes formed unique echinocandin-exclusive phylogenetic lineages. The echinocandin NRPSs, along with the NRPS from the inp gene cluster in Aspergillus nidulans and its orthologs, comprise a novel lineage among fungal NRPSs. NRPS adenylation domains from different species exhibited a one-to-one correspondence between modules and amino acid specificity that is consistent with models of tandem duplication and subfunctionalization. Pathway gene trees and Ascomycota phylogenies are congruent and consistent with the hypothesis that the echinocandin gene clusters have a common origin. The disjunct Eurotiomycete-Leotiomycete distribution appears to be consistent with a scenario of vertical descent accompanied by incomplete lineage sorting and loss of the clusters from most lineages of the Ascomycota. We present evidence for a single evolutionary origin of the echinocandin family of gene clusters and a progression of structural diversification in two fungal classes that diverged approximately 290 to 390 million years ago. Lineage-specific gene cluster evolution driven by selection of new chemotypes contributed to diversification of the molecular functionalities. PMID:26024901

  10. Finding gene clusters for a replicated time course study

    PubMed Central

    2014-01-01

    Background Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies from complicated study designs such as replicated time course studies. Findings In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. Conclusions The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism. PMID:24460656

  11. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    PubMed Central

    2010-01-01

    Background Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data, and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Result We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082
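
    Two ingredients named in the abstract above, the adjusted Rand index and gene selection by high standard deviation, are simple enough to sketch. The code below is an illustration under assumptions, not the pipeline used in the study: the function names and the toy inputs are hypothetical, and the population standard deviation is used for ranking because the abstract does not say which estimator was applied.

```python
from collections import Counter
from math import comb
from statistics import pstdev


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions put everything in one class
    return (sum_cells - expected) / (max_index - expected)


def top_variable_genes(expression, k):
    """Return the k gene names with the highest standard deviation across samples.

    `expression` maps a gene name to its list of expression values.
    """
    return sorted(expression, key=lambda g: pstdev(expression[g]), reverse=True)[:k]


# Hypothetical example: cluster labels that match the known classes up to renaming.
known = [0, 0, 1, 1]
found = [1, 1, 0, 0]
print(adjusted_rand_index(known, found))
```

    The index does not depend on how clusters are named, only on which items are grouped together, which is why it suits comparisons against known classes such as cancer types.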

  12. Constrained clusters of gene expression profiles with pathological features.

    PubMed

    Sese, Jun; Kurokawa, Yukinori; Monden, Morito; Kato, Kikuya; Morishita, Shinichi

    2004-11-22

    Gene expression profiles should be useful in distinguishing variations in disease, since they reflect accurately the status of cells. The primary clustering of gene expression reveals the genotypes that are responsible for the proximity of members within each cluster, while further clustering elucidates the pathological features of the individual members of each cluster. However, since the first clustering process and the second classification step, in which the features are associated with clusters, are performed independently, the initial set of clusters may omit genes that are associated with pathologically meaningful features. Therefore, it is important to devise a way of identifying gene expression clusters that are associated with pathological features. We present the novel technique of 'itemset constrained clustering' (IC-Clustering), which computes the optimal cluster that maximizes the interclass variance of gene expression between groups, which are divided according to the restriction that only divisions that can be expressed using common features are allowed. This constraint automatically labels each cluster with a set of pathological features which characterize that cluster. When applied to liver cancer datasets, IC-Clustering revealed informative gene expression clusters, which could be annotated with various pathological features, such as 'tumor' and 'man', or 'except tumor' and 'normal liver function'. In contrast, the k-means method overlooked these clusters.
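
    The constraint described in the abstract above, that a split of the samples must be expressible as a set of shared features, can be sketched by brute force. This is an assumption-laden illustration rather than the IC-Clustering algorithm itself, which the abstract describes as computing the optimal cluster but does not specify further. The sketch enumerates small feature sets, splits the samples into those carrying every feature in the set and the rest, and scores each split by interclass variance. All names and the toy data are hypothetical.

```python
from itertools import combinations


def interclass_variance(inside, outside):
    """Sum over both groups of group size times squared distance of its mean from the overall mean."""
    everything = inside + outside
    dims = len(everything[0])

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

    overall = mean(everything)
    total = 0.0
    for group in (inside, outside):
        centre = mean(group)
        total += len(group) * sum((c - o) ** 2 for c, o in zip(centre, overall))
    return total


def best_itemset_split(samples, max_items=2):
    """Find the feature set whose induced split maximizes interclass variance.

    `samples` is a list of (feature_set, expression_vector) pairs.
    Returns (score, itemset), where itemset is a tuple of feature names.
    """
    features = sorted(set().union(*(f for f, _ in samples)))
    best = (float("-inf"), None)
    for size in range(1, max_items + 1):
        for itemset in combinations(features, size):
            required = set(itemset)
            inside = [v for f, v in samples if required <= f]
            outside = [v for f, v in samples if not required <= f]
            if not inside or not outside:
                continue  # a split needs two non-empty groups
            score = interclass_variance(inside, outside)
            if score > best[0]:
                best = (score, itemset)
    return best


# Hypothetical samples: expression differs by 'tumor' status, not by 'man'.
samples = [
    ({"tumor", "man"}, [5.0, 5.0]),
    ({"tumor"}, [5.0, 5.0]),
    ({"man"}, [0.0, 0.0]),
    (set(), [0.0, 0.0]),
]
print(best_itemset_split(samples))
```

    Because each candidate split is defined by a feature set, the winning cluster arrives already labeled, which mirrors the automatic annotation the abstract describes.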

  13. Pre-Bilaterian Origins of the Hox Cluster and the Hox Code: Evidence from the Sea Anemone, Nematostella vectensis

    PubMed Central

    Ryan, Joseph F.; Mazza, Maureen E.; Pang, Kevin; Matus, David Q.; Baxevanis, Andreas D.; Martindale, Mark Q.; Finnerty, John R.

    2007-01-01

    Background Hox genes were critical to many morphological innovations of bilaterian animals. However, early Hox evolution remains obscure. Phylogenetic, developmental, and genomic analyses on the cnidarian sea anemone Nematostella vectensis challenge recent claims that the Hox code is a bilaterian invention and that no “true” Hox genes exist in the phylum Cnidaria. Methodology/Principal Findings Phylogenetic analyses of 18 Hox-related genes from Nematostella identify putative Hox1, Hox2, and Hox9+ genes. Statistical comparisons among competing hypotheses bolster these findings, including an explicit consideration of the gene losses implied by alternate topologies. In situ hybridization studies of 20 Hox-related genes reveal that multiple Hox genes are expressed in distinct regions along the primary body axis, supporting the existence of a pre-bilaterian Hox code. Additionally, several Hox genes are expressed in nested domains along the secondary body axis, suggesting a role in “dorsoventral” patterning. Conclusions/Significance A cluster of anterior and posterior Hox genes, as well as ParaHox cluster of genes evolved prior to the cnidarian-bilaterian split. There is evidence to suggest that these clusters were formed from a series of tandem gene duplication events and played a role in patterning both the primary and secondary body axes in a bilaterally symmetrical common ancestor. Cnidarians and bilaterians shared a common ancestor some 570 to 700 million years ago, and as such, are derived from a common body plan. Our work reveals several conserved genetic components that are found in both of these diverse lineages. This finding is consistent with the hypothesis that a set of developmental rules established in the common ancestor of cnidarians and bilaterians is still at work today. PMID:17252055

  14. Computational gene expression profiling under salt stress reveals patterns of co-expression

    PubMed Central

    Sanchita; Sharma, Ashok

    2016-01-01

    Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken forward for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411

  15. Transcriptome profiling analysis reveals biomarkers in colon cancer samples of various differentiation

    PubMed Central

    Yu, Tonghu; Zhang, Huaping; Qi, Hong

    2018-01-01

    The aim of the present study was to investigate more colon cancer-related genes in different stages. Gene expression profile E-GEOD-62932 was extracted for differentially expressed gene (DEG) screening. Series test of cluster analysis was used to obtain significant trending models. Based on the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases, functional and pathway enrichment analysis were processed and a pathway relation network was constructed. Gene co-expression network and gene signal network were constructed for common DEGs. The DEGs with the same trend were clustered and in total, 16 clusters with statistical significance were obtained. The screened DEGs were enriched into small molecule metabolic process and metabolic pathways. The pathway relation network was constructed with 57 nodes. A total of 328 common DEGs were obtained. Gene signal network was constructed with 71 nodes. Gene co-expression network was constructed with 161 nodes and 211 edges. ABCD3, CPT2, AGL and JAM2 are potential biomarkers for the diagnosis of colon cancer. PMID:29928385

  16. Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species.

    PubMed

    Walker, Michael B; King, Benjamin L; Paigen, Kenneth

    2012-01-01

    Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.

  17. CYP76M7 Is an ent-Cassadiene C11α-Hydroxylase Defining a Second Multifunctional Diterpenoid Biosynthetic Gene Cluster in Rice

    PubMed Central

    Swaminathan, Sivakumar; Morrone, Dana; Wang, Qiang; Fulton, D. Bruce; Peters, Reuben J.

    2009-01-01

    Biosynthetic gene clusters are common in microbial organisms, but rare in plants, raising questions regarding the evolutionary forces that drive their assembly in multicellular eukaryotes. Here, we characterize the biochemical function of a rice (Oryza sativa) cytochrome P450 monooxygenase, CYP76M7, which seems to act in the production of antifungal phytocassanes and defines a second diterpenoid biosynthetic gene cluster in rice. This cluster is uniquely multifunctional, containing enzymatic genes involved in the production of two distinct sets of phytoalexins, the antifungal phytocassanes and antibacterial oryzalides/oryzadiones, with the corresponding genes being subject to distinct transcriptional regulation. The lack of uniform coregulation of the genes within this multifunctional cluster suggests that this was not a primary driving force in its assembly. However, the cluster is dedicated to specialized metabolism, as all genes in the cluster are involved in phytoalexin metabolism. We hypothesize that this dedication to specialized metabolism led to the assembly of the corresponding biosynthetic gene cluster. Consistent with this hypothesis, molecular phylogenetic comparison demonstrates that the two rice diterpenoid biosynthetic gene clusters have undergone independent elaboration to their present-day forms, indicating continued evolutionary pressure for coclustering of enzymatic genes encoding components of related biosynthetic pathways. PMID:19825834

  18. Genetic analysis of the resistance to eight anthracnose races in the common bean differential cultivar Kaboon.

    PubMed

    Campa, Ana; Giraldez, Ramón; Ferreira, Juan José

    2011-06-01

    Resistance to the eight races (3, 7, 19, 31, 81, 449, 453, and 1545) of the pathogenic fungus Colletotrichum lindemuthianum (anthracnose) was evaluated in F(3) families derived from the cross between the anthracnose differential bean cultivars Kaboon and Michelite. Molecular marker analyses were carried out in the F(2) individuals in order to map and characterize the anthracnose resistance genes or gene clusters present in Kaboon. The analysis of the combined segregations indicates that the resistance present in Kaboon against these eight anthracnose races is determined by 13 different race-specific genes grouped in three clusters. One of these clusters, corresponding to locus Co-1 in linkage group (LG) 1, carries two dominant genes conferring specific resistance to races 81 and 1545, respectively, and a gene necessary (dominant complementary gene) for the specific resistance to race 31. A second cluster, corresponding to locus Co-3/9 in LG 4, carries six dominant genes conferring specific resistance to races 3, 7, 19, 449, 453, and 1545, respectively, and the second dominant complementary gene for the specific resistance to race 31. A third cluster of unknown location carries three dominant genes conferring specific resistance to races 449, 453, and 1545, respectively. This is the first time that two anthracnose resistance genes with a complementary mode of action have been mapped in common bean and their relationship with previously known Co- resistance genes established.

  19. Transcriptome Analysis of Aspergillus flavus Reveals veA-Dependent Regulation of Secondary Metabolite Gene Clusters, Including the Novel Aflavarin Cluster

    PubMed Central

    Cary, J. W.; Han, Z.; Yin, Y.; Lohmar, J. M.; Shantappa, S.; Harris-Coward, P. Y.; Mack, B.; Ehrlich, K. C.; Wei, Q.; Arroyo-Manzanares, N.; Uka, V.; Vanhaecke, L.; Bhatnagar, D.; Yu, J.; Nierman, W. C.; Johns, M. A.; Sorensen, D.; Shen, H.; De Saeger, S.; Diana Di Mavungu, J.

    2015-01-01

    The global regulatory veA gene governs development and secondary metabolism in numerous fungal species, including Aspergillus flavus. This is especially relevant since A. flavus infects crops of agricultural importance worldwide, contaminating them with potent mycotoxins. The most well-known are aflatoxins, which are cytotoxic and carcinogenic polyketide compounds. The production of aflatoxins and the expression of genes implicated in the production of these mycotoxins are veA dependent. The genes responsible for the synthesis of aflatoxins are clustered, a signature common for genes involved in fungal secondary metabolism. Studies of the A. flavus genome revealed many gene clusters possibly connected to the synthesis of secondary metabolites. Many of these metabolites are still unknown, or the association between a known metabolite and a particular gene cluster has not yet been established. In the present transcriptome study, we show that veA is necessary for the expression of a large number of genes. Twenty-eight out of the predicted 56 secondary metabolite gene clusters include at least one gene that is differentially expressed depending on presence or absence of veA. One of the clusters under the influence of veA is cluster 39. The absence of veA results in a downregulation of the five genes found within this cluster. Interestingly, our results indicate that the cluster is expressed mainly in sclerotia. Chemical analysis of sclerotial extracts revealed that cluster 39 is responsible for the production of aflavarin. PMID:26209694

  20. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms

    PubMed Central

    Esplin, M Sean; Manuck, Tracy A.; Varner, Michael W.; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M.; Ilekis, John

    2015-01-01

    Objective We sought to employ an innovative tool based on common biological pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB), in order to enhance investigators' ability to identify and highlight common mechanisms and underlying genetic factors responsible for SPTB. Study Design A secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks gestation. Each woman was assessed for the presence of underlying SPTB etiologies. A hierarchical cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis using VEGAS software. Results 1028 women with SPTB were assigned phenotypes. Hierarchical clustering of the phenotypes revealed five major clusters. Cluster 1 (N=445) was characterized by maternal stress, cluster 2 (N=294) by premature membrane rupture, cluster 3 (N=120) by familial factors, and cluster 4 (N=63) by maternal comorbidities. Cluster 5 (N=106) was multifactorial, characterized by infection (INF), decidual hemorrhage (DH) and placental dysfunction (PD). These three phenotypes were highly correlated by Chi-square analysis [PD and DH (p<2.2e-6); PD and INF (p=6.2e-10); INF and DH (p=0.0036)]. Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. Conclusion We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors underlying SPTB. PMID:26070700

  1. Using SNP genetic markers to elucidate the linkage of the Co-34/Phg-3 anthracnose and angular leaf spot resistance gene cluster with the Ur-14 resistance gene

    USDA-ARS's Scientific Manuscript database

    The Ouro Negro common bean cultivar contains the Co-34/Phg-3 gene cluster that confers resistance to the anthracnose (ANT) and angular leaf spot (ALS) pathogens. These genes are tightly linked on chromosome 4. Ouro Negro also has the Ur-14 rust resistance gene, reportedly in the vicinity of Co-34; ...

  2. Mining subspace clusters from DNA microarray data using large itemset techniques.

    PubMed

    Chang, Ye-In; Chen, Jiun-Rung; Tsai, Yueh-Chi

    2009-05-01

    Mining subspace clusters from DNA microarrays could help researchers identify genes that commonly contribute to a disease, where a subspace cluster indicates a subset of genes whose expression levels are similar under a subset of conditions. Since the number of genes in a DNA microarray is far larger than the number of conditions, previously proposed algorithms that compute the maximum dimension sets (MDSs) for any two genes take a long time to mine subspace clusters. In this article, we propose the Large Itemset-Based Clustering (LISC) algorithm for mining subspace clusters. Instead of constructing MDSs for any two genes, we construct MDSs only for any two conditions. Then, we transform the task of finding the maximal possible gene sets into the problem of mining large itemsets from the condition-pair MDSs. Since we are only interested in subspace clusters with gene sets as large as possible, it is desirable to pay attention to gene sets that have reasonably large support values in the condition-pair MDSs. Our simulation results show that the proposed algorithm needs less processing time than the previously proposed algorithms that need to construct gene-pair MDSs.

  3. A tripartite clustering analysis on microRNA, gene and disease model.

    PubMed

    Shen, Chengcheng; Liu, Ying

    2012-02-01

    Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expressions of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. Tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members and pass the hidden relational information along the relation chain by considering multi-type members. Here we report a spectral co-clustering algorithm for k-partite graph to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings.

  4. Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values.

    PubMed

    Bhattacharya, Anindya; De, Rajat K

    2010-08-01

    Distance-based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have a similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. However, this algorithm may also fail in certain cases. To overcome these situations, we propose a new clustering algorithm, called average correlation clustering algorithm (ACCA), which is able to produce a better clustering solution than those produced by some other algorithms. ACCA is able to find groups of genes having more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to the time of execution. As in DCCA, we use the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that all genes in a cluster have the highest average correlation values with the genes in that cluster. We have applied ACCA and some well-known conventional methods including DCCA to two artificial and nine gene expression datasets, and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some others in determining a group of genes having more common transcription factors and a similar pattern of variation in their expression profiles. Availability of the software: The software has been developed using C and Visual Basic languages, and can be executed on the Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/~rajat. Then it needs to be installed. Two word files (included in the zip file) need to be consulted before installation and execution of the software. Copyright 2010 Elsevier Inc. All rights reserved.
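
    The assignment rule described in this abstract (each gene ends up in the cluster with which it has the highest average correlation) can be illustrated with a short sketch. This is not the authors' C/Visual Basic implementation. The function names, the use of Pearson correlation, the exclusion of a gene from its own cluster average, and the rule that a gene alone in its cluster stays put are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return sxy / sqrt(sxx * syy)


def average_correlation_clustering(profiles, labels, max_iter=100):
    """Hypothetical sketch: move each gene to the cluster with which it has
    the highest average correlation, until no gene moves."""
    n = len(profiles)
    corr = [[pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    labels = list(labels)
    for _ in range(max_iter):
        changed = False
        for i in range(n):
            # Average correlation of gene i with every cluster, excluding itself.
            avgs = {}
            for c in set(labels):
                members = [j for j in range(n) if j != i and labels[j] == c]
                if members:
                    avgs[c] = sum(corr[i][j] for j in members) / len(members)
            if labels[i] not in avgs:
                continue  # assumption: a gene alone in its cluster stays there
            best = max(avgs, key=avgs.get)
            if avgs[best] > avgs[labels[i]]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```

    Starting from a deliberately poor initial labelling, two rising and two falling profiles are separated into two groups, even though their absolute expression values differ.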

  5. An ergot alkaloid biosynthesis gene and clustered hypothetical genes from Aspergillus fumigatus.

    PubMed

    Coyle, Christine M; Panaccione, Daniel G

    2005-06-01

    The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin.

  6. An Ergot Alkaloid Biosynthesis Gene and Clustered Hypothetical Genes from Aspergillus fumigatus

    PubMed Central

    Coyle, Christine M.; Panaccione, Daniel G.

    2005-01-01

    The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin. PMID:15933009

  7. Coral comparative genomics reveal expanded Hox cluster in the cnidarian-bilaterian ancestor.

    PubMed

    DuBuc, Timothy Q; Ryan, Joseph F; Shinzato, Chuya; Satoh, Nori; Martindale, Mark Q

    2012-12-01

    The key developmental role of the Hox cluster of genes was established prior to the last common ancestor of protostomes and deuterostomes and the subsequent evolution of this cluster has played a major role in the morphological diversity exhibited in extant bilaterians. Despite 20 years of research into cnidarian Hox genes, the nature of the cnidarian-bilaterian ancestral Hox cluster remains unclear. In an attempt to further elucidate this critical phylogenetic node, we have characterized the Hox cluster of the recently sequenced Acropora digitifera genome. The A. digitifera genome contains two anterior Hox genes (PG1 and PG2) linked to an Eve homeobox gene and an Anthox1A gene, which is thought to be either a posterior or posterior/central Hox gene. These data show that the Hox cluster of the cnidarian-bilaterian ancestor was more extensive than previously thought. The results are congruent with the existence of an ancient set of constraints on the Hox cluster and reinforce the importance of incorporating a wide range of animal species to reconstruct critical ancestral nodes.

  8. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms.

    PubMed

    Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John

    2015-09-01

    We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ(2) analysis (PD and DH, P < 2.2e-6; PD and INF, P = 6.2e-10; INF and DH, P = .0036). Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarchic clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.
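
    The pairwise χ(2) comparisons between phenotypes reported above can be illustrated with a small worked example. The sketch below computes the Pearson chi-square statistic and its P value for a 2x2 presence/absence table. The function name and the counts are hypothetical, the abstract does not give the underlying tables, and no continuity correction is applied, which may differ from the authors' analysis.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]; returns (statistic, p_value)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: rows = phenotype A present/absent,
# columns = phenotype B present/absent.
stat, p = chi_square_2x2(10, 20, 30, 40)
```

    A table in which the two phenotypes are independent, such as `chi_square_2x2(10, 10, 10, 10)`, gives a statistic of 0 and a P value of 1.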

  9. Fast gene ontology based clustering for microarray experiments.

    PubMed

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  10. Clustering Genes of Common Evolutionary History

    PubMed Central

    Gori, Kevin; Suchan, Tomasz; Alvarez, Nadir; Goldman, Nick; Dessimoz, Christophe

    2016-01-01

    Phylogenetic inference can potentially result in a more accurate tree using data from multiple loci. However, if the loci are incongruent—due to events such as incomplete lineage sorting or horizontal gene transfer—it can be misleading to infer a single tree. To address this, many previous contributions have taken a mechanistic approach, by modeling specific processes. Alternatively, one can cluster loci without assuming how these incongruencies might arise. Such “process-agnostic” approaches typically infer a tree for each locus and cluster these. There are, however, many possible combinations of tree distance and clustering methods; their comparative performance in the context of tree incongruence is largely unknown. Furthermore, because standard model selection criteria such as AIC cannot be applied to problems with a variable number of topologies, the issue of inferring the optimal number of clusters is poorly understood. Here, we perform a large-scale simulation study of phylogenetic distances and clustering methods to infer loci of common evolutionary history. We observe that the best-performing combinations are distances accounting for branch lengths followed by spectral clustering or Ward’s method. We also introduce two statistical tests to infer the optimal number of clusters and show that they strongly outperform the silhouette criterion, a general-purpose heuristic. We illustrate the usefulness of the approach by 1) identifying errors in a previous phylogenetic analysis of yeast species and 2) identifying topological incongruence among newly sequenced loci of the globeflower fly genus Chiastocheta. We release treeCl, a new program to cluster genes of common evolutionary history (http://git.io/treeCl). PMID:26893301
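
    The silhouette criterion mentioned above as a general-purpose baseline can be computed directly from a precomputed distance matrix, which is the form a set of pairwise tree distances would take. The following is a minimal sketch of that heuristic only. It is not part of treeCl, it does not implement the two statistical tests the authors introduce, and the function name and distance values are invented for illustration.

```python
def silhouette_score(dist, labels):
    """Mean silhouette width for a clustering, given a symmetric matrix of
    pairwise distances (e.g. distances between per-locus trees)."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            continue  # by convention a singleton contributes 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != c)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


# Hypothetical distances between four loci: two tight pairs far apart.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]
good = silhouette_score(dist, [0, 0, 1, 1])
bad = silhouette_score(dist, [0, 1, 0, 1])
```

    Choosing the number of clusters with this heuristic means repeating the clustering for several candidate values and keeping the one with the highest score; here the grouping of the two tight pairs scores higher than the mixed grouping.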

  11. Use of keyword hierarchies to interpret gene expression patterns.

    PubMed

    Masys, D R; Welsh, J B; Lynn Fink, J; Gribskov, M; Klacansky, I; Corbeil, J

    2001-04-01

    High-density microarray technology permits the quantitative and simultaneous monitoring of thousands of genes. The interpretation challenge is to extract relevant information from this large amount of data. A growing variety of statistical analysis approaches are available to identify clusters of genes that share common expression characteristics, but provide no information regarding the biological similarities of genes within clusters. The published literature provides a potential source of information to assist in interpretation of clustering results. We describe a data mining method that uses indexing terms ('keywords') from the published literature linked to specific genes to present a view of the conceptual similarity of genes within a cluster or group of interest. The method takes advantage of the hierarchical nature of Medical Subject Headings used to index citations in the MEDLINE database, and the registry numbers applied to enzymes.

  12. Genome Neighborhood Network Reveals Insights into Enediyne Biosynthesis and Facilitates Prediction and Prioritization for Discovery

    PubMed Central

    Rudolf, Jeffrey D.; Yan, Xiaohui; Shen, Ben

    2015-01-01

    The enediynes are one of the most fascinating families of bacterial natural products given their unprecedented molecular architecture and extraordinary cytotoxicity. Enediynes are rare with only 11 structurally characterized members and four additional members isolated in their cycloaromatized form. Recent advances in DNA sequencing have resulted in an explosion of microbial genomes. A virtual survey of the GenBank and JGI genome databases revealed 87 enediyne biosynthetic gene clusters from 78 bacteria strains, implying enediynes are more common than previously thought. Here we report the construction and analysis of an enediyne genome neighborhood network (GNN) as a high-throughput approach to analyze secondary metabolite gene clusters. Analysis of the enediyne GNN facilitated rapid gene cluster annotation, revealed genetic trends in enediyne biosynthetic gene clusters resulting in a simple prediction scheme to determine 9- vs 10-membered enediyne gene clusters, and supported a genomic-based strain prioritization method for enediyne discovery. PMID:26318027

  13. Many nonuniversal archaeal ribosomal proteins are found in conserved gene clusters

    PubMed Central

    WANG, JIACHEN; DASGUPTA, INDRANI; FOX, GEORGE E.

    2009-01-01

    The genomic associations of the archaeal ribosomal proteins (r-proteins) were examined in detail. The archaeal versions of the universal r-protein genes are typically in clusters similar or identical to those found in bacteria. Of the 35 nonuniversal archaeal r-protein genes examined, the gene encoding L18e was found to be associated with the conserved L13 cluster, whereas the genes for S4e, L32e and L19e were found in the archaeal version of the spc operon. Eleven nonuniversal protein genes were not associated with any common genomic context. Of the remaining 19 protein genes, 17 were convincingly assigned to one of 10 previously unrecognized gene clusters. Examination of the gene content of these clusters revealed multiple associations with genes involved in the initiation of protein synthesis, transcription or other cellular processes. The lack of such associations in the universal clusters suggests that initially the ribosome evolved largely independently of other processes. More recently it likely has evolved in concert with other cellular systems. It was also verified that a second copy of the gene encoding L7ae found in some bacteria is actually a homolog of the gene encoding L30e and should be annotated as such. PMID:19478915

  14. From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

    2000-01-01

    We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.

  15. Conditions for the Evolution of Gene Clusters in Bacterial Genomes

    PubMed Central

    Ballouz, Sara; Francis, Andrew R.; Lan, Ruiting; Tanaka, Mark M.

    2010-01-01

    Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters. PMID:20168992
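
    The second model in this abstract, a Moran process with selection for clustering and rearrangement by inversion, can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' model: genomes are linear rather than circular, fitness is 1 + s when all pathway genes are adjacent and 1 otherwise, and the names `is_clustered`, `invert`, `moran_step`, `s` and `mu` are all hypothetical.

```python
import random


def is_clustered(genome, pathway):
    """True if all pathway genes occupy adjacent positions in the genome."""
    positions = sorted(genome.index(g) for g in pathway)
    return positions[-1] - positions[0] == len(pathway) - 1


def invert(genome, rng):
    """Return a copy of the genome with one random segment reversed."""
    i, j = sorted(rng.sample(range(len(genome) + 1), 2))
    return genome[:i] + genome[i:j][::-1] + genome[j:]


def moran_step(population, pathway, s, mu, rng):
    """One Moran event: a parent chosen in proportion to fitness reproduces,
    the offspring is inverted with probability mu, and it replaces a
    uniformly chosen individual, so the population size stays constant."""
    weights = [1.0 + s if is_clustered(g, pathway) else 1.0
               for g in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    child = invert(parent, rng) if rng.random() < mu else list(parent)
    population[rng.randrange(len(population))] = child
```

    Running `moran_step` repeatedly on a population of shuffled gene orders and recording the fraction of individuals for which `is_clustered` is true gives a simple way to explore how selection strength, inversion rate, pathway size and population size affect whether clusters spread.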

  16. Genomic and evolutionary comparisons of diazotrophic and pathogenic bacteria of the order Rhizobiales.

    PubMed

    Carvalho, Fabíola M; Souza, Rangel C; Barcellos, Fernando G; Hungria, Mariangela; Vasconcelos, Ana Tereza R

    2010-02-08

    Species belonging to the Rhizobiales are intriguing and extensively researched because they include both bacteria able to fix nitrogen when in symbiosis with leguminous plants and bacteria pathogenic to animals and plants. Similarities between the strategies adopted by pathogenic and symbiotic Rhizobiales have been described, as well as high variability related to events of horizontal gene transfer. Although it is well known that chromosomal rearrangements, mutations and horizontal gene transfer influence the dynamics of bacterial genomes, in Rhizobiales the scenarios that determine a pathogenic or symbiotic lifestyle are not clear, and there are very few comparative genomic studies of these classes of prokaryotic microorganisms that try to delineate the evolution of symbiosis and pathogenesis. Non-symbiotic nitrogen-fixing bacteria and bacteria involved in bioremediation that are closely related to the symbionts and pathogens under study may help to clarify the origin and ancestry of genes and the gene flow occurring in Rhizobiales. The genomic comparisons of 19 species of Rhizobiales, including nitrogen-fixing bacteria, bioremediators and pathogens, resulted in 33 common clusters related to biological nitrogen fixation and pathogenesis, 15 clusters exclusive to all nitrogen-fixing bacteria and bacteria involved in bioremediation, 13 clusters found in only some nitrogen-fixing and bioremediation bacteria, 1 cluster exclusive to some symbionts, and 1 cluster found only in some pathogens analyzed. In the BBH analysis performed on all strains studied, 77 common genes were obtained, 17 of which were related to biological nitrogen fixation and pathogenesis. Phylogenetic reconstructions for Fix, Nif, Nod, Vir, and Trb showed possible horizontal gene transfer events, grouping species of different phenotypes. The presence of symbiotic and virulence genes in both pathogens and symbionts does not seem to be the only determinant factor for lifestyle evolution in these microorganisms, although they may act in common stages of host infection. The phylogenetic analysis for many distinct operons involved in these processes emphasizes the relevance of horizontal gene transfer events in the symbiotic and pathogenic similarity.

  17. A Nomadic Subtelomeric Disease Resistance Gene Cluster in Common Bean

    PubMed Central

    David, Perrine; Chen, Nicolas W.G.; Pedrosa-Harand, Andrea; Thareau, Vincent; Sévignac, Mireille; Cannon, Steven B.; Debouck, Daniel; Langin, Thierry; Geffroy, Valérie

    2009-01-01

    The B4 resistance (R) gene cluster is one of the largest clusters known in common bean (Phaseolus vulgaris [Pv]). It is located in a peculiar genomic environment in the subtelomeric region of the short arm of chromosome 4, adjacent to two heterochromatic blocks (knobs). We sequenced 650 kb spanning this locus and annotated 97 genes, 26 of which correspond to Coiled-Coil-Nucleotide-Binding-Site-Leucine-Rich-Repeat (CNL). Conserved microsynteny was observed between the Pv B4 locus and corresponding regions of Medicago truncatula and Lotus japonicus in chromosomes Mt6 and Lj2, respectively. The notable exception was the CNL sequences, which were completely absent in these regions. The origin of the Pv B4-CNL sequences was investigated through phylogenetic analysis, which reveals that, in the Pv genome, paralogous CNL genes are shared among nonhomologous chromosomes (4 and 11). Together, our results suggest that Pv B4-CNL was derived from CNL sequences from another cluster, the Co-2 cluster, through an ectopic recombination event. Integration of the soybean (Glycine max) genome data enables us to date more precisely this event and also to infer that a single CNL moved from the Co-2 to the B4 cluster. Moreover, we identified a new 528-bp satellite repeat, referred to as khipu, specific to the Phaseolus genus, present both between B4-CNL sequences and in the two knobs identified at the B4 R gene cluster. The khipu repeat is present on most chromosomal termini, indicating the existence of frequent ectopic recombination events in Pv subtelomeric regions. Our results highlight the importance of ectopic recombination in R gene evolution. PMID:19776165

  18. Organization of the Escherichia coli K-12 gene cluster responsible for production of the extracellular polysaccharide colanic acid.

    PubMed Central

    Stevenson, G; Andrianopoulos, K; Hobbs, M; Reeves, P R

    1996-01-01

    Colanic acid (CA) is an extracellular polysaccharide produced by most Escherichia coli strains as well as by other species of the family Enterobacteriaceae. We have determined the sequence of a 23-kb segment of the E. coli K-12 chromosome which includes the cluster of genes necessary for production of CA. The CA cluster comprises 19 genes. Two other sequenced genes (orf1.3 and galF), which are situated between the CA cluster and the O-antigen cluster, were shown to be unnecessary for CA production. The CA cluster includes genes for synthesis of GDP-L-fucose, one of the precursors of CA, and the gene for one of the enzymes in this pathway (GDP-D-mannose 4,6-dehydratase) was identified by biochemical assay. Six of the inferred proteins show sequence similarity to glycosyl transferases, and two others have sequence similarity to acetyl transferases. Another gene (wzx) is predicted to encode a protein with multiple transmembrane segments and may function in export of the CA repeat unit from the cytoplasm into the periplasm in a process analogous to O-unit export. The first three genes of the cluster are predicted to encode an outer membrane lipoprotein, a phosphatase, and an inner membrane protein with an ATP-binding domain. Since homologs of these genes are found in other extracellular polysaccharide gene clusters, they may have a common function, such as export of polysaccharide from the cell. PMID:8759852

  19. Cancer Detection in Microarray Data Using a Modified Cat Swarm Optimization Clustering Approach

    PubMed

    M, Pandi; R, Balamurugan; N, Sadhasivam

    2017-12-29

    Objective: A better understanding of functional genomics can be obtained by extracting patterns hidden in gene expression data. This could have paramount implications for cancer diagnosis, gene treatments and other domains. Clustering may reveal natural structures and identify interesting patterns in underlying data. The main objective of this research was to derive a heuristic approach to detection of highly co-expressed genes related to cancer from gene expression data with minimum Mean Squared Error (MSE). Methods: A modified CSO algorithm using Harmony Search (MCSO-HS) for clustering cancer gene expression data was applied. Experimental results were analyzed using two cancer gene expression benchmark datasets, namely for leukaemia and for breast cancer. Result: The results indicated MCSO-HS to be better than HS and CSO by 13% and 9%, respectively, with the leukaemia dataset. For the breast cancer dataset, the improvement was 22% and 17%, respectively, in terms of MSE. Conclusion: The results showed MCSO-HS to outperform HS and CSO with both benchmark datasets. To validate the clustering results, this work was tested with internal and external cluster validation indices. This work also points to biological validation of clusters with gene ontology in terms of function, process and component. Creative Commons Attribution License
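As a rough illustration of the MSE criterion mentioned in the abstract above, the sketch below scores a clustering by the mean squared Euclidean distance between each expression profile and the centroid of its assigned cluster. This is one common definition and is an assumption here, since the abstract does not state the exact formula; the function name `clustering_mse` and the toy values are hypothetical.

```python
from collections import defaultdict


def clustering_mse(profiles, labels):
    """Mean squared Euclidean distance from each profile to its cluster centroid."""
    # Group the profiles by their assigned cluster label.
    groups = defaultdict(list)
    for profile, label in zip(profiles, labels):
        groups[label].append(profile)

    # Centroid of a cluster = per-dimension mean of its member profiles.
    centroids = {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in groups.items()
    }

    # Sum the squared distances to the assigned centroid, then average.
    total = 0.0
    for profile, label in zip(profiles, labels):
        centroid = centroids[label]
        total += sum((x - c) ** 2 for x, c in zip(profile, centroid))
    return total / len(profiles)


# Toy expression profiles (hypothetical values) forming two tight groups.
profiles = [[1.0, 2.0], [1.0, 4.0], [5.0, 6.0], [7.0, 6.0]]

two_clusters = clustering_mse(profiles, [0, 0, 1, 1])
one_cluster = clustering_mse(profiles, [0, 0, 0, 0])
print(two_clusters, one_cluster)
```

A lower value means the profiles sit closer to their cluster centroids, so a search heuristic of the kind described would prefer the two-cluster assignment over the single-cluster one in this toy case.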

  20. Structure-related clustering of gene expression fingerprints of thp-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

    PubMed

    Wan, B; Yarbrough, J W; Schultz, T W

    2008-01-01

    This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 µM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes being modulated by tested chemicals with a variety of biological processes, such as cell cycle, metabolism, and protein binding and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.

  1. A Zn(II)2Cys6 DNA binding protein regulates the sirodesmin PL biosynthetic gene cluster in Leptosphaeria maculans

    PubMed Central

    Fox, Ellen M.; Gardiner, Donald M.; Keller, Nancy P.; Howlett, Barbara J.

    2008-01-01

    A gene, sirZ, encoding a Zn(II)2Cys6 DNA binding protein is present in a cluster of genes responsible for the biosynthesis of the epipolythiodioxopiperazine (ETP) toxin, sirodesmin PL in the ascomycete plant pathogen, Leptosphaeria maculans. RNA-mediated silencing of sirZ gives rise to transformants that produce only residual amounts of sirodesmin PL and display a decrease in the transcription of several sirodesmin PL biosynthetic genes. This indicates that SirZ is a major regulator of this gene cluster. Proteins similar to SirZ are encoded in the gliotoxin biosynthetic gene cluster of Aspergillus fumigatus (gliZ) and in an ETP-like cluster in Penicillium lilacinoechinulatum (PlgliZ). Despite its high level of sequence similarity to gliZ, PlgliZ is unable to complement the gliotoxin-deficiency of a mutant of gliZ in A. fumigatus. Putative binding sites for these regulatory proteins in the promoters of genes in these clusters were predicted using bioinformatic analysis. These sites are similar to those commonly bound by other proteins with Zn(II)2Cys6 DNA binding domains. PMID:18023597

  2. Clavine Alkaloids Gene Clusters of Penicillium and Related Fungi: Evolutionary Combination of Prenyltransferases, Monooxygenases and Dioxygenases

    PubMed Central

    Martín, Juan F.; Liras, Paloma

    2017-01-01

    The clavine alkaloids produced by the fungi of the Aspergillaceae and Arthrodermataceae families differ from the ergot alkaloids produced by Claviceps and Neotyphodium. The clavine alkaloids lack the extensive peptide chain modifications that occur in lysergic acid derived ergot alkaloids. Both clavine and ergot alkaloids arise from the condensation of tryptophan and dimethylallylpyrophosphate by the action of the dimethylallyltryptophan synthase. The first five steps of the biosynthetic pathway that convert tryptophan and dimethylallyl-pyrophosphate (DMA-PP) into chanoclavine-1-aldehyde are common to both clavine and ergot alkaloids. The biosynthesis of ergot alkaloids has been extensively studied and is not considered in this article. We focus this review on recent advances in the gene clusters for clavine alkaloids in the species of Penicillium, Aspergillus (Neosartorya), Arthroderma and Trichophyton and the enzymes encoded by them. The final products of the clavine alkaloids pathways derive from the tetracyclic ergoline ring, which is modified by late enzymes, including a reverse type prenyltransferase, P450 monooxygenases and acetyltransferases. In Aspergillus japonicus, an α-ketoglutarate and Fe2+-dependent dioxygenase is involved in the cyclization of a festuclavine-like unknown type intermediate into cycloclavine. Related dioxygenases occur in the biosynthetic gene clusters of ergot alkaloids in Claviceps purpurea and also in the clavine clusters in Penicillium species. The final products of the clavine alkaloid pathway in these fungi differ from each other depending on the late biosynthetic enzymes involved. An important difference between clavine and ergot alkaloid pathways is that clavine producers lack the enzyme CloA, a P450 monooxygenase, involved in one of the steps of the conversion of chanoclavine-1-aldehyde into lysergic acid. Bioinformatic analysis of the sequenced genomes of the Aspergillaceae and Arthrodermataceae fungi showed the presence of clavine gene clusters in Arthroderma species, Penicillium roqueforti, Penicillium commune, Penicillium camemberti, Penicillium expansum, Penicillium steckii and Penicillium griseofulvum. Analysis of the gene clusters in several clavine alkaloid producers indicates that there are gene gains, gene losses and gene rearrangements. These findings may be explained by a divergent evolution of the gene clusters of ergot and clavine alkaloids from a common ancestral progenitor six-gene cluster, although horizontal gene transfer of some specific genes may have occurred more recently. PMID:29186777

  3. Genetic Changes Accompanying the Domestication of Pisum sativum: Is there a Common Genetic Basis to the ‘Domestication Syndrome’ for Legumes?

    PubMed Central

    Weeden, Norman F.

    2007-01-01

    Background and Aims The changes that occur during the domestication of crops such as maize and common bean appear to be controlled by relatively few genes. This study investigates the genetic basis of domestication in pea (Pisum sativum) and compares the genes involved with those determined to be important in common bean domestication. Methods Quantitative trait loci and classical genetic analysis are used to investigate and identify the genes modified at three stages of the domestication process. Five recombinant inbred populations involving crosses between different lines representing different stages are examined. Key Results A minimum of 15 known genes, in addition to a relatively few major quantitative trait loci, are identified as being critical to the domestication process. These genes control traits such as pod dehiscence, seed dormancy, seed size and other seed quality characters, stem height, root mass, and harvest index. Several of the genes have pleiotropic effects that in species possessing a more rudimentary genetic characterization might have been interpreted as clusters of genes. Very little evidence for gene clustering was found in pea. When compared with common bean, pea has used a different set of genes to produce the same or similar phenotypic changes. Conclusions Similar to results for common bean, relatively few genes appear to have been modified during the domestication of pea. However, the genes involved are different, and there does not appear to be a common genetic basis to ‘domestication syndrome’ in the Fabaceae. PMID:17660515

  4. Identification of Clusters that Condition Resistance to Anthracnose in the Common Bean Differential Cultivars AB136 and MDRK.

    PubMed

    Campa, Ana; Trabanco, Noemí; Ferreira, Juan José

    2017-12-01

    The correct identification of the anthracnose resistance systems present in the common bean cultivars AB136 and MDRK is important because both are included in the set of 12 differential cultivars proposed for use in classifying the races of the anthracnose causal agent, Colletotrichum lindemuthianum. In this work, the responses against seven C. lindemuthianum races were analyzed in a recombinant inbred line population derived from the cross AB136 × MDRK. A genetic linkage map of 100 molecular markers distributed across the 11 bean chromosomes was developed in this population to locate the gene or genes conferring resistance against each race, based on linkage analyses and χ² tests of independence. The identified anthracnose resistance genes were organized in clusters. Two clusters were found in AB136: one located on linkage group Pv07, which corresponds to the anthracnose resistance cluster Co-5, and the other located at the end of linkage group Pv11, which corresponds to the Co-2 cluster. The presence of resistance genes at the Co-5 cluster in AB136 was validated through an allelism test conducted in the F2 population TU × AB136. The presence of resistance genes at the Co-2 cluster in AB136 was validated through genetic dissection using the F2:3 population ABM3 × MDRK, in which it was directly mapped to a genomic position between 46.01 and 47.77 Mb of chromosome Pv11. In MDRK, two independent clusters were identified: one located on linkage group Pv01, corresponding to the Co-1 cluster, and the second located on LG Pv04, corresponding to the Co-3 cluster. This report enhances the understanding of the race-specific Phaseolus vulgaris-C. lindemuthianum interactions and will be useful in breeding programs.

  5. A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data.

    PubMed

    Nishiyama, Takeshi; Takahashi, Kunihiko; Tango, Toshiro; Pinto, Dalila; Scherer, Stephen W; Takami, Satoshi; Kishino, Hirohisa

    2011-05-26

    Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.

  6. High-throughput platform for the discovery of elicitors of silent bacterial gene clusters.

    PubMed

    Seyedsayamdost, Mohammad R

    2014-05-20

    Over the past decade, bacterial genome sequences have revealed an immense reservoir of biosynthetic gene clusters, sets of contiguous genes that have the potential to produce drugs or drug-like molecules. However, the majority of these gene clusters appear to be inactive for unknown reasons prompting terms such as "cryptic" or "silent" to describe them. Because natural products have been a major source of therapeutic molecules, methods that rationally activate these silent clusters would have a profound impact on drug discovery. Herein, a new strategy is outlined for awakening silent gene clusters using small molecule elicitors. In this method, a genetic reporter construct affords a facile read-out for activation of the silent cluster of interest, while high-throughput screening of small molecule libraries provides potential inducers. This approach was applied to two cryptic gene clusters in the pathogenic model Burkholderia thailandensis. The results not only demonstrate a prominent activation of these two clusters, but also reveal that the majority of elicitors are themselves antibiotics, most in common clinical use. Antibiotics, which kill B. thailandensis at high concentrations, act as inducers of secondary metabolism at low concentrations. One of these antibiotics, trimethoprim, served as a global activator of secondary metabolism by inducing at least five biosynthetic pathways. Further application of this strategy promises to uncover the regulatory networks that activate silent gene clusters while at the same time providing access to the vast array of cryptic molecules found in bacteria.

  7. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    PubMed

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  8. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    PubMed Central

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180

  9. Noninvasive analysis of the sputum transcriptome discriminates clinical phenotypes of asthma.

    PubMed

    Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D; Ober, Carole; Nicolae, Dan L; Barnes, Kathleen C; London, Stephanie J; Gilliland, Frank; Weiss, Scott T; Raby, Benjamin A; Cohn, Lauren; Chupp, Geoffrey L

    2015-05-15

    The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10⁻⁶) and hospitalization (P = 0.01), respectively. There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma.

  10. Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma

    PubMed Central

    Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F.; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D.; Ober, Carole; Nicolae, Dan L.; Barnes, Kathleen C.; London, Stephanie J.; Gilliland, Frank; Weiss, Scott T.; Raby, Benjamin A.; Cohn, Lauren

    2015-01-01

    Rationale: The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. Objectives: We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Methods: Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Measurements and Main Results: Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10⁻⁶) and hospitalization (P = 0.01), respectively. Conclusions: There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma. PMID:25763605

  11. pySAPC, a python package for sparse affinity propagation clustering: Application to odontogenesis whole genome time series gene-expression data.

    PubMed

    Cao, Huojun; Amendt, Brad A

    2016-11-01

    Developmental dental anomalies are common forms of congenital defects. The molecular mechanisms of dental anomalies are poorly understood. Systematic approaches such as clustering genes based on similar expression patterns could identify novel genes involved in dental anomalies and provide a framework for understanding molecular regulatory mechanisms of these genes during tooth development (odontogenesis). A python package (pySAPC) implementing a sparse affinity propagation clustering algorithm for large datasets was developed. Whole-genome pairwise similarity was calculated from expression patterns across 45 microarrays covering several stages of odontogenesis. pySAPC identified 743 gene clusters based on expression pattern similarity during mouse tooth development. Three clusters are significantly enriched for genes associated with dental anomalies (with FDR <0.1). The three clusters of genes have distinct expression patterns during odontogenesis. Clustering genes based on similar expression profiles recovered several known regulatory relationships for genes involved in odontogenesis, as well as many novel genes that may be involved with the same genetic pathways as genes that have already been shown to contribute to dental defects. By using a sparse similarity matrix, pySAPC uses much less memory and CPU time compared with the original affinity propagation program that uses a full similarity matrix. This python package will be useful for many applications where dataset(s) are too large to use a full similarity matrix. This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang. Copyright © 2016. Published by Elsevier B.V.
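To make the idea of a sparse similarity matrix concrete, here is a minimal sketch that stores, for each gene, only the similarities to its k nearest neighbours instead of all pairs. It does not use or reproduce the pySAPC API. The sparsification rule (top-k neighbours), the similarity measure (negative squared Euclidean distance, a common choice for affinity propagation), the function name `sparse_similarity` and the toy profiles are all assumptions made for illustration.

```python
def sparse_similarity(profiles, k):
    """Return {(i, j): similarity} holding only each gene's k most similar genes."""
    sparse = {}
    for i, a in enumerate(profiles):
        scored = []
        for j, b in enumerate(profiles):
            if i == j:
                continue
            # Similarity = negative squared Euclidean distance (assumed measure).
            similarity = -sum((x - y) ** 2 for x, y in zip(a, b))
            scored.append((similarity, j))
        # Highest similarity first; keep only the top k neighbours of gene i.
        scored.sort(reverse=True)
        for similarity, j in scored[:k]:
            sparse[(i, j)] = similarity
    return sparse


# Toy expression profiles (hypothetical values): two pairs of close genes.
profiles = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]]
sparse = sparse_similarity(profiles, k=1)

full_entries = len(profiles) * (len(profiles) - 1)
print(len(sparse), "stored entries instead of", full_entries)
```

This sketch still evaluates every pair once, so it only saves memory for the stored matrix; a practical implementation for whole-genome data would also need to avoid materialising all pairwise scores at the same time.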

  12. Assessment of gene order computing methods for Alzheimer's disease

    PubMed Central

    2013-01-01

    Background Computational genomics of Alzheimer disease (AD), the most common form of senile dementia, is a nascent field in AD research. The field includes AD gene clustering by computing gene order which generates higher quality gene clustering patterns than most other clustering methods. However, there are few available gene order computing methods such as Genetic Algorithm (GA) and Ant Colony Optimization (ACO). Further, their performance in gene order computation using AD microarray data is not known. We thus set out to evaluate the performances of current gene order computing methods with different distance formulas, and to identify additional features associated with gene order computation. Methods Using different distance formulas (Pearson distance, Euclidean distance, and squared Euclidean distance) and other conditions, gene orders were calculated by ACO and GA (including standard GA and improved GA) methods, respectively. The qualities of the gene orders were compared, and new features from the calculated gene orders were identified. Results Compared to the GA methods tested in this study, ACO fits the AD microarray data the best when calculating gene order. In addition, the following features were revealed: different distance formulas generated a different quality of gene order, and the commonly used Pearson distance was not the best distance formula when used with both GA and ACO methods for AD microarray data. Conclusion Compared with Pearson distance and Euclidean distance, the squared Euclidean distance generated the best quality gene order computed by GA and ACO methods. PMID:23369541
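The three distance formulas compared in the abstract above can be sketched as follows. The abstract does not give their definitions, so this assumes the usual ones: Pearson distance taken as one minus the Pearson correlation coefficient, plus the standard Euclidean and squared Euclidean distances. Function names and the toy profiles are hypothetical.

```python
import math


def squared_euclidean(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(a, b))


def pearson_distance(a, b):
    """1 - Pearson correlation (assumed definition); ranges from 0 to 2.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    variance_a = sum((x - mean_a) ** 2 for x in a)
    variance_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - covariance / math.sqrt(variance_a * variance_b)


# Toy expression profiles (hypothetical values).
gene_1 = [1.0, 2.0, 3.0]
gene_2 = [2.0, 4.0, 6.0]   # same shape as gene_1, twice the magnitude
gene_3 = [3.0, 2.0, 1.0]   # mirror image of gene_1

print(pearson_distance(gene_1, gene_2), euclidean(gene_1, gene_2))
print(pearson_distance(gene_1, gene_3), squared_euclidean(gene_1, gene_3))
```

The toy genes show why the choice matters: `gene_1` and `gene_2` are identical under Pearson distance because it ignores scale, yet they are clearly separated under either Euclidean variant.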

  13. Specific resistances against Pseudomonas syringae effectors AvrB and AvrRpm1 have evolved differently in common bean, soybean, and Arabidopsis

    PubMed Central

    Chen, Nicolas W. G.; Sévignac, Mireille; Thareau, Vincent; Magdelenat, Ghislaine; David, Perrine; Ashfield, Tom; Innes, Roger W.; Geffroy, Valérie

    2010-01-01

    Summary In plants, the evolution of specific resistance is poorly understood. Pseudomonas syringae effectors AvrB and AvrRpm1 are recognized by phylogenetically distinct resistance (R) proteins in Arabidopsis (Brassicaceae) and soybean (Glycine max, Fabaceae). In soybean, these resistances are encoded by two tightly linked R genes Rpg1-b and Rpg1-r. To study the evolution of these specific resistances, we investigated AvrB- and AvrRpm1-induced responses in common bean (Phaseolus vulgaris, Fabaceae). Common bean genotypes of various geographical origins were inoculated with P. syringae strains expressing AvrB or AvrRpm1. A common bean recombinant-inbred-line (RIL) population was used to map R genes to AvrRpm1. No common bean genotypes recognized AvrB. By contrast, multiple genotypes responded to AvrRpm1, and two independent R genes conferring AvrRpm1-specific resistance were mapped to the ends of linkage group B11 (Rpsar-1) and B8 (Rpsar-2). Rpsar-1 is located in a region syntenic with the soybean Rpg1 cluster. However, mapping of specific Rpg1 homologous genes suggests that AvrRpm1 recognition evolved independently in common bean and soybean. The conservation of genomic position of AvrRpm1-specific genes between soybean and common bean suggests a model whereby specific clusters of R genes are predisposed to evolve recognition of the same effector molecules. PMID:20561214

  14. Epigenetic transgenerational inheritance of somatic transcriptomes and epigenetic control regions

    PubMed Central

    2012-01-01

    Background Environmentally induced epigenetic transgenerational inheritance of adult onset disease involves a variety of phenotypic changes, suggesting a general alteration in genome activity. Results Investigation of different tissue transcriptomes in male and female F3 generation vinclozolin versus control lineage rats demonstrated all tissues examined had transgenerational transcriptomes. The microarrays from 11 different tissues were compared with a gene bionetwork analysis. Although each tissue transgenerational transcriptome was unique, common cellular pathways and processes were identified between the tissues. A cluster analysis identified gene modules with coordinated gene expression and each had unique gene networks regulating tissue-specific gene expression and function. A large number of statistically significant over-represented clusters of genes were identified in the genome for both males and females. These gene clusters ranged from 2-5 megabases in size, and a number of them corresponded to the epimutations previously identified in sperm that transmit the epigenetic transgenerational inheritance of disease phenotypes. Conclusions Combined observations demonstrate that all tissues derived from the epigenetically altered germ line develop transgenerational transcriptomes unique to the tissue, but common epigenetic control regions in the genome may coordinately regulate these tissue-specific transcriptomes. This systems biology approach provides insight into the molecular mechanisms involved in the epigenetic transgenerational inheritance of a variety of adult onset disease phenotypes. PMID:23034163

  15. Identifying resistance gene analogs associated with resistances to different pathogens in common bean.

    PubMed

    López, Camilo E; Acosta, Iván F; Jara, Carlos; Pedraza, Fabio; Gaitán-Solís, Eliana; Gallego, Gerardo; Beebe, Steve; Tohme, Joe

    2003-01-01

    ABSTRACT A polymerase chain reaction approach using degenerate primers that targeted the conserved domains of cloned plant disease resistance genes (R genes) was used to isolate a set of 15 resistance gene analogs (RGAs) from common bean (Phaseolus vulgaris). Eight different classes of RGAs were obtained from nucleotide binding site (NBS)-based primers and seven from not previously described Toll/Interleukin-1 receptor-like (TIR)-based primers. Putative amino acid sequences of RGAs were significantly similar to R genes and contained additional conserved motifs. The NBS-type RGAs were classified in two subgroups according to the expected final residue in the kinase-2 motif. Eleven RGAs were mapped at 19 loci on eight linkage groups of the common bean genetic map constructed at Centro Internacional de Agricultura Tropical. Genetic linkage was shown for eight RGAs with partial resistance to anthracnose, angular leaf spot (ALS) and Bean golden yellow mosaic virus (BGYMV). RGA1 and RGA2 were associated with resistance loci to anthracnose and BGYMV and were part of two clusters of R genes previously described. A new major cluster was detected by RGA7 and explained up to 63.9% of resistance to ALS and has a putative contribution to anthracnose resistance. These results show the usefulness of RGAs as candidate genes to detect and eventually isolate numerous R genes in common bean.

  16. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    PubMed

    Hsu, Jessie J; Finkelstein, Dianne M; Schoenfeld, David A

    2015-01-01

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
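    The random effects model described in this abstract can be written out as a short sketch. The notation is an assumption made here for illustration and is not taken from the paper: i indexes observations (patients), j indexes genes, c(j) is the cluster to which gene j is assigned, and K is the number of clusters. The normal distributions, the intercept \beta_0, and the outcome error term e_i are likewise assumptions; the abstract states only that each gene equals a cluster-specific random effect plus independent error, and that the outcome is a linear combination of the random effects.

```latex
\begin{align}
X_{ij} &= U_{i,c(j)} + \varepsilon_{ij}, & \varepsilon_{ij} &\sim N(0, \sigma^2), \\
Y_i &= \beta_0 + \sum_{k=1}^{K} \beta_k U_{ik} + e_i, & U_{ik} &\sim N(0, \tau_k^2).
\end{align}
```

    Under this reading, genes in the same cluster are correlated because they share U_{ik}, and a cluster is informative for the outcome when its coefficient \beta_k is nonzero. A binary recovery outcome would presumably require a link function in place of the linear form for Y_i.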

  17. Arrangement of the Clostridium baratii F7 Toxin Gene Cluster with Identification of a σ Factor That Recognizes the Botulinum Toxin Gene Cluster Promoters

    DOE PAGES

    Dover, Nir; Barash, Jason R.; Burke, Julianne N.; ...

    2014-05-22

    Botulinum neurotoxin (BoNT) is the most poisonous substance known, and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. In this paper, we sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. This TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii.

  18. Tissue-specific impact of FADS cluster variants on FADS1 and FADS2 gene expression.

    PubMed

    Reynolds, Lindsay M; Howard, Timothy D; Ruczinski, Ingo; Kanchan, Kanika; Seeds, Michael C; Mathias, Rasika A; Chilton, Floyd H

    2018-01-01

    Omega-6 (n-6) and omega-3 (n-3) long (≥ 20 carbon) chain polyunsaturated fatty acids (LC-PUFAs) play a critical role in human health and disease. Biosynthesis of LC-PUFAs from dietary 18 carbon PUFAs in tissues such as the liver is highly associated with genetic variation within the fatty acid desaturase (FADS) gene cluster, containing FADS1 and FADS2 that encode the rate-limiting desaturation enzymes in the LC-PUFA biosynthesis pathway. However, the molecular mechanisms by which FADS genetic variants affect LC-PUFA biosynthesis, and in which tissues, are unclear. The current study examined associations between common single nucleotide polymorphisms (SNPs) within the FADS gene cluster and FADS1 and FADS2 gene expression in 44 different human tissues (sample sizes ranging from 70 to 361) from the Genotype-Tissue Expression (GTEx) Project. FADS1 and FADS2 expression were detected in all 44 tissues. Significant cis-eQTLs (within 1 megabase of each gene, False Discovery Rate, FDR < 0.05, as defined by GTEx) were identified in 12 tissues for FADS1 gene expression and 23 tissues for FADS2 gene expression. Six tissues had significant (FDR < 0.05) eQTLs associated with both FADS1 and FADS2 (including artery, esophagus, heart, muscle, nerve, and thyroid). Interestingly, the identified eQTLs were consistently found to be associated in opposite directions for FADS1 and FADS2 expression. Taken together, findings from this study suggest that common SNPs within the FADS gene cluster impact the transcription of FADS1 and FADS2 in numerous tissues and raise important questions about how the inverse expression of these two genes impacts intermediate molecular phenotypes (such as LC-PUFA and LC-PUFA-containing glycerolipid levels) and ultimately clinical phenotypes associated with inflammatory diseases and brain health.

  19. Distal regulatory regions restrict the expression of cis-linked genes to the tapetal cells.

    PubMed

    Franco, Luciana O; de O Manes, Carmem Lara; Hamdi, Said; Sachetto-Martins, Gilberto; de Oliveira, Dulce E

    2002-04-24

    The oleosin glycine-rich protein genes Atgrp-6, Atgrp-7, and Atgrp-8 occur in clusters in the Arabidopsis genome and are expressed specifically in the tapetum cells. The cis-regulatory regions involved in the tissue-specific gene expression were investigated by fusing different segments of the gene cluster to the uidA reporter gene. Common distal regulatory regions were identified that coordinate expression of the sequential genes. At least two of these genes were regulated spatially by proximal and distal sequences. The cis-acting elements (122 bp upstream of the transcriptional start point) drive the uidA expression to floral tissues, whereas distal 5' upstream regions restrict the gene activity to tapetal cells.

  20. Ancient genomic architecture for mammalian olfactory receptor clusters

    PubMed Central

    Aloni, Ronny; Olender, Tsviya; Lancet, Doron

    2006-01-01

    Background Mammalian olfactory receptor (OR) genes reside in numerous genomic clusters of up to several dozen genes. Whole-genome sequence alignment nets of five mammals allow their comprehensive comparison, aimed at reconstructing the ancestral olfactory subgenome. Results We developed a new and general tool for genome-wide definition of genomic gene clusters conserved in multiple species. Syntenic orthologs, defined as gene pairs showing conservation of both genomic location and coding sequence, were subjected to a graph theory algorithm for discovering CLICs (clusters in conservation). When applied to ORs in five mammals, including the marsupial opossum, more than 90% of the OR genes were found within a framework of 48 multi-species CLICs, invoking a general conservation of gene order and composition. A detailed analysis of individual CLICs revealed multiple differences among species, interpretable through species-specific genomic rearrangements and reflecting complex mammalian evolutionary dynamics. One significant instance involves CLIC #1, which lacks a human member, implying the human-specific deletion of an OR cluster, whose mouse counterpart has been tentatively associated with isovaleric acid odorant detection. Conclusion The identified multi-species CLICs demonstrate that most of the mammalian OR clusters have a common ancestry, preceding the split between marsupials and placental mammals. However, only two of these CLICs were capable of incorporating chicken OR genes, parsimoniously implying that all other CLICs emerged subsequent to the avian-mammalian divergence. PMID:17010214
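    As a rough illustration of how syntenic ortholog pairs can be grouped with a graph algorithm, the sketch below treats each gene as a node and each syntenic ortholog pair as an edge, then reports connected components as candidate multi-species clusters. This is an assumption made for illustration: the abstract does not specify which graph algorithm defines CLICs, and the gene labels and the function name connected_components are hypothetical.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into connected components using union-find."""
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk to the root
        # while halving the path to keep later lookups short.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Sort members and components so the output is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical syntenic ortholog pairs, labelled "species:gene".
pairs = [
    ("human:OR1", "mouse:Or1"),
    ("mouse:Or1", "dog:OR1"),
    ("human:OR2", "mouse:Or2"),
]
clusters = connected_components(pairs)
for members in clusters:
    print(members)
```

    A component that lacks a member from one species, such as the second group here with no dog gene, is the kind of pattern that would prompt a closer look for a lineage-specific loss or rearrangement.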

  1. Genetic analysis of the response to eleven Colletotrichum lindemuthianum races in a RIL population of common bean (Phaseolus vulgaris L.)

    PubMed Central

    2014-01-01

    Background Bean anthracnose is caused by the fungus Colletotrichum lindemuthianum (Sacc. & Magnus) Lams.- Scrib. Resistance to C. lindemuthianum in common bean (Phaseolus vulgaris L.) generally follows a qualitative mode of inheritance. The pathogen shows extensive pathogenic variation and up to 20 anthracnose resistance loci (named Co-), conferring resistance to specific races, have been described. Anthracnose resistance has generally been investigated by analyzing a limited number of isolates or races in segregating populations. In this work, we analyzed the response against eleven C. lindemuthianum races in a recombinant inbred line (RIL) common bean population derived from the cross Xana × Cornell 49242 in which a saturated linkage map was previously developed. Results A systematic genetic analysis was carried out to dissect the complex resistance segregations observed, which included contingency analyses, subpopulations and genetic mapping. Twenty two resistance genes were identified, some with a complementary mode of action. The Cornell 49242 genotype carries a complex cluster of resistance genes at the end of linkage group (LG) Pv11 corresponding to the previously described anthracnose resistance cluster Co-2. In this position, specific resistance genes to races 3, 6, 7, 19, 38, 39, 65, 357, 449 and 453 were identified, with one of them showing a complementary mode of action. In addition, Cornell 49242 had an independent gene on LG Pv09 showing a complementary mode of action for resistance to race 453. Resistance genes in genotype Xana were located on three regions involving LGs Pv01, Pv02 and Pv04. All resistance genes identified in Xana showed a complementary mode of action, except for two controlling resistance to races 65 and 73 located on LG Pv01, in the position of the previously described anthracnose resistance cluster Co-1. 
Conclusions Results shown herein reveal a complex and specific interaction between bean and fungus genotypes leading to anthracnose resistance. Organization of specific resistance genes in clusters including resistance genes with different modes of action (dominant and complementary genes) was also confirmed. Finally, new locations for anthracnose resistance genes were identified in LG Pv09. PMID:24779442

  2. Transcriptional profiles of Arabidopsis stomataless mutants reveal developmental and physiological features of life in the absence of stomata

    PubMed Central

    de Marcos, Alberto; Triviño, Magdalena; Pérez-Bueno, María Luisa; Ballesteros, Isabel; Barón, Matilde; Mena, Montaña; Fenoll, Carmen

    2015-01-01

    Loss of function of the positive stomata development regulators SPCH or MUTE in Arabidopsis thaliana renders stomataless plants; spch-3 and mute-3 mutants are extreme dwarfs, but produce cotyledons and tiny leaves, providing a system to interrogate plant life in the absence of stomata. To this end, we compared their cotyledon transcriptomes with that of wild-type plants. K-means clustering of differentially expressed genes generated four clusters: clusters 1 and 2 grouped genes commonly regulated in the mutants, while clusters 3 and 4 contained genes distinctively regulated in mute-3. Classification in functional categories and metabolic pathways of genes in clusters 1 and 2 suggested that both mutants had depressed secondary, nitrogen and sulfur metabolisms, while only a few photosynthesis-related genes were down-regulated. In situ quenching analysis of chlorophyll fluorescence revealed limited inhibition of photosynthesis. This and other fluorescence measurements matched the mutant transcriptomic features. Differential transcriptomes of both mutants were enriched in growth-related genes, including known stomata development regulators, which paralleled their epidermal phenotypes. Analysis of cluster 3 was not informative for developmental aspects of mute-3. Cluster 4 comprised genes differentially up-regulated in mute-3, 35% of which were direct targets for SPCH and may relate to the unique cell types of mute-3. A screen of T-DNA insertion lines in genes differentially expressed in the mutants identified a gene putatively involved in stomata development. A collection of lines for conditional overexpression of transcription factors differentially expressed in the mutants rendered distinct epidermal phenotypes, suggesting that these proteins may be novel stomatal development regulators.
Thus, our transcriptome analysis represents a useful source of new genes for the study of stomata development and for characterizing physiology and growth in the absence of stomata. PMID:26157447

  3. KinFin: Software for Taxon-Aware Analysis of Clustered Protein Sequences.

    PubMed

    Laetsch, Dominik R; Blaxter, Mark L

    2017-10-05

    The field of comparative genomics is concerned with the study of similarities and differences between the information encoded in the genomes of organisms. A common approach is to define gene families by clustering protein sequences based on sequence similarity, and analyze protein cluster presence and absence in different species groups as a guide to biology. Due to the high dimensionality of these data, downstream analysis of protein clusters inferred from large numbers of species, or species with many genes, is nontrivial, and few solutions exist for transparent, reproducible, and customizable analyses. We present KinFin, a streamlined software solution capable of integrating data from common file formats and delivering aggregative annotation of protein clusters. KinFin delivers analyses based on systematic taxonomy of the species analyzed, or on user-defined, groupings of taxa, for example, sets based on attributes such as life history traits, organismal phenotypes, or competing phylogenetic hypotheses. Results are reported through graphical and detailed text output files. We illustrate the utility of the KinFin pipeline by addressing questions regarding the biology of filarial nematodes, which include parasites of veterinary and medical importance. We resolve the phylogenetic relationships between the species and explore functional annotation of proteins in clusters in key lineages and between custom taxon sets, identifying gene families of interest. KinFin can easily be integrated into existing comparative genomic workflows, and promotes transparent and reproducible analysis of clustered protein data. Copyright © 2017 Laetsch and Blaxter.

  4. The nif Gene Operon of the Methanogenic Archaeon Methanococcus maripaludis

    PubMed Central

    Kessler, Peter S.; Blank, Carrine; Leigh, John A.

    1998-01-01

    Nitrogen fixation occurs in two domains, Archaea and Bacteria. We have characterized a nif (nitrogen fixation) gene cluster in the methanogenic archaeon Methanococcus maripaludis. Sequence analysis revealed eight genes, six with sequence similarity to known nif genes and two with sequence similarity to glnB. The gene order, nifH, ORF105 (similar to glnB), ORF121 (similar to glnB), nifD, nifK, nifE, nifN, and nifX, was the same as that found in part in other diazotrophic methanogens and except for the presence of the glnB-like genes, also resembled the order found in many members of the Bacteria. Using transposon insertion mutagenesis, we determined that an 8-kb region required for nitrogen fixation corresponded to the nif gene cluster. Northern analysis revealed the presence of either a single 7.6-kb nif mRNA transcript or 10 smaller mRNA species containing portions of the large transcript. Polar effects of transposon insertions demonstrated that all of these mRNAs arose from a single promoter region, where transcription initiated 80 bp 5′ to nifH. Distinctive features of the nif gene cluster include the presence of the six primary nif genes in a single operon, the placement of the two glnB-like genes within the cluster, the apparent physical separation of the cluster from any other nif genes that might be in the genome, the fragmentation pattern of the mRNA, and the regulation of expression by a repression mechanism described previously. Our study and others with methanogenic archaea reporting multiple mRNAs arising from gene clusters with only a single putative promoter sequence suggest that mRNA processing following transcription may be a common occurrence in methanogens. PMID:9515920

  5. Amplification of the entire kanamycin biosynthetic gene cluster during empirical strain improvement of Streptomyces kanamyceticus.

    PubMed

    Yanai, Koji; Murakami, Takeshi; Bibb, Mervyn

    2006-06-20

    Streptomyces kanamyceticus 12-6 is a derivative of the wild-type strain developed for industrial kanamycin (Km) production. Southern analysis and DNA sequencing revealed amplification of a large genomic segment including the entire Km biosynthetic gene cluster in the chromosome of strain 12-6. At 145 kb, the amplifiable unit of DNA (AUD) is the largest AUD reported in Streptomyces. Striking repetitive DNA sequences belonging to the clustered regularly interspaced short palindromic repeats family were found in the AUD and may play a role in its amplification. Strain 12-6 contains a mixture of different chromosomes with varying numbers of AUDs, sometimes exceeding 36 copies and producing an amplified region >5.7 Mb. The level of Km production depended on the copy number of the Km biosynthetic gene cluster, suggesting that DNA amplification occurred during strain improvement as a consequence of selection for increased Km resistance. Amplification of DNA segments including entire antibiotic biosynthetic gene clusters might be a common mechanism leading to increased antibiotic production in industrial strains.

  6. Leukocyte common antigen-related phosphatase (LRP) gene structure: Conservation of the genomic organization of transmembrane protein tyrosine phosphatases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wong, E.C.C.; Mullersman, J.E.; Thomas, M.L.

    1993-07-01

    The leukocyte common antigen-related protein tyrosine phosphatase (LRP) is a widely expressed transmembrane glycoprotein thought to be involved in cell growth and differentiation. Similar to most other transmembrane protein tyrosine phosphatases, LRP contains two tandem cytoplasmic phosphatase domains. To understand further the regulation and evolution of LRP, the authors have isolated and characterized mouse λ genomic clones. Thirteen genomic clones could be divided into two non-overlapping clusters. The first cluster contained the transcription initiation site and the exon encoding most of the 5′ untranslated region. The second cluster contained the remaining exons encoding the protein and the 3′ untranslated region. The gene consists of 22 exons spanning over 75 kb. The distance between exon 1 and exon 2 is at least 25 kb. Characterization of the 5′ ends of LRP mRNA by S1 nuclease protection identifies putative initiation start sites within a G/C-rich region. The upstream region does not contain a TATA box. Comparison of the LRP gene structure to the mammalian protein tyrosine phosphatase gene, CD45, shows striking similarities in size and genomic organization. 29 refs., 5 figs., 1 tab.

  7. A cross-species bi-clustering approach to identifying conserved co-regulated genes.

    PubMed

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-06-15

    A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach deals poorly with the noise in the expression data introduced by the complicated procedures used to quantify gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis.
This approach was first validated on synthetic data and compared to the two-step method and several recent joint clustering methods. We then applied this approach to two real world datasets of gene expression during the pre-implantation embryonic development of the human and mouse. Co-regulated genes consistent between the human and mouse were identified, offering insights into conserved functions, as well as similarities and differences in genome activation timing between the human and mouse embryos. The R package containing the implementation of the proposed method in C++ is available at: https://github.com/JavonSun/mvbc.git and also at the R platform https://www.r-project.org/. Contact: jinbo@engr.uconn.edu. © The Author 2016. Published by Oxford University Press.
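    The idea of a joint sparse rank-one factorization can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and is not the authors' algorithm or their mvbc package: it alternates between per-species row vectors and a shared, soft-thresholded gene vector, in the style of a penalized power iteration. The function names, the penalty parameters lam_u and lam_v, and the synthetic data are all hypothetical.

```python
import numpy as np


def soft_threshold(x, lam):
    """Shrink entries toward zero; entries with |x| <= lam become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def joint_sparse_rank_one(matrices, lam_u=1.0, lam_v=0.0, n_iter=100):
    """Approximate every matrix X_s (genes x stages_s) by u @ v_s.T.

    u is a sparse unit vector shared by all species; its nonzero entries
    mark the gene cluster. Each v_s is the stage pattern for species s.
    """
    # Start from the leading left singular vector of the stacked data.
    stacked = np.hstack(matrices)
    u = np.linalg.svd(stacked, full_matrices=False)[0][:, 0]
    for _ in range(n_iter):
        # Update each species-specific stage vector given the shared u.
        vs = [soft_threshold(X.T @ u, lam_v) for X in matrices]
        # Update the shared gene vector from all species at once.
        u_new = soft_threshold(sum(X @ v for X, v in zip(matrices, vs)), lam_u)
        norm = np.linalg.norm(u_new)
        if norm == 0.0:
            break  # penalty too strong: every gene was thresholded away
        u_new = u_new / norm
        converged = np.allclose(u_new, u, atol=1e-8)
        u = u_new
        if converged:
            break
    vs = [soft_threshold(X.T @ u, lam_v) for X in matrices]
    return u, vs


# Synthetic example: 20 genes, the first 5 co-vary in both "species".
rng = np.random.default_rng(0)
indicator = np.zeros(20)
indicator[:5] = 1.0
pattern_a = np.array([1.0, 2.0, 3.0, 3.0, 2.0, 1.0])  # 6 stages
pattern_b = np.array([2.0, 2.0, 0.0, 0.0])            # 4 stages
X_a = np.outer(indicator, pattern_a) + 0.01 * rng.standard_normal((20, 6))
X_b = np.outer(indicator, pattern_b) + 0.01 * rng.standard_normal((20, 4))

u, vs = joint_sparse_rank_one([X_a, X_b])
print("genes in cluster:", np.flatnonzero(u).tolist())
```

    Because the gene vector is shared while each species keeps its own stage vector, the two matrices may have different numbers of columns. The sign of u is arbitrary, so only its support (the nonzero entries) should be interpreted.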

  8. Polymorphisms in Fatty Acid Desaturase (FADS) Gene Cluster: Effects on Glycemic Controls Following an Omega-3 Polyunsaturated Fatty Acids (PUFA) Supplementation.

    PubMed

    Cormier, Hubert; Rudkowska, Iwona; Thifault, Elisabeth; Lemieux, Simone; Couture, Patrick; Vohl, Marie-Claude

    2013-09-10

    Changes in desaturase activity are associated with insulin sensitivity and may be associated with type 2 diabetes mellitus (T2DM). Polymorphisms (SNPs) in the fatty acid desaturase (FADS) gene cluster have been associated with the homeostasis model assessment of insulin sensitivity (HOMA-IS) and serum fatty acid composition. The objective was to investigate whether common genetic variations in the FADS gene cluster influence fasting glucose (FG) and fasting insulin (FI) responses following a 6-week n-3 polyunsaturated fatty acids (PUFA) supplementation. 210 subjects completed a 2-week run-in period followed by a 6-week supplementation with 5 g/d of fish oil (providing 1.9 g-2.2 g of EPA + 1.1 g of DHA). Genotyping of 18 SNPs of the FADS gene cluster covering 90% of all common genetic variations (minor allele frequency ≥ 0.03) was performed. Carriers of the minor allele for rs482548 (FADS2) had increased plasma FG levels after the n-3 PUFA supplementation in a model adjusted for FG levels at baseline, age, sex, and BMI. A significant genotype*supplementation interaction effect on FG levels was observed for rs482548 (p = 0.008). For FI levels, a genotype effect was observed with one SNP (rs174456). For HOMA-IS, several genotype*supplementation interaction effects were observed for rs7394871, rs174602, rs174570, rs7482316 and rs482548 (p = 0.03, p = 0.01, p = 0.03, p = 0.05 and p = 0.07; respectively). The results suggest that SNPs in the FADS gene cluster may modulate plasma FG, FI and HOMA-IS levels in response to n-3 PUFA supplementation.

  9. Polymorphisms in Fatty Acid Desaturase (FADS) Gene Cluster: Effects on Glycemic Controls Following an Omega-3 Polyunsaturated Fatty Acids (PUFA) Supplementation

    PubMed Central

    Cormier, Hubert; Rudkowska, Iwona; Thifault, Elisabeth; Lemieux, Simone; Couture, Patrick; Vohl, Marie-Claude

    2013-01-01

    Changes in desaturase activity are associated with insulin sensitivity and may be associated with type 2 diabetes mellitus (T2DM). Polymorphisms (SNPs) in the fatty acid desaturase (FADS) gene cluster have been associated with the homeostasis model assessment of insulin sensitivity (HOMA-IS) and serum fatty acid composition. Objective: To investigate whether common genetic variations in the FADS gene cluster influence fasting glucose (FG) and fasting insulin (FI) responses following a 6-week n-3 polyunsaturated fatty acids (PUFA) supplementation. Methods: 210 subjects completed a 2-week run-in period followed by a 6-week supplementation with 5 g/d of fish oil (providing 1.9 g–2.2 g of EPA + 1.1 g of DHA). Genotyping of 18 SNPs of the FADS gene cluster covering 90% of all common genetic variations (minor allele frequency ≥ 0.03) was performed. Results: Carriers of the minor allele for rs482548 (FADS2) had increased plasma FG levels after the n-3 PUFA supplementation in a model adjusted for FG levels at baseline, age, sex, and BMI. A significant genotype*supplementation interaction effect on FG levels was observed for rs482548 (p = 0.008). For FI levels, a genotype effect was observed with one SNP (rs174456). For HOMA-IS, several genotype*supplementation interaction effects were observed for rs7394871, rs174602, rs174570, rs7482316 and rs482548 (p = 0.03, p = 0.01, p = 0.03, p = 0.05 and p = 0.07; respectively). Conclusion: Results suggest that SNPs in the FADS gene cluster may modulate plasma FG, FI and HOMA-IS levels in response to n-3 PUFA supplementation. PMID:24705214

  10. Identification of an intact ParaHox cluster with temporal colinearity but altered spatial colinearity in the hemichordate Ptychodera flava

    PubMed Central

    2013-01-01

    Background ParaHox and Hox genes are thought to have evolved from a common ancestral ProtoHox cluster or from tandem duplication prior to the divergence of cnidarians and bilaterians. Similar to Hox clusters, chordate ParaHox genes including Gsx, Xlox, and Cdx, are clustered and their expression exhibits temporal and spatial colinearity. In non-chordate animals, however, studies on the genomic organization of ParaHox genes are limited to only a few animal taxa. Hemichordates, such as the Enteropneust acorn worms, have been used to gain insights into the origins of chordate characters. In this study, we investigated the genomic organization and expression of ParaHox genes in the indirect developing hemichordate acorn worm Ptychodera flava. Results We found that P. flava contains an intact ParaHox cluster with a similar arrangement to that of chordates. The temporal expression order of the P. flava ParaHox genes is the same as that of the chordate ParaHox genes. During embryogenesis, the spatial expression pattern of PfCdx in the posterior endoderm represents a conserved feature similar to the expression of its orthologs in other animals. On the other hand, PfXlox and PfGsx show a novel expression pattern in the blastopore. Nevertheless, during metamorphosis, PfXlox and PfCdx are expressed in the endoderm in a spatially staggered pattern similar to the situation in chordates. Conclusions Our study shows that P. flava ParaHox genes, despite forming an intact cluster, exhibit temporal colinearity but lose spatial colinearity during embryogenesis. During metamorphosis, partial spatial colinearity is retained in the transforming larva. These results strongly suggest that intact ParaHox gene clustering was retained in the deuterostome ancestor and is correlated with temporal colinearity. PMID:23802544

  11. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea.

    PubMed

    Makarova, Kira S; Sorokin, Alexander V; Novichkov, Pavel S; Wolf, Yuri I; Koonin, Eugene V

    2007-11-27

    An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover approximately 88% of the genes in a genome compared to an approximately 76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; approximately 40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively.
It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems. The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: ftp://ftp.ncbi.nih.gov/pub/koonin/arCOGs/.

  12. SXT/R391 integrative and conjugative elements in Proteus species reveal abundant genetic diversity and multidrug resistance

    PubMed Central

    Li, Xinyue; Du, Yu; Du, Pengcheng; Dai, Hang; Fang, Yujie; Li, Zhenpeng; Lv, Na; Zhu, Baoli; Kan, Biao; Wang, Duochun

    2016-01-01

    SXT/R391 integrative and conjugative elements (ICEs) are self-transmissible mobile genetic elements that are found in most members of Enterobacteriaceae. Here, we determined fifteen SXT/R391 ICEs carried by Proteus isolates from food (4.2%) and diarrhoea patients (17.3%). BLASTn searches against GenBank showed that the fifteen SXT/R391 ICEs were closely related to those from different Enterobacteriaceae species, including Proteus mirabilis. Using core gene phylogenetic analysis, the fifteen SXT/R391 ICEs were grouped into six distinct clusters, including a dominant cluster and three clusters that have not been previously reported in Proteus isolates. The SXT/R391 ICEs shared a common structure with a set of conserved genes, five hotspots and two variable regions, which contained more foreign genes, including drug-resistance genes. Notably, a class A β-lactamase gene was identified in nine SXT/R391 ICEs. Collectively, the ICE-carrying isolates carried resistance genes for 20 tested drugs. Six isolates were resistant to chloramphenicol, kanamycin, streptomycin, trimethoprim-sulfamethoxazole, sulfisoxazole and tetracycline, which are drug resistances commonly encoded by ICEs. Our results demonstrate abundant genetic diversity and multidrug resistance of the SXT/R391 ICEs carried by Proteus isolates, which may have significance for public health. It is therefore necessary to continuously monitor the antimicrobial resistance and related mobile elements among Proteus isolates. PMID:27892525

  13. High-resolution mapping reveals linkage between genes in common bean cultivar Ouro Negro conferring resistance to the rust, anthracnose, and angular leaf spot diseases.

    PubMed

    Valentini, Giseli; Gonçalves-Vidigal, Maria Celeste; Hurtado-Gonzales, Oscar P; de Lima Castro, Sandra Aparecida; Cregan, Perry B; Song, Qijian; Pastor-Corrales, Marcial A

    2017-08-01

    Co-segregation analysis and high-throughput genotyping using SNP, SSR, and KASP markers demonstrated genetic linkage between Ur-14 and Co-3⁴/Phg-3 loci conferring resistance to the rust, anthracnose and angular leaf spot diseases of common bean. Rust, anthracnose, and angular leaf spot are major diseases of common bean in the Americas and Africa. The cultivar Ouro Negro has the Ur-14 gene that confers broad spectrum resistance to rust and the gene cluster Co-3⁴/Phg-3 containing two tightly linked genes conferring resistance to anthracnose and angular leaf spot, respectively. We used co-segregation analysis and high-throughput genotyping of 179 F2:3 families from the Rudá (susceptible) × Ouro Negro (resistant) cross, phenotyped separately with races of the rust and anthracnose pathogens. The results confirmed that Ur-14 and Co-3⁴/Phg-3 cluster in Ouro Negro conferred resistance to rust and anthracnose, respectively, and that Ur-14 and the Co-3⁴/Phg-3 cluster were closely linked. Genotyping the F2:3 families, first with 5398 SNPs on the Illumina BeadChip BARCBEAN6K_3 and with 15 SSR, and eight KASP markers, specifically designed for the candidate region containing Ur-14 and Co-3⁴/Phg-3, permitted the creation of a high-resolution genetic linkage map which revealed that Ur-14 was positioned at 2.2 cM from Co-3⁴/Phg-3 on the short arm of chromosome Pv04 of the common bean genome. Five flanking SSR markers were tightly linked at 0.1 and 0.2 cM from Ur-14, and two flanking KASP markers were tightly linked at 0.1 and 0.3 cM from Co-3⁴/Phg-3. Many other SSR, SNP, and KASP markers were also linked to these genes. These markers will be useful for the development of common bean cultivars combining the important Ur-14 and Co-3⁴/Phg-3 genes conferring resistance to three of the most destructive diseases of common bean.
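
    As a rough illustration of how a recombination fraction relates to a map distance such as the 2.2 cM reported above, the sketch below implements the Haldane and Kosambi mapping functions. The abstract does not say which mapping function the authors used, and the recombination fraction 0.022 is an illustrative value chosen here, not a figure from the study.

```python
import math


def haldane_cm(r):
    """Haldane map distance in centimorgans (assumes no crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in centimorgans (allows for some interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    r = 0.022  # illustrative recombination fraction (assumption, not from the abstract)
    print(f"Haldane: {haldane_cm(r):.2f} cM")
    print(f"Kosambi: {kosambi_cm(r):.2f} cM")
```

    For tightly linked loci both functions stay close to 100 times the recombination fraction, so the choice of mapping function matters little at distances of a few cM.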

  14. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
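
    The idea of searching for regions with a high density of predicted binding sites can be sketched as a sliding-window count over site positions. This is only a minimal sketch: the window length, the site threshold, and the example coordinates are illustrative assumptions, since the abstract does not give the parameters the authors used, and the function names are hypothetical.

```python
from bisect import bisect_left


def dense_windows(site_positions, window, min_sites):
    """Return (start, last_site, count) for each window that begins at a
    predicted site and holds at least `min_sites` sites within `window` bp."""
    positions = sorted(site_positions)
    hits = []
    for i, start in enumerate(positions):
        # j is one past the last site that lies inside [start, start + window)
        j = bisect_left(positions, start + window)
        count = j - i
        if count >= min_sites:
            hits.append((start, positions[j - 1], count))
    return hits


def merge_windows(hits):
    """Merge overlapping dense windows into non-overlapping candidate clusters."""
    merged = []
    for start, end, _count in hits:
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Hypothetical binding-site coordinates on one chromosome arm
    sites = [100, 150, 180, 220, 260, 5000, 9000, 9040, 9100, 9130]
    print(merge_windows(dense_windows(sites, window=200, min_sites=4)))
```

    In a comparative screen of the kind described above, the same routine could be run on aligned regions of a second genome to ask whether the clustering itself, rather than the primary sequence, is conserved.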

  15. Phylogenetic Evidence for Lateral Gene Transfer in the Intestine of Marine Iguanas

    PubMed Central

    Nelson, David M.; Cann, Isaac K. O.; Altermann, Eric; Mackie, Roderick I.

    2010-01-01

    Background Lateral gene transfer (LGT) appears to promote genotypic and phenotypic variation in microbial communities in a range of environments, including the mammalian intestine. However, the extent and mechanisms of LGT in intestinal microbial communities of non-mammalian hosts remain poorly understood. Methodology/Principal Findings We sequenced two fosmid inserts obtained from a genomic DNA library derived from an agar-degrading enrichment culture of marine iguana fecal material. The inserts harbored 16S rRNA genes that place the organism from which they originated within Clostridium cluster IV, a well-documented group that inhabits the mammalian intestinal tract. However, sequence analysis indicates that 52% of the protein-coding genes on the fosmids have top BLASTX hits to bacterial species that are not members of Clostridium cluster IV, and phylogenetic analysis suggests that at least 10 of 44 coding genes on the fosmids may have been transferred from Clostridium cluster XIVa to cluster IV. The fosmids encoded four transposase-encoding genes and an integrase-encoding gene, suggesting their involvement in LGT. In addition, several coding genes likely involved in sugar transport were probably acquired through LGT. Conclusion Our phylogenetic evidence suggests that LGT may be common among phylogenetically distinct members of the phylum Firmicutes inhabiting the intestinal tract of marine iguanas. PMID:20520734

  16. Enhancers and super-enhancers have an equivalent regulatory role in embryonic stem cells through regulation of single or multiple genes

    PubMed Central

    Moorthy, Sakthi D.; Davidson, Scott; Shchuka, Virlana M.; Singh, Gurdeep; Malek-Gilani, Nakisa; Langroudi, Lida; Martchenko, Alexandre; So, Vincent; Macpherson, Neil N.; Mitchell, Jennifer A.

    2017-01-01

    Transcriptional enhancers are critical for maintaining cell-type–specific gene expression and driving cell fate changes during development. Highly transcribed genes are often associated with a cluster of individual enhancers such as those found in locus control regions. Recently, these have been termed stretch enhancers or super-enhancers, which have been predicted to regulate critical cell identity genes. We employed a CRISPR/Cas9-mediated deletion approach to study the function of several enhancer clusters (ECs) and isolated enhancers in mouse embryonic stem (ES) cells. Our results reveal that the effect of deleting ECs, also classified as ES cell super-enhancers, is highly variable, resulting in target gene expression reductions ranging from 12% to as much as 92%. Partial deletions of these ECs which removed only one enhancer or a subcluster of enhancers revealed partially redundant control of the regulated gene by multiple enhancers within the larger cluster. Many highly transcribed genes in ES cells are not associated with a super-enhancer; furthermore, super-enhancer predictions ignore 81% of the potentially active regulatory elements predicted by cobinding of five or more pluripotency-associated transcription factors. Deletion of these additional enhancer regions revealed their robust regulatory role in gene transcription. In addition, select super-enhancers and enhancers were identified that regulated clusters of paralogous genes. We conclude that, whereas robust transcriptional output can be achieved by an isolated enhancer, clusters of enhancers acting on a common target gene act in a partially redundant manner to fine tune transcriptional output of their target genes. PMID:27895109

  17. Gene structure and expression characteristic of a novel odorant receptor gene cluster in the parasitoid wasp Microplitis mediator (Hymenoptera: Braconidae).

    PubMed

    Wang, S-N; Shan, S; Zheng, Y; Peng, Y; Lu, Z-Y; Yang, Y-Q; Li, R-J; Zhang, Y-J; Guo, Y-Y

    2017-08-01

    Odorant receptors (ORs) expressed in the antennae of parasitoid wasps are responsible for detection of various lipophilic airborne molecules. In the present study, 107 novel OR genes were identified from Microplitis mediator antennal transcriptome data. Phylogenetic analysis of the set of OR genes from M. mediator and Microplitis demolitor revealed that M. mediator OR (MmedOR) genes can be classified into different subfamilies, and the majority of MmedORs in each subfamily shared high sequence identities and clear orthologous relationships to M. demolitor ORs. Within a subfamily, six MmedOR genes, MmedOR98, 124, 125, 126, 131 and 155, shared a similar gene structure and were tightly linked in the genome. To evaluate whether the clustered MmedOR genes share common regulatory features, the transcription profile and expression characteristics of the six closely related OR genes were investigated in M. mediator. Rapid amplification of cDNA ends-PCR experiments revealed that the OR genes within the cluster were transcribed as single mRNAs, and a bicistronic mRNA for two adjacent genes (MmedOR124 and MmedOR98) was also detected in female antennae by reverse transcription PCR. In situ hybridization experiments indicated that each OR gene within the cluster was expressed in a different number of cells. Moreover, there was no co-expression of the two highly related OR genes, MmedOR124 and MmedOR98, which appeared to be individually expressed in a distinct population of neurons. Overall, there were distinct expression profiles of closely related MmedOR genes from the same cluster in M. mediator. These data provide a basic understanding of the olfactory coding in parasitoid wasps. © 2017 The Royal Entomological Society.

  18. The evolution of Dscam genes across the arthropods.

    PubMed

    Armitage, Sophie A O; Freiburg, Rebecca Y; Kurtz, Joachim; Bravo, Ignacio G

    2012-04-13

    One way of creating phenotypic diversity is through alternative splicing of precursor mRNAs. A gene that has evolved a hypervariable form is Down syndrome cell adhesion molecule (Dscam-hv), which in Drosophila melanogaster can produce thousands of isoforms via mutually exclusive alternative splicing. The extracellular region of this protein is encoded by three variable exon clusters, each containing multiple exon variants. The protein is vital for neuronal wiring where the extreme variability at the somatic level is required for axonal guidance, and it plays a role in immunity where the variability has been hypothesised to relate to recognition of different antigens. Dscam-hv has been found across the Pancrustacea. Additionally, three paralogous non-hypervariable Dscam-like genes have also been described for D. melanogaster. Here we took a bioinformatics approach, building profile Hidden Markov Models to search across species for putative orthologs to the Dscam genes and for hypervariable alternatively spliced exons, and inferring the phylogenetic relationships among them. Our aims were to examine whether Dscam orthologs exist outside the Bilateria, whether the origin of Dscam-hv could lie outside the Pancrustacea, when the Dscam-like orthologs arose, how many alternatively spliced exons of each exon cluster were present in the most recent common ancestor, and how these clusters evolved. Our results suggest that the origin of Dscam genes may lie after the split between the Cnidaria and the Bilateria and support the hypothesis that Dscam-hv originated in the common ancestor of the Pancrustacea. Our phylogeny of Dscam gene family members shows six well-supported clades: five containing Dscam-like genes and one containing all the Dscam-hv genes; a seventh clade contains arachnid putative Dscam genes. Furthermore, the exon clusters appear to have experienced different evolutionary histories. 
Dscam genes have undergone independent duplication events in the insects and in an arachnid genome, which adds to the more well-known tandem duplications that have taken place within Dscam-hv genes. Therefore, two forms of gene expansion seem to be active within this gene family. The evolutionary history of this dynamic gene family will be further unfolded as genomes of species from more disparate groups become available.

  19. The evolution of Dscam genes across the arthropods

    PubMed Central

    2012-01-01

    Background One way of creating phenotypic diversity is through alternative splicing of precursor mRNAs. A gene that has evolved a hypervariable form is Down syndrome cell adhesion molecule (Dscam-hv), which in Drosophila melanogaster can produce thousands of isoforms via mutually exclusive alternative splicing. The extracellular region of this protein is encoded by three variable exon clusters, each containing multiple exon variants. The protein is vital for neuronal wiring where the extreme variability at the somatic level is required for axonal guidance, and it plays a role in immunity where the variability has been hypothesised to relate to recognition of different antigens. Dscam-hv has been found across the Pancrustacea. Additionally, three paralogous non-hypervariable Dscam-like genes have also been described for D. melanogaster. Here we took a bioinformatics approach, building profile Hidden Markov Models to search across species for putative orthologs to the Dscam genes and for hypervariable alternatively spliced exons, and inferring the phylogenetic relationships among them. Our aims were to examine whether Dscam orthologs exist outside the Bilateria, whether the origin of Dscam-hv could lie outside the Pancrustacea, when the Dscam-like orthologs arose, how many alternatively spliced exons of each exon cluster were present in the most recent common ancestor, and how these clusters evolved. Results Our results suggest that the origin of Dscam genes may lie after the split between the Cnidaria and the Bilateria and support the hypothesis that Dscam-hv originated in the common ancestor of the Pancrustacea. Our phylogeny of Dscam gene family members shows six well-supported clades: five containing Dscam-like genes and one containing all the Dscam-hv genes; a seventh clade contains arachnid putative Dscam genes. Furthermore, the exon clusters appear to have experienced different evolutionary histories. 
Conclusions Dscam genes have undergone independent duplication events in the insects and in an arachnid genome, which adds to the more well-known tandem duplications that have taken place within Dscam-hv genes. Therefore, two forms of gene expansion seem to be active within this gene family. The evolutionary history of this dynamic gene family will be further unfolded as genomes of species from more disparate groups become available. PMID:22500922

  20. Neighboring Genes Show Correlated Evolution in Gene Expression

    PubMed Central

    Ghanbarian, Avazeh T.; Hurst, Laurence D.

    2015-01-01

    When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543

  1. Familial cancer syndromes and clusters.

    PubMed

    Birch, J M

    1994-07-01

    The study of rare families in which a variety of cancers occur, usually at an early age and with patterns consistent with a common hereditary mechanism, has contributed much to our understanding of the process of carcinogenesis. So far, genes identified as having a role in cancer predisposition in these families have also been important in the histogenesis of sporadic cancers. In the two most clearly defined cancer family syndromes, the Li-Fraumeni syndrome and Lynch syndrome II, the genes involved predispose to diverse but specific constellations of cancers. Genes associated with site-specific familial cancer clusters may also give rise to increased susceptibility to other cancers, and site-specific clusters may represent one end of a spectrum. A consistent feature of familial cancer syndromes is the variable expression within and between families. A challenge for the future will be to determine other factors which may interact with the principal genes involved, giving rise to this variability.

  2. Sioxanthin, a novel glycosylated carotenoid, reveals an unusual subclustered biosynthetic pathway.

    PubMed

    Richter, Taylor K S; Hughes, Chambers C; Moore, Bradley S

    2015-06-01

    Members of the marine actinomycete genus Salinispora constitutively produce a characteristic orange pigment during vegetative growth. Contrary to the understanding of widespread carotenoid biosynthesis pathways in bacteria, Salinispora carotenoid biosynthesis genes are not confined to a single cluster. Instead, bioinformatic and genetic investigations confirm that four regions of the Salinispora tropica CNB-440 genome, consisting of two gene clusters and two independent genes, contribute to the in vivo production of a single carotenoid. This compound, namely (2'S)-1'-(β-D-glucopyranosyloxy)-3',4'-didehydro-1',2'-dihydro-φ,ψ-caroten-2'-ol, is novel and has been given the trivial name 'sioxanthin'. Sioxanthin is a C40-carotenoid, glycosylated on one end of the molecule and containing an aryl moiety on the opposite end. Glycosylation is unusual among actinomycete carotenoids, and sioxanthin joins a rare group of carotenoids with polar and non-polar head groups. Gene sequence homology predicts that the sioxanthin biosynthetic pathway is present in all of the Salinispora as well as other members of the family Micromonosporaceae. Additionally, this study's investigations of clustering of carotenoid biosynthetic genes in heterotrophic bacteria show that a non-clustered genome arrangement is more common than previously suggested, with nearly half of the investigated genomes showing a non-clustered architecture. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  3. The resemblance and disparity of gene expression in dormant and non-dormant seeds and crown buds of leafy spurge (Euphorbia esula)

    USDA-ARS's Scientific Manuscript database

    Overlaps in transcriptome profiles between different phases of bud and seed dormancy have not been determined. Thus, we compared various phases of dormancy between seeds and buds to identify common genes and molecular processes. Cluster analysis of expression profiles for 201 selected genes indicate...

  4. Hox cluster disintegration with persistent anteroposterior order of expression in Oikopleura dioica.

    PubMed

    Seo, Hee-Chan; Edvardsen, Rolf Brudvik; Maeland, Anne Dorthea; Bjordal, Marianne; Jensen, Marit Flo; Hansen, Anette; Flaat, Mette; Weissenbach, Jean; Lehrach, Hans; Wincker, Patrick; Reinhardt, Richard; Chourrout, Daniel

    2004-09-02

    Tunicate embryos and larvae have small cell numbers and simple anatomical features in comparison with other chordates, including vertebrates. Although they branch near the base of chordate phylogenetic trees, their degree of divergence from the common chordate ancestor remains difficult to evaluate. Here we show that the tunicate Oikopleura dioica has a complement of nine Hox genes in which all central genes are lacking but a full vertebrate-like set of posterior genes is present. In contrast to all bilaterians studied so far, Hox genes are not clustered in the Oikopleura genome. Their expression occurs mostly in the tail, with some tissue preference, and a strong partition of expression domains in the nerve cord, in the notochord and in the muscle. In each tissue of the tail, the anteroposterior order of Hox gene expression evokes spatial collinearity, with several alterations. We propose a relationship between the Hox cluster breakdown, the separation of Hox expression domains, and a transition to a determinative mode of development.

  5. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    Background The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.

  6. Accounting for noise when clustering biological data.

    PubMed

    Sloutsky, Roman; Jimenez, Nicolas; Swamidass, S Joshua; Naegle, Kristen M

    2013-07-01

    Clustering is a powerful and commonly used technique that organizes and elucidates the structure of biological data. Clustering data from gene expression, metabolomics and proteomics experiments has proven to be useful at deriving a variety of insights, such as the shared regulation or function of biochemical components within networks. However, experimental measurements of biological processes are subject to substantial noise, stemming from both technical and biological variability, and most clustering algorithms are sensitive to this noise. In this article, we explore several methods of accounting for noise when analyzing biological data sets through clustering. Using a toy data set and two different case studies (gene expression and protein phosphorylation), we demonstrate the sensitivity of clustering algorithms to noise. Several methods of accounting for this noise can be used to establish when clustering results can be trusted. These methods span a range of assumptions about the statistical properties of the noise and can therefore be applied to virtually any biological data source.
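
    One generic way to probe how sensitive a clustering is to measurement noise is to perturb the data repeatedly, recluster, and record how often each pair of items lands in the same cluster. The sketch below does this with a small k-means written against the standard library. It is an illustration under stated assumptions, not necessarily any of the methods the article evaluates: Gaussian noise with a single standard deviation, the deterministic initialization, and the function names are all choices made here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Tiny k-means; initial centers are taken at even steps through the input."""
    points = [tuple(p) for p in points]
    step = len(points) / k
    centers = [points[int(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's old center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def coclustering_frequency(points, k, noise_sd, n_rounds=100, seed=0):
    """Fraction of noisy reclusterings in which items i and j share a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_rounds):
        noisy = [tuple(x + rng.gauss(0.0, noise_sd) for x in p) for p in points]
        labels = kmeans(noisy, k)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [[count / n_rounds for count in row] for row in together]
```

    Pairs whose frequency stays near 1 or near 0 as `noise_sd` grows are assignments that survive the assumed noise level; intermediate values flag cluster memberships that should be interpreted with caution.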

  7. The transcriptome of a complete episode of acute otitis media.

    PubMed

    Hernandez, Michelle; Leichtle, Anke; Pak, Kwang; Webster, Nicholas J; Wasserman, Stephen I; Ryan, Allen F

    2015-04-03

    Otitis media is the most common disease of childhood, and represents an important health challenge to the 10-15% of children who experience chronic/recurrent middle ear infections. The middle ear undergoes extensive modifications during otitis media, potentially involving changes in the expression of many genes. Expression profiling offers an opportunity to discover novel genes and pathways involved in this common childhood disease. The middle ears of 320 WBxB6 F1 hybrid mice were inoculated with non-typeable Haemophilus influenzae (NTHi) or PBS (sham control). Two independent samples were generated for each time point and condition, from initiation of infection to resolution. RNA was profiled on Affymetrix mouse 430 2.0 whole-genome microarrays. Approximately 8% of the sampled transcripts defined the signature of acute NTHi-induced otitis media across time. Hierarchical clustering of signal intensities revealed several temporal gene clusters. Network and pathway enrichment analysis of these clusters identified sets of genes involved in activation of the innate immune response, negative regulation of immune response, changes in epithelial and stromal cell markers, and the recruitment/function of neutrophils and macrophages. We also identified key transcriptional regulators related to events in otitis media, which likely determine the expression of these gene clusters. A list of otitis media susceptibility genes, derived from genome-wide association and candidate gene studies, was significantly enriched during the early induction phase and the middle re-modeling phase of otitis but not in the resolution phase. Our results further indicate that positive versus negative regulation of inflammatory processes occur with highly similar kinetics during otitis media, underscoring the importance of anti-inflammatory responses in controlling pathogenesis. 
The results characterize the global gene response during otitis media and identify key signaling and transcription factor networks that control the defense of the middle ear against infection. These networks deserve further attention, as dysregulated immune defense and inflammatory responses may contribute to recurrent or chronic otitis in children.

  8. Interspecific and intraspecific gene variability in a 1-Mb region containing the highest density of NBS-LRR genes found in the melon genome.

    PubMed

    González, Víctor M; Aventín, Núria; Centeno, Emilio; Puigdomènech, Pere

    2014-12-17

    Plant NBS-LRR resistance genes tend to be found in clusters, which have been shown to be hot spots of genome variability. In melon, half of the 81 predicted NBS-LRR genes group in nine clusters, and a 1 Mb region on linkage group V contains the highest density of R-genes and presence/absence gene polymorphisms found in the melon genome. This region is known to contain the locus of Vat, an agronomically important gene that confers resistance to aphids. However, the presence of duplications makes the sequencing and annotation of R-gene clusters difficult, usually resulting in multi-gapped sequences with higher than average errors. A 1-Mb sequence that contains the largest NBS-LRR gene cluster found in melon was improved using a strategy that combines Illumina paired-end mapping and PCR-based gap closing. Unknown sequence was decreased by 70% while about 3,000 SNPs and small indels were corrected. As a result, the annotations of 18 of a total of 23 NBS-LRR genes found in this region were modified, including additional coding sequences, amino acid changes, correction of splicing boundaries, or fusion of ORFs in common transcription units. A phylogeny analysis of the R-genes and their comparison with syntenic sequences in other cucurbits point to a pattern of local gene amplifications since the diversification of cucurbits from other families, and through speciation within the family. A candidate Vat gene is proposed based on the sequence similarity between a reported Vat gene from a Korean melon cultivar and a sequence fragment previously absent in the unrefined sequence. A sequence refinement strategy allowed substantial improvement of a 1 Mb fragment of the melon genome and the re-annotation of the largest cluster of NBS-LRR gene homologues found in melon. 
Analysis of the cluster revealed that resistance genes have been produced by sequence duplication in adjacent genome locations since the divergence of cucurbits from other close families, and through the process of speciation within the family. A candidate Vat gene was also identified using sequence previously unavailable, which demonstrates the advantages of genome assembly refinements when analyzing complex regions such as those containing clusters of highly similar genes.

  9. Prenatal Diagnosis and Molecular Analysis of a Large Novel Deletion (- -JS) Causing α0-Thalassemia.

    PubMed

    Cao, Jinru; He, Shuzhen; Pu, Yudong; Liu, Jingjing; Liu, Fuping; Feng, Jun

    α-Thalassemia (α-thal) is a very common single gene hereditary disease caused by large deletions or point mutations of the α-globin gene cluster in tropical and subtropical regions of the world. Here, we report for the first time, a novel large α-thal deletion in a Chinese family from Jiangsu Province, People's Republic of China (PRC), which removes almost the entire α2 and α1 genes from the α-globin gene cluster. Thus, it was named the Jiangsu deletion (- -JS) on the α-globin gene cluster causing α0-thal. Heterozygotes for this deletion showed an α-thal trait phenotype with reduced mean corpuscular volume (MCV) and mean corpuscular hemoglobin (Hb) (MCH) levels. The sequencing results showed that a 2538 bp deletion (NG_000006.1: g.35801_38338) existed in this novel genotype on the basis of -α4.2 (leftward), indicating a deletion of about 6.8 kb from the α-globin cluster. In addition, a 29 bp sequence was inserted into the deletion during the recombination events that led to this deletion. Through pedigree analysis, we knew that the proband inherited the novel allele from his mother.

  10. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    PubMed Central

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S.

    2016-01-01

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: Hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. 
Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain. PMID:27933038

  11. Haplotype analysis of the apolipoprotein gene cluster on human chromosome 11

    PubMed Central

    Olivier, Michael; Wang, Xujing; Cole, Regina; Gau, Brian; Kim, Jessica; Rubin, Edward M.; Pennacchio, Len A.

    2009-01-01

    Members of the apolipoprotein gene cluster (APOA1/C3/A4/A5) on human chromosome 11q23 play an important role in lipid metabolism. Polymorphisms in both APOA5 and APOC3 are strongly associated with plasma triglyceride concentrations. The close genomic locations of these two genes as well as their functional similarity have hindered efforts to define whether each gene independently influences human triglyceride concentrations. In this study, we examined the linkage disequilibrium and haplotype structure of 49 SNPs in a 150-kb region spanning the gene cluster. We identified a total of five common APOA5 haplotypes with a frequency of greater than 8% in samples of northern European origin. The APOA5 haplotype block did not extend past the 7 SNPs in the gene and was separated from the other apolipoprotein gene in the cluster by a region of significantly increased recombination. Furthermore, one previously identified triglyceride risk haplotype of APOA5 (APOA5*3) showed no association with three APOC3 SNPs previously associated with triglyceride concentrations, in contrast to the other risk haplotype (APOA5*2), which was associated with all three minor APOC3 SNP alleles. These results highlight the complex genetic relationship between APOA5 and APOC3 and support the notion that APOA5 represents an independent risk gene affecting plasma triglyceride concentrations in humans. PMID:15081120

  12. When Genome-Based Approach Meets the “Old but Good”: Revealing Genes Involved in the Antibacterial Activity of Pseudomonas sp. P482 against Soft Rot Pathogens

    PubMed Central

    Krzyżanowska, Dorota M.; Ossowicki, Adam; Rajewska, Magdalena; Maciąg, Tomasz; Jabłońska, Magdalena; Obuchowski, Michał; Heeb, Stephan; Jafra, Sylwia

    2016-01-01

    Dickeya solani and Pectobacterium carotovorum subsp. brasiliense are recently established species of bacterial plant pathogens causing black leg and soft rot of many vegetables and ornamental plants. Pseudomonas sp. strain P482 inhibits the growth of these pathogens, a desired trait considering the limited measures to combat these diseases. In this study, we determined the genetic background of the antibacterial activity of P482, and established the phylogenetic position of this strain. Pseudomonas sp. P482 was classified as Pseudomonas donghuensis. Genome mining revealed that the P482 genome does not contain genes determining the synthesis of known antimicrobials. However, the ClusterFinder algorithm, designed to detect atypical or novel classes of secondary metabolite gene clusters, predicted 18 such clusters in the genome. Screening of a Tn5 mutant library yielded an antimicrobial negative transposon mutant. The transposon insertion was located in a gene encoding an HpcH/HpaI aldolase/citrate lyase family protein. This gene is located in a hypothetical cluster predicted by the ClusterFinder, together with the downstream homologs of four nfs genes, that confer production of a non-fluorescent siderophore by P. donghuensis HYST. Site-directed inactivation of the HpcH/HpaI aldolase gene, the adjacent short chain dehydrogenase gene, as well as a homolog of an essential nfs cluster gene, all abolished the antimicrobial activity of the P482, suggesting their involvement in a common biosynthesis pathway. However, none of the mutants showed a decreased siderophore yield, neither was the antimicrobial activity of the wild type P482 compromised by high iron bioavailability. A genomic region comprising the nfs cluster and three upstream genes is involved in the antibacterial activity of P. donghuensis P482 against D. solani and P. carotovorum subsp. brasiliense. The genes studied are unique to the two known P. donghuensis strains. 
This study illustrates that mining of microbial genomes is a powerful approach for predicting the presence of novel secondary-metabolite encoding genes, especially when coupled with transposon mutagenesis. PMID:27303376

  13. Neighboring Genes Show Correlated Evolution in Gene Expression.

    PubMed

    Ghanbarian, Avazeh T; Hurst, Laurence D

    2015-07-01

    When considering the evolution of a gene's expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. A De Novo Deletion in the Regulators of Complement Activation Cluster Producing a Hybrid Complement Factor H/Complement Factor H-Related 3 Gene in Atypical Hemolytic Uremic Syndrome.

    PubMed

    Challis, Rachel C; Araujo, Geisilaine S R; Wong, Edwin K S; Anderson, Holly E; Awan, Atif; Dorman, Anthony M; Waldron, Mary; Wilson, Valerie; Brocklebank, Vicky; Strain, Lisa; Morgan, B Paul; Harris, Claire L; Marchbank, Kevin J; Goodship, Timothy H J; Kavanagh, David

    2016-06-01

    The regulators of complement activation cluster at chromosome 1q32 contains the complement factor H (CFH) and five complement factor H-related (CFHR) genes. This area of the genome arose from several large genomic duplications, and these low-copy repeats can cause genome instability in this region. Genomic disorders affecting these genes have been described in atypical hemolytic uremic syndrome, arising commonly through nonallelic homologous recombination. We describe a novel CFH/CFHR3 hybrid gene secondary to a de novo 6.3-kb deletion that arose through microhomology-mediated end joining rather than nonallelic homologous recombination. We confirmed a transcript from this hybrid gene and showed a secreted protein product that lacks the recognition domain of factor H and exhibits impaired cell surface complement regulation. The fact that the formation of this hybrid gene arose as a de novo event suggests that this cluster is a dynamic area of the genome in which additional genomic disorders may arise. Copyright © 2016 by the American Society of Nephrology.

  15. Gene expression pattern recognition algorithm inferences to classify samples exposed to chemical agents

    NASA Astrophysics Data System (ADS)

    Bushel, Pierre R.; Bennett, Lee; Hamadeh, Hisham; Green, James; Ableson, Alan; Misener, Steve; Paules, Richard; Afshari, Cynthia

    2002-06-01

    We present an analysis of pattern recognition procedures used to predict the classes of samples exposed to pharmacologic agents by comparing gene expression patterns from samples treated with two classes of compounds. Rat liver mRNA samples following exposure for 24 hours with phenobarbital or peroxisome proliferators were analyzed using a 1700 rat cDNA microarray platform. Sets of genes that were consistently differentially expressed in the rat liver samples following treatment were stored in the MicroArray Project System (MAPS) database. MAPS identified 238 genes in common that possessed a low probability (P < 0.01) of being randomly detected as differentially expressed at the 95% confidence level. Hierarchical cluster analysis on the 238 genes clustered specific gene expression profiles that separated samples based on exposure to a particular class of compound.

  16. The Chloroplast atpA Gene Cluster in Chlamydomonas reinhardtii

    PubMed Central

    Drapier, Dominique; Suzuki, Hideki; Levy, Haim; Rimbault, Blandine; Kindle, Karen L.; Stern, David B.; Wollman, Francis-André

    1998-01-01

    Most chloroplast genes in vascular plants are organized into polycistronic transcription units, which generate a complex pattern of mono-, di-, and polycistronic transcripts. In contrast, most Chlamydomonas reinhardtii chloroplast transcripts characterized to date have been monocistronic. This paper describes the atpA gene cluster in the C. reinhardtii chloroplast genome, which includes the atpA, psbI, cemA, and atpH genes, encoding the α-subunit of the coupling-factor-1 (CF1) ATP synthase, a small photosystem II polypeptide, a chloroplast envelope membrane protein, and subunit III of the CF0 ATP synthase, respectively. We show that promoters precede the atpA, psbI, and atpH genes, but not the cemA gene, and that cemA mRNA is present only as part of di-, tri-, or tetracistronic transcripts. Deletions introduced into the gene cluster reveal, first, that CF1-α can be translated from di- or polycistronic transcripts, and, second, that substantial reductions in mRNA quantity have minimal effects on protein synthesis rates. We suggest that posttranscriptional mRNA processing is common in C. reinhardtii chloroplasts, permitting the expression of multiple genes from a single promoter. PMID:9625716

  17. Pan-genome and phylogeny of Bacillus cereus sensu lato.

    PubMed

    Bazinet, Adam L

    2017-08-02

    Bacillus cereus sensu lato (s. l.) is an ecologically diverse bacterial group of medical and agricultural significance. In this study, I use publicly available genomes and novel bioinformatic workflows to characterize the B. cereus s. l. pan-genome and perform the largest phylogenetic and population genetic analyses of this group to date in terms of the number of genes and taxa included. With these fundamental data in hand, I identify genes associated with particular phenotypic traits (i.e., "pan-GWAS" analysis), and quantify the degree to which taxa sharing common attributes are phylogenetically clustered. A rapid k-mer based approach (Mash) was used to create reduced representations of selected Bacillus genomes, and a fast distance-based phylogenetic analysis of this data (FastME) was performed to determine which species should be included in B. cereus s. l. The complete genomes of eight B. cereus s. l. species were annotated de novo with Prokka, and these annotations were used by Roary to produce the B. cereus s. l. pan-genome. Scoary was used to associate gene presence and absence patterns with various phenotypes. The orthologous protein sequence clusters produced by Roary were filtered and used to build HaMStR databases of gene models that were used in turn to construct phylogenetic data matrices. Phylogenetic analyses used RAxML, DendroPy, ClonalFrameML, PAUP*, and SplitsTree. Bayesian model-based population genetic analysis assigned taxa to clusters using hierBAPS. The genealogical sorting index was used to quantify the phylogenetic clustering of taxa sharing common attributes. The B. cereus s. l. pan-genome currently consists of ≈60,000 genes, ≈600 of which are "core" (common to at least 99% of taxa sampled). Pan-GWAS analysis revealed genes associated with phenotypes such as isolation source, oxygen requirement, and ability to cause diseases such as anthrax or food poisoning. 
Extensive phylogenetic analyses using an unprecedented amount of data produced phylogenies that were largely concordant with each other and with previous studies. Phylogenetic support as measured by bootstrap probabilities increased markedly when all suitable pan-genome data was included in phylogenetic analyses, as opposed to when only core genes were used. Bayesian population genetic analysis recommended subdividing the three major clades of B. cereus s. l. into nine clusters. Taxa sharing common traits and species designations exhibited varying degrees of phylogenetic clustering. All phylogenetic analyses recapitulated two previously used classification systems, and taxa were consistently assigned to the same major clade and group. By including accessory genes from the pan-genome in the phylogenetic analyses, I produced an exceptionally well-supported phylogeny of 114 complete B. cereus s. l. genomes. The best-performing methods were used to produce a phylogeny of all 498 publicly available B. cereus s. l. genomes, which was in turn used to compare three different classification systems and to test the monophyly status of various B. cereus s. l. species. The majority of the methodology used in this study is generic and could be leveraged to produce pan-genome estimates and similarly robust phylogenetic hypotheses for other bacterial groups.
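The abstract above defines "core" genes as those common to at least 99% of the taxa sampled. A minimal sketch of that rule follows, assuming a gene presence/absence table is already available as a mapping from gene name to the set of taxa carrying it. The function name `core_genes`, the toy gene names, and the taxon labels are hypothetical and are not taken from the study or from any of the tools it names.

```python
def core_genes(presence, n_taxa, threshold=0.99):
    """Return the genes carried by at least `threshold` of the sampled taxa.

    presence  -- dict mapping gene name -> set of taxa that carry the gene
    n_taxa    -- total number of taxa sampled (assumed known up front)
    threshold -- minimum fraction of taxa required (0.99, as in the abstract)
    """
    if n_taxa <= 0:
        raise ValueError("n_taxa must be positive")
    return sorted(
        gene
        for gene, carriers in presence.items()
        if len(carriers) / n_taxa >= threshold
    )


# Toy data (hypothetical): 200 taxa and three genes.
taxa = [f"taxon_{i}" for i in range(200)]
toy_presence = {
    "geneA": set(taxa),        # present in 200 of 200 taxa
    "geneB": set(taxa[:199]),  # present in 199 of 200 taxa
    "geneC": set(taxa[:150]),  # present in 150 of 200 taxa (accessory)
}

print(core_genes(toy_presence, len(taxa)))
```

Passing the number of taxa explicitly, rather than inferring it from the table, keeps taxa that carry none of the listed genes in the denominator.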

  18. Effects of Polymorphisms in APOA4-APOA5-ZNF259-BUD13 Gene Cluster on Plasma Levels of Triglycerides and Risk of Coronary Heart Disease in a Chinese Han Population

    PubMed Central

    Su, Li; Zhang, Mingjun; Wang, Long; Jing, Jinjin; Zhou, Li

    2015-01-01

    Background/Aim Recent genome-wide association studies have identified several loci influencing lipid levels. The present study focused on the triglycerides (TG)-associated locus, the APOA4-APOA5-ZNF259-BUD13 gene cluster on chromosome 11, to explore the role of genetic variants in this gene cluster in the development of increasing TG levels and coronary heart disease (CHD). Methodology/Principal Findings Six single nucleotide polymorphisms (SNPs), rs4417316, rs651821, rs6589566, rs7396835, rs964184 and rs17119975, in the APOA4-APOA5-ZNF259-BUD13 gene cluster were selected and genotyped in 5374 healthy Chinese subjects. There were strong significant associations between the six SNPs and TG levels (P < 1.0×10^-8). Moreover, a weighted genotype score was found to be associated with TG levels (P = 3.28×10^-13). The frequencies of three common haplotypes were observed to be significantly different between the high TG group and the low TG group (P < 0.05). However, no significant effects were found for the SNPs regarding susceptibility to CHD in the Chinese case-control populations. Conclusions/Significance This study highlights the genotypes, genotype scores and haplotypes of the APOA4-APOA5-ZNF259-BUD13 gene cluster that were associated with TG levels in a Chinese population; however, the genetic variants in this gene cluster did not increase the risk of CHD in the Chinese population. PMID:26397108

  19. The Epipolythiodiketopiperazine Gene Cluster in Claviceps purpurea: Dysfunctional Cytochrome P450 Enzyme Prevents Formation of the Previously Unknown Clapurines.

    PubMed

    Dopstadt, Julian; Neubauer, Lisa; Tudzynski, Paul; Humpf, Hans-Ulrich

    2016-01-01

    Claviceps purpurea is an important food contaminant and well known for the production of the toxic ergot alkaloids. Apart from that, little is known about its secondary metabolism and not all toxic substances going along with the food contamination with Claviceps are known yet. We explored the metabolite profile of a gene cluster in C. purpurea with a high homology to gene clusters, which are responsible for the formation of epipolythiodiketopiperazine (ETP) toxins in other fungi. By overexpressing the transcription factor, we were able to activate the cluster in the standard C. purpurea strain 20.1. Although all necessary genes for the formation of the characteristic disulfide bridge were expressed in the overexpression mutants, the fungus did not produce any ETPs. Isolation of pathway intermediates showed that the common biosynthetic pathway stops after the first steps. Our results demonstrate that hydroxylation of the diketopiperazine backbone is the critical step during the ETP biosynthesis. Due to a dysfunctional enzyme, the fungus is not able to produce toxic ETPs. Instead, the pathway end-products are new unusual metabolites with a unique nitrogen-sulfur bond. By heterologous expression of the Leptosphaeria maculans cytochrome P450 encoding gene sirC, we were able to identify the end-products of the ETP cluster in C. purpurea. The thioclapurines are so far unknown ETPs, which might contribute to the toxicity of other C. purpurea strains with a potentially intact ETP cluster.

  20. The Epipolythiodiketopiperazine Gene Cluster in Claviceps purpurea: Dysfunctional Cytochrome P450 Enzyme Prevents Formation of the Previously Unknown Clapurines

    PubMed Central

    Tudzynski, Paul; Humpf, Hans-Ulrich

    2016-01-01

    Claviceps purpurea is an important food contaminant and well known for the production of the toxic ergot alkaloids. Apart from that, little is known about its secondary metabolism and not all toxic substances going along with the food contamination with Claviceps are known yet. We explored the metabolite profile of a gene cluster in C. purpurea with a high homology to gene clusters, which are responsible for the formation of epipolythiodiketopiperazine (ETP) toxins in other fungi. By overexpressing the transcription factor, we were able to activate the cluster in the standard C. purpurea strain 20.1. Although all necessary genes for the formation of the characteristic disulfide bridge were expressed in the overexpression mutants, the fungus did not produce any ETPs. Isolation of pathway intermediates showed that the common biosynthetic pathway stops after the first steps. Our results demonstrate that hydroxylation of the diketopiperazine backbone is the critical step during the ETP biosynthesis. Due to a dysfunctional enzyme, the fungus is not able to produce toxic ETPs. Instead, the pathway end-products are new unusual metabolites with a unique nitrogen-sulfur bond. By heterologous expression of the Leptosphaeria maculans cytochrome P450 encoding gene sirC, we were able to identify the end-products of the ETP cluster in C. purpurea. The thioclapurines are so far unknown ETPs, which might contribute to the toxicity of other C. purpurea strains with a potentially intact ETP cluster. PMID:27390873

  1. A pyrosequencing assay for the quantitative methylation analysis of the PCDHB gene cluster, the major factor in neuroblastoma methylator phenotype.

    PubMed

    Banelli, Barbara; Brigati, Claudio; Di Vinci, Angela; Casciano, Ida; Forlani, Alessandra; Borzì, Luana; Allemanni, Giorgio; Romani, Massimo

    2012-03-01

    Epigenetic alterations are hallmarks of cancer and powerful biomarkers, whose clinical utilization is made difficult by the absence of standardization and of common methods of data interpretation. The coordinate methylation of many loci in cancer is defined as 'CpG island methylator phenotype' (CIMP) and identifies clinically distinct groups of patients. In neuroblastoma (NB), CIMP is defined by a methylation signature, which includes different loci, but its predictive power on outcome is entirely recapitulated by the PCDHB cluster only. We have developed a robust and cost-effective pyrosequencing-based assay that could facilitate the clinical application of CIMP in NB. This assay permits the unbiased simultaneous amplification and sequencing of 17 out of 19 genes of the PCDHB cluster for quantitative methylation analysis, taking into account all the sequence variations. As some of these variations were at CpG doublets, we bypassed the data interpretation conducted by the methylation analysis software to assign the corrected methylation value at these sites. The final result of the assay is the mean methylation level of 17 gene fragments in the protocadherin B (PCDHB) cluster. We have utilized this assay to compare the methylation levels of the PCDHB cluster between high-risk and very low-risk NB patients, confirming the predictive value of CIMP. Our results demonstrate that the pyrosequencing-based assay herein described is a powerful instrument for the analysis of this gene cluster that may simplify the data comparison between different laboratories and, in perspective, could facilitate its clinical application. Furthermore, our results demonstrate that, in principle, pyrosequencing can be efficiently utilized for the methylation analysis of gene clusters with high internal homologies.
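The assay readout described above is a single number: the mean methylation level across the sequenced gene fragments. The sketch below shows one way such a summary could be computed. It assumes each fragment is first averaged over its own CpG sites and the fragment means are then averaged; the abstract does not specify this, so the two-stage averaging, the function names, and the toy percentages are all illustrative assumptions.

```python
from statistics import mean


def fragment_methylation(cpg_percentages):
    """Mean methylation (%) over the CpG sites of one gene fragment."""
    return mean(cpg_percentages)


def cluster_methylation(fragments):
    """Mean of the per-fragment means (%), i.e. one value per sample.

    fragments -- dict mapping fragment name -> list of per-CpG methylation
                 percentages (hypothetical input layout)
    """
    return mean(fragment_methylation(sites) for sites in fragments.values())


# Toy sample (hypothetical values) with two fragments instead of 17.
toy_sample = {
    "fragment_1": [10.0, 20.0],  # fragment mean 15.0
    "fragment_2": [30.0, 50.0],  # fragment mean 40.0
}

print(cluster_methylation(toy_sample))
```

Averaging per fragment first gives every fragment equal weight regardless of how many CpG sites it contains; pooling all sites before averaging would be the alternative choice.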

  2. The group B streptococcal sialic acid O-acetyltransferase is encoded by neuD, a conserved component of bacterial sialic acid biosynthetic gene clusters.

    PubMed

    Lewis, Amanda L; Hensler, Mary E; Varki, Ajit; Nizet, Victor

    2006-04-21

    Nearly two dozen microbial pathogens have surface polysaccharides or lipo-oligosaccharides that contain sialic acid (Sia), and several Sia-dependent virulence mechanisms are known to enhance bacterial survival or result in host tissue injury. Some pathogens are also known to O-acetylate their Sias, although the role of this modification in pathogenesis remains unclear. We report that neuD, a gene located within the Group B Streptococcus (GBS) Sia biosynthetic gene cluster, encodes a Sia O-acetyltransferase that is itself required for capsular polysaccharide (CPS) sialylation. Homology modeling and site-directed mutagenesis identified Lys-123 as a critical residue for Sia O-acetyltransferase activity. Moreover, a single nucleotide polymorphism in neuD can determine whether GBS displays a "high" or "low" Sia O-acetylation phenotype. Complementation analysis revealed that Escherichia coli K1 NeuD also functions as a Sia O-acetyltransferase in GBS. In fact, NeuD homologs are commonly found within Sia biosynthetic gene clusters. A bioinformatic approach identified 18 bacterial species with a Sia biosynthetic gene cluster that included neuD. Included in this list are the sialylated human pathogens Legionella pneumophila, Vibrio parahaemolyticus, Pseudomonas aeruginosa, and Campylobacter jejuni, as well as an additional 12 bacterial species never before analyzed for Sia expression. Phylogenetic analysis shows that NeuD homologs of sialylated pathogens share a common evolutionary lineage distinct from the poly-Sia O-acetyltransferase of E. coli K1. These studies define a molecular genetic approach for the selective elimination of GBS Sia O-acetylation without concurrent loss of sialylation, a key to further studies addressing the role(s) of this modification in bacterial virulence.

  3. Chassis organism from Corynebacterium glutamicum--a top-down approach to identify and delete irrelevant gene clusters.

    PubMed

    Unthan, Simon; Baumgart, Meike; Radek, Andreas; Herbst, Marius; Siebert, Daniel; Brühl, Natalie; Bartsch, Anna; Bott, Michael; Wiechert, Wolfgang; Marin, Kay; Hans, Stephan; Krämer, Reinhard; Seibold, Gerd; Frunzke, Julia; Kalinowski, Jörn; Rückert, Christian; Wendisch, Volker F; Noack, Stephan

    2015-02-01

    For synthetic biology applications, a robust structural basis is required, which can be constructed either from scratch or in a top-down approach starting from any existing organism. In this study, we initiated the top-down construction of a chassis organism from Corynebacterium glutamicum ATCC 13032, aiming for the relevant gene set to maintain its fast growth on defined medium. We evaluated each native gene for its essentiality considering expression levels, phylogenetic conservation, and knockout data. Based on this classification, we determined 41 gene clusters ranging from 3.7 to 49.7 kbp as target sites for deletion. 36 deletions were successful and 10 genome-reduced strains showed impaired growth rates, indicating that genes were hit, which are relevant to maintain biological fitness at wild-type level. In contrast, 26 deleted clusters were found to include exclusively irrelevant genes for growth on defined medium. A combinatory deletion of all irrelevant gene clusters would, in a prophage-free strain, decrease the size of the native genome by about 722 kbp (22%) to 2561 kbp. Finally, five combinatory deletions of irrelevant gene clusters were investigated. The study introduces the novel concept of relevant genes and demonstrates general strategies to construct a chassis suitable for biotechnological application. © 2014 The Authors. Biotechnology Journal published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. This is an open access article under the terms of the Creative Commons Attribution-Non-Commercial-NoDerivs Licence, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.

  4. Conserved gene clusters in bacterial genomes provide further support for the primacy of RNA

    NASA Technical Reports Server (NTRS)

    Siefert, J. L.; Martin, K. A.; Abdi, F.; Widger, W. R.; Fox, G. E.

    1997-01-01

    Five complete bacterial genome sequences have been released to the scientific community. These include four (eu)Bacteria, Haemophilus influenzae, Mycoplasma genitalium, M. pneumoniae, and Synechocystis PCC 6803, as well as one Archaeon, Methanococcus jannaschii. Features of organization shared by these genomes are likely to have arisen very early in the history of the bacteria and thus can be expected to provide further insight into the nature of early ancestors. Results of a genome comparison of these five organisms confirm earlier observations that gene order is remarkably unpreserved. There are, nevertheless, at least 16 clusters of two or more genes whose order remains the same among the four (eu)Bacteria and these are presumed to reflect conserved elements of coordinated gene expression that require gene proximity. Eight of these gene orders are essentially conserved in the Archaea as well. Many of these clusters are known to be regulated by RNA-level mechanisms in Escherichia coli, which supports the earlier suggestion that this type of regulation of gene expression may have arisen very early. We conclude that although the last common ancestor may have had a DNA genome, it likely was preceded by progenotes with an RNA genome.

  5. Biosynthesis of actinorhodin and related antibiotics: discovery of alternative routes for quinone formation encoded in the act gene cluster.

    PubMed

    Okamoto, Susumu; Taguchi, Takaaki; Ochi, Kozo; Ichinose, Koji

    2009-02-27

    All known benzoisochromanequinone (BIQ) biosynthetic gene clusters carry a set of genes encoding a two-component monooxygenase homologous to the ActVA-ORF5/ActVB system for actinorhodin biosynthesis in Streptomyces coelicolor A3(2). Here, we conducted molecular genetic and biochemical studies of this enzyme system. Inactivation of actVA-ORF5 yielded a shunt product, actinoperylone (ACPL), apparently derived from 6-deoxy-dihydrokalafungin. Similarly, deletion of actVB resulted in accumulation of ACPL, indicating a critical role for the monooxygenase system in C-6 oxygenation, a biosynthetic step common to all BIQ biosyntheses. Furthermore, in vitro, we showed a quinone-forming activity of the ActVA-ORF5/ActVB system in addition to that of a known C-6 monooxygenase, ActVA-ORF6, by using emodinanthrone as a model substrate. Our results demonstrate that the act gene cluster encodes two alternative routes for quinone formation by C-6 oxygenation in BIQ biosynthesis.

  6. Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury.

    PubMed

    Ryge, Jesper; Winther, Ole; Wienecke, Jacob; Sandelin, Albin; Westerdahl, Ann-Charlotte; Hultborn, Hans; Kiehn, Ole

    2010-06-09

    Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability.

  7. Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury

    PubMed Central

    2010-01-01

    Background Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. Conclusions This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability. PMID:20534130

  8. Differential gene expression profiles of peripheral blood mononuclear cells in childhood asthma.

    PubMed

    Kong, Qian; Li, Wen-Jing; Huang, Hua-Rong; Zhong, Ying-Qiang; Fang, Jian-Pei

    2015-05-01

    Asthma is a common childhood disease with strong genetic components. This study compared whole-genome expression differences between asthmatic young children and healthy controls to identify gene signatures of childhood asthma. Total RNA extracted from peripheral blood mononuclear cells (PBMC) was subjected to microarray analysis. QRT-PCR was performed to verify the microarray results. Classification and functional characterization of differential genes were illustrated by hierarchical clustering and gene ontology analysis. Multiple logistic regression (MLR) analysis, receiver operating characteristic (ROC) curve analysis, and discriminate power were used to scan asthma-specific diagnostic markers. For fold-change > 2 and p < 0.05, there were 758 named differential genes. The results of QRT-PCR successfully confirmed the array data. Hierarchical clustering divided 29 highly possible genes into seven categories and the genes in the same cluster were likely to possess similar expression patterns or functions. Gene ontology analysis showed that differential genes were primarily enriched in immune response, response to stress or stimulus, and regulation of apoptosis in biological process. MLR and ROC curve analysis revealed that the combination of ADAM33, Smad7, and LIGHT possessed excellent discriminating power. The combination of ADAM33, Smad7, and LIGHT would be a reliable and useful childhood asthma model for prediction and diagnosis.

  9. A HIV-1 heterosexual transmission chain in Guangzhou, China: a molecular epidemiological study.

    PubMed

    Han, Zhigang; Leung, Tommy W C; Zhao, Jinkou; Wang, Ming; Fan, Lirui; Li, Kai; Pang, Xinli; Liang, Zhenbo; Lim, Wilina W L; Xu, Huifang

    2009-09-25

    We conducted molecular analyses to confirm four clustering HIV-1 infections (Patient A, B, C & D) in Guangzhou, China. These cases were identified by epidemiological investigation and suspected to acquire the infection through a common heterosexual transmission chain. Env C2V3V4 region, gag p17/p24 junction and partial pol gene of HIV-1 genome from serum specimens of these infected cases were amplified by reverse transcription polymerase chain reaction (RT-PCR) and nucleotide sequenced. Phylogenetic analyses indicated that their viral nucleotide sequences were significantly clustered together (bootstrap value is 99%, 98% and 100% in env, gag and pol tree respectively). Evolutionary distance analysis indicated that their genetic diversities of env, gag and pol genes were significantly lower than non-clustered controls, as measured by unpaired t-test (env gene comparison: p < 0.005; gag gene comparison: p < 0.005; pol gene comparison: p < 0.005). Epidemiological results and molecular analyses consistently illustrated these four cases represented a transmission chain which dispersed in the locality through heterosexual contact involving commercial sex worker.

  10. Concordance of transcriptional and apical benchmark dose levels for conazole-induced liver effects in mice

    EPA Science Inventory

    The ability to anchor chemical class-based gene expression changes to phenotypic lesions and to describe these changes as a function of dose and time can inform mode of action and improve quantitative risk assessment. Previous research identified a 330-gene cluster commonly resp...

  11. Characterization of three different clusters of 18S-26S ribosomal DNA genes in the sea urchin P. lividus: Genetic and epigenetic regulation synchronous to 5S rDNA.

    PubMed

    Bellavia, Daniele; Dimarco, Eufrosina; Caradonna, Fabio

    2016-04-15

    We previously reported the characterization of 5S ribosomal DNA (rDNA) clusters in the common sea urchin Paracentrotus lividus and demonstrated the presence of DNA methylation-dependent silencing of embryo specific 5S rDNA cluster in adult tissue. In this work, we show genetic and epigenetic characterization of 18S-26S rDNA clusters in this species. The results indicate the presence of three different 18S-26S rDNA clusters with different Non-Transcribed Spacer (NTS) regions that have different chromosomal localizations. Moreover, we show that the two largest clusters are hyper-methylated in the promoter-containing NTS regions in adult tissues, as in the 5S rDNA. These findings demonstrate an analogous epigenetic regulation in small and large rDNA clusters and support the logical synchronism in building ribosomes. In fact, all the ribosomal RNA genes must be synchronously and equally transcribed to perform their unique final product. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Homologues of a single resistance-gene cluster in potato confer resistance to distinct pathogens: a virus and a nematode.

    PubMed

    van der Vossen, E A; van der Voort, J N; Kanyuka, K; Bendahmane, A; Sandbrink, H; Baulcombe, D C; Bakker, J; Stiekema, W J; Klein-Lankhorst, R M

    2000-09-01

    The isolation of the nematode-resistance gene Gpa2 in potato is described, and it is demonstrated that highly homologous resistance genes of a single resistance-gene cluster can confer resistance to distinct pathogen species. Molecular analysis of the Gpa2 locus resulted in the identification of an R-gene cluster of four highly homologous genes in a region of approximately 115 kb. At least two of these genes are active: one corresponds to the previously isolated Rx1 gene that confers resistance to potato virus X, while the other corresponds to the Gpa2 gene that confers resistance to the potato cyst nematode Globodera pallida. The proteins encoded by the Gpa2 and the Rx1 genes share an overall homology of over 88% (amino-acid identity) and belong to the leucine-zipper, nucleotide-binding site, leucine-rich repeat (LZ-NBS-LRR)-containing class of plant resistance genes. From the sequence conservation between Gpa2 and Rx1 it is clear that there is a direct evolutionary relationship between the two proteins. Sequence diversity is concentrated in the LRR region and in the C-terminus. The putative effector domains are more conserved suggesting that, at least in this case, nematode and virus resistance cascades could share common components. These findings underline the potential of protein breeding for engineering new resistance specificities against plant pathogens in vitro.

  13. Identification of hub subnetwork based on topological features of genes in breast cancer

    PubMed Central

    ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO

    2015-01-01

    The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded and analyzed from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that cluster1, as well as the cell cycle and oocyte meiosis pathways, were significant subnetworks in the analysis of degree and other centralities, in which hub nodes were mostly distributed. The most important hub nodes, with top ranked centrality, were also similar to the common genes from the above three subnetwork intersections, which was viewed as a hub subnetwork more reproducible than individual critical genes selected without network information. This hub subnetwork was attributed to the same biological process, which was essential in the function of cell growth and death. This increased the accuracy of identifying gene interactions that took place within the same functional process and was potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623

  14. Bacillus sp. CDB3 isolated from cattle dip-sites possesses two ars gene clusters.

    PubMed

    Bhat, Somanath; Luo, Xi; Xu, Zhiqiang; Liu, Lixia; Zhang, Ren

    2011-01-01

    Contamination of soil and water by arsenic is a global problem. In Australia, the dipping of cattle in arsenic-containing solution to control cattle ticks in the last century has left many sites heavily contaminated with arsenic and other toxicants. We had previously isolated five soil bacterial strains (CDB1-5) highly resistant to arsenic. To understand the resistance mechanism, molecular studies have been carried out. Two chromosome-encoded arsenic resistance (ars) gene clusters have been cloned from CDB3 (Bacillus sp.). They both function in Escherichia coli and cluster 1 exerts a much higher resistance to the toxic metalloid. Cluster 2 is smaller, possessing four open reading frames (ORFs) arsRorf2BC, similar to that identified in Bacillus subtilis Skin element. Among the eight ORFs in cluster 1, five are analogs of common ars genes found in other bacteria, however, organized in a unique order arsRBCDA instead of arsRDABC. Three other putative genes are located directly downstream and designated as arsTIP based on the homologies of their theoretical translation sequences respectively to thioredoxin reductases, iron-sulphur cluster proteins and protein phosphatases. The latter two are novel to any known ars operons. The arsD gene from Bacillus species was cloned for the first time and the predicted protein differs from the well studied E. coli ArsD by lacking two pairs of C-terminal cysteine residues. Its functional involvement in arsenic resistance has been confirmed by a deletion experiment. There also exists an inverted repeat in the intergenic region between arsC and arsD, implying some unknown transcription regulation.

  15. Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species.

    PubMed

    Nepal, Madhav P; Andersen, Ethan J; Neupane, Surendra; Benson, Benjamin V

    2017-09-30

    Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence.

  16. Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species

    PubMed Central

    Andersen, Ethan J.; Neupane, Surendra; Benson, Benjamin V.

    2017-01-01

    Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence. PMID:28973974

  17. The PhytoClust tool for metabolic gene clusters discovery in plant genomes

    PubMed Central

    Fuchs, Lisa-Maria

    2017-01-01

    Abstract The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust; a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGCs types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGCs discovery which will in turn impact the exploration of plant metabolism. PMID:28486689

  18. The PhytoClust tool for metabolic gene clusters discovery in plant genomes.

    PubMed

    Töpfer, Nadine; Fuchs, Lisa-Maria; Aharoni, Asaph

    2017-07-07

    The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust; a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGCs types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGCs discovery which will in turn impact the exploration of plant metabolism. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. A remarkably stable TipE gene cluster: evolution of insect Para sodium channel auxiliary subunits

    PubMed Central

    2011-01-01

    Background First identified in fruit flies with temperature-sensitive paralysis phenotypes, the Drosophila melanogaster TipE locus encodes four voltage-gated sodium (NaV) channel auxiliary subunits. This cluster of TipE-like genes on chromosome 3L, and a fifth family member on chromosome 3R, are important for the optional expression and functionality of the Para NaV channel but appear quite distinct from auxiliary subunits in vertebrates. Here, we exploited available arthropod genomic resources to trace the origin of TipE-like genes by mapping their evolutionary histories and examining their genomic architectures. Results We identified a remarkably conserved synteny block of TipE-like orthologues with well-maintained local gene arrangements from 21 insect species. Homologues in the water flea, Daphnia pulex, suggest an ancestral pancrustacean repertoire of four TipE-like genes; a subsequent gene duplication may have generated functional redundancy allowing gene losses in the silk moth and mosquitoes. Intronic nesting of the insect TipE gene cluster probably occurred following the divergence from crustaceans, but in the flour beetle and silk moth genomes the clusters apparently escaped from nesting. Across Pancrustacea, TipE gene family members have experienced intronic nesting, escape from nesting, retrotransposition, translocation, and gene loss events while generally maintaining their local gene neighbourhoods. D. melanogaster TipE-like genes exhibit coordinated spatial and temporal regulation of expression distinct from their host gene but well-correlated with their regulatory target, the Para NaV channel, suggesting that functional constraints may preserve the TipE gene cluster. We identified homology between TipE-like NaV channel regulators and vertebrate Slo-beta auxiliary subunits of big-conductance calcium-activated potassium (BKCa) channels, which suggests that ion channel regulatory partners have evolved distinct lineage-specific characteristics. 
Conclusions TipE-like genes form a remarkably conserved genomic cluster across all examined insect genomes. This study reveals likely structural and functional constraints on the genomic evolution of insect TipE gene family members maintained in synteny over hundreds of millions of years of evolution. The likely common origin of these NaV channel regulators with BKCa auxiliary subunits highlights the evolutionary plasticity of ion channel regulatory mechanisms. PMID:22098672

  20. Overproduction of lactimidomycin by cross-overexpression of genes encoding Streptomyces antibiotic regulatory proteins.

    PubMed

    Zhang, Bo; Yang, Dong; Yan, Yijun; Pan, Guohui; Xiang, Wensheng; Shen, Ben

    2016-03-01

    The glutarimide-containing polyketides represent a fascinating class of natural products that exhibit a multitude of biological activities. We have recently cloned and sequenced the biosynthetic gene clusters for three members of the glutarimide-containing polyketides-iso-migrastatin (iso-MGS) from Streptomyces platensis NRRL 18993, lactimidomycin (LTM) from Streptomyces amphibiosporus ATCC 53964, and cycloheximide (CHX) from Streptomyces sp. YIM56141. Comparative analysis of the three clusters identified mgsA and chxA, from the mgs and chx gene clusters, respectively, that were predicted to encode the PimR-like Streptomyces antibiotic regulatory proteins (SARPs) but failed to reveal any regulatory gene from the ltm gene cluster. Overexpression of mgsA or chxA in S. platensis NRRL 18993, Streptomyces sp. YIM56141 or SB11024, and a recombinant strain of Streptomyces coelicolor M145 carrying the intact mgs gene cluster has no significant effect on iso-MGS or CHX production, suggesting that MgsA or ChxA regulation may not be rate-limiting for iso-MGS and CHX production in these producers. In contrast, overexpression of mgsA or chxA in S. amphibiosporus ATCC 53964 resulted in a significant increase in LTM production, with LTM titer reaching 106 mg/L, which is five-fold higher than that of the wild-type strain. These results support MgsA and ChxA as members of the SARP family of positive regulators for the iso-MGS and CHX biosynthetic machinery and demonstrate the feasibility to improve glutarimide-containing polyketide production in Streptomyces strains by exploiting common regulators.

  1. Chassis optimization as a cornerstone for the application of synthetic biology based strategies in microbial secondary metabolism.

    PubMed

    Beites, Tiago; Mendes, Marta V

    2015-01-01

    The increased number of bacterial genome sequencing projects has generated over the last years a large reservoir of genomic information. In silico analysis of this genomic data has renewed the interest in bacterial bioprospecting for bioactive compounds by unveiling novel biosynthetic gene clusters of unknown or uncharacterized metabolites. However, only a small fraction of those metabolites is produced under laboratory-controlled conditions; the remaining clusters represent a pool of novel metabolites that are waiting to be "awaken". Activation of the biosynthetic gene clusters that present reduced or no expression (known as cryptic or silent clusters) by heterologous expression has emerged as a strategy for the identification and production of novel bioactive molecules. Synthetic biology, with engineering principles at its core, provides an excellent framework for the development of efficient heterologous systems for the expression of biosynthetic gene clusters. However, a common problem in its application is the host-interference problem, i.e., the unpredictable interactions between the device and the host that can hamper the desired output. Although an effort has been made to develop orthogonal devices, the most proficient way to overcome the host-interference problem is through genome simplification. In this review we present an overview on the strategies and tools used in the development of hosts/chassis for the heterologous expression of specialized metabolites biosynthetic gene clusters. Finally, we introduce the concept of specialized host as the next step of development of expression hosts.

  2. The thiostrepton-resistance-encoding gene in Streptomyces laurentii is located within a cluster of ribosomal protein operons.

    PubMed

    Smith, T M; Jiang, Y F; Shipley, P; Floss, H G

    1995-10-16

    A common approach to identify and clone biosynthetic genes from an antibiotic-producing streptomycete is to clone the resistance gene for the antibiotic of interest and then use that gene to clone DNA that is linked to it. As a first step toward cloning the genes responsible for the biosynthesis of thiostrepton (Th) in Streptomyces laurentii (Sl), the Th resistance-encoding gene (tsnR) was cloned as a 1.5-kb BamHI-PvuII fragment in Escherichia coli (Ec), and shown to confer Th resistance when introduced into S. lividans TK24. The tsnR-containing DNA fragment was used as a probe to isolate clones from cosmid libraries of DNA in the Ec cosmid vector SuperCos and in pOJ446 (an Ec/streptomycete cosmid vector). Sequence and genetic analysis of the DNA flanking the tsnR indicates that the Sl tsnR is not closely linked to biosynthetic genes. Instead it is located within a cluster of ribosomal protein operons.

  3. Axial patterning and diversification in the cnidaria predate the Hox system.

    PubMed

    Kamm, Kai; Schierwater, Bernd; Jakob, Wolfgang; Dellaporta, Stephen L; Miller, David J

    2006-05-09

    Across the animal kingdom, Hox genes are organized in clusters whose genomic organization reflects their central roles in patterning along the anterior/posterior (A/P) axis. While a cluster of Hox genes was present in the bilaterian common ancestor, the origins of this system remain unclear. With new data for two representatives of the closest extant phylum to the Bilateria, the sea anemone Nematostella and the hydromedusa Eleutheria, we argue here that the Cnidaria predate the evolution of the Hox system. Although Hox-like genes are present in a range of cnidarians, many of these are paralogs and in neither Nematostella nor Eleutheria is an equivalent of the Hox cluster present. With the exception of independently duplicated genes, the cnidarian genes are unlinked and in several cases are flanked by non-Hox genes. Furthermore, the cnidarian genes are expressed in patterns that are inconsistent with the Hox paradigm. We conclude that the Cnidaria/Bilateria split occurred before a definitive Hox system developed. The spectacular variety in morphological and developmental characteristics shown by extant cnidarians demonstrates that there is no obligate link between the Hox system and morphological diversity in the animal kingdom and that a canonical Hox system is not mandatory for axial patterning.

  4. Nipbl and mediator cooperatively regulate gene expression to control limb development.

    PubMed

    Muto, Akihiko; Ikeda, Shingo; Lopez-Burks, Martha E; Kikuchi, Yutaka; Calof, Anne L; Lander, Arthur D; Schilling, Thomas F

    2014-09-01

    Haploinsufficiency for Nipbl, a cohesin loading protein, causes Cornelia de Lange Syndrome (CdLS), the most common "cohesinopathy". It has been proposed that the effects of Nipbl-haploinsufficiency result from disruption of long-range communication between DNA elements. Here we use zebrafish and mouse models of CdLS to examine how transcriptional changes caused by Nipbl deficiency give rise to limb defects, a common condition in individuals with CdLS. In the zebrafish pectoral fin (forelimb), knockdown of Nipbl expression led to size reductions and patterning defects that were preceded by dysregulated expression of key early limb development genes, including fgfs, shha, hand2 and multiple hox genes. In limb buds of Nipbl-haploinsufficient mice, transcriptome analysis revealed many similar gene expression changes, as well as altered expression of additional classes of genes that play roles in limb development. In both species, the pattern of dysregulation of hox-gene expression depended on genomic location within the Hox clusters. In view of studies suggesting that Nipbl colocalizes with the mediator complex, which facilitates enhancer-promoter communication, we also examined zebrafish deficient for the Med12 Mediator subunit, and found they resembled Nipbl-deficient fish in both morphology and gene expression. Moreover, combined partial reduction of both Nipbl and Med12 had a strongly synergistic effect, consistent with both molecules acting in a common pathway. In addition, three-dimensional fluorescent in situ hybridization revealed that Nipbl and Med12 are required to bring regions containing long-range enhancers into close proximity with the zebrafish hoxda cluster. These data demonstrate a crucial role for Nipbl in limb development, and support the view that its actions on multiple gene pathways result from its influence, together with Mediator, on regulation of long-range chromosomal interactions.

  5. Protein-protein interaction analysis of Alzheimer's disease and NAFLD based on systems biology methods unhide common ancestor pathways.

    PubMed

    Karbalaei, Reza; Allahyari, Marzieh; Rezaei-Tavirani, Mostafa; Asadzadeh-Aghdaei, Hamid; Zali, Mohammad Reza

    2018-01-01

    Analysis and reconstruction of networks for two diseases, NAFLD and Alzheimer's disease, and of their relationship, based on systems biology methods. NAFLD and Alzheimer's disease are two complex diseases, with progressive prevalence and high cost for countries. There are some reports on a relation between these two diseases and on shared spreading pathways. In addition, they have some similar risk factors, exclusively lifestyle such as feeding, exercises and so on. Therefore, a systems biology approach can help to discover their relationship. DisGeNET and STRING databases were the sources of disease genes and were used for constructing networks. Three plugins of Cytoscape software, including ClusterONE, ClueGO and CluePedia, were used to analyze and cluster networks and for enrichment of pathways. An R package was used to define the best centrality method. Finally, based on degree and betweenness, hubs and bottleneck nodes were defined. There were 190 genes common to NAFLD and Alzheimer's disease, which were used to construct a network with the STRING database. The resulting network contained 182 nodes and 2591 edges and comprised four clusters. Enrichment of these clusters separately led to carbohydrate metabolism, long chain fatty acid and regulation of JAK-STAT and IL-17 signaling pathways, respectively. Seven genes were also selected as hub-bottleneck nodes: IL6, AKT1, TP53, TNF, JUN, VEGFA and PPARG. Enrichment of these proteins and their first neighbors in the network by the OMIM database led to diabetes and obesity as ancestors of NAFLD and AD. Systems biology methods, specifically PPI networks, can be useful for analyzing complicated related diseases. Finding hub and bottleneck proteins should be the goal of drug designing and introducing disease markers.

  6. Exploring multicollinearity using a random matrix theory approach.

    PubMed

    Feher, Kristen; Whelan, James; Müller, Samuel

    2012-01-01

    Clustering of gene expression data is often done with the latent aim of dimension reduction, by finding groups of genes that have a common response to potentially unknown stimuli. However, what is poorly understood to date is the behaviour of a low dimensional signal embedded in high dimensions. This paper introduces a multicollinear model which is based on random matrix theory results, and shows potential for the characterisation of a gene cluster's correlation matrix. This model projects a one dimensional signal into many dimensions and is based on the spiked covariance model, but rather characterises the behaviour of the corresponding correlation matrix. The eigenspectrum of the correlation matrix is empirically examined by simulation, under the addition of noise to the original signal. The simulation results are then used to propose a dimension estimation procedure of clusters from data. Moreover, the simulation results warn against considering pairwise correlations in isolation, as the model provides a mechanism whereby a pair of genes with 'low' correlation may simply be due to the interaction of high dimension and noise. Instead, collective information about all the variables is given by the eigenspectrum.

  7. Genetic homogeneity of Clostridium botulinum type A1 strains with unique toxin gene clusters.

    PubMed

    Raphael, Brian H; Luquez, Carolina; McCroskey, Loretta M; Joseph, Lavin A; Jacobson, Mark J; Johnson, Eric A; Maslanka, Susan E; Andreadis, Joanne D

    2008-07-01

    A group of five clonally related Clostridium botulinum type A strains isolated from different sources over a period of nearly 40 years harbored several conserved genetic properties. These strains contained a variant bont/A1 with five nucleotide polymorphisms compared to the gene in C. botulinum strain ATCC 3502. The strains also had a common toxin gene cluster composition (ha-/orfX+) similar to that associated with bont/A in type A strains containing an unexpressed bont/B [termed A(B) strains]. However, bont/B was not identified in the strains examined. Comparative genomic hybridization demonstrated identical genomic content among the strains relative to C. botulinum strain ATCC 3502. In addition, microarray data demonstrated the absence of several genes flanking the toxin gene cluster among the ha-/orfX+ A1 strains, suggesting the presence of genomic rearrangements with respect to this region compared to the C. botulinum ATCC 3502 strain. All five strains were shown to have identical flaA variable region nucleotide sequences. The pulsed-field gel electrophoresis patterns of the strains were indistinguishable when digested with SmaI, and a shift in the size of at least one band was observed in a single strain when digested with XhoI. These results demonstrate surprising genomic homogeneity among a cluster of unique C. botulinum type A strains of diverse origin.

  8. A Minimal Nitrogen Fixation Gene Cluster from Paenibacillus sp. WLY78 Enables Expression of Active Nitrogenase in Escherichia coli

    PubMed Central

    Zhao, Dehua; Liu, Xiaomeng; Zhang, Bo; Xie, Jianbo; Hong, Yuanyuan; Li, Pengfei; Chen, Sanfeng; Dixon, Ray; Li, Jilun

    2013-01-01

    Most biological nitrogen fixation is catalyzed by molybdenum-dependent nitrogenase, an enzyme complex comprising two component proteins that contains three different metalloclusters. Diazotrophs contain a common core of nitrogen fixation nif genes that encode the structural subunits of the enzyme and components required to synthesize the metalloclusters. However, the complement of nif genes required to enable diazotrophic growth varies significantly amongst nitrogen fixing bacteria and archaea. In this study, we identified a minimal nif gene cluster consisting of nine nif genes in the genome of Paenibacillus sp. WLY78, a gram-positive, facultative anaerobe isolated from the rhizosphere of bamboo. We demonstrate that the nif genes in this organism are organized as an operon comprising nifB, nifH, nifD, nifK, nifE, nifN, nifX, hesA and nifV and that the nif cluster is under the control of a σ70 (σA)-dependent promoter located upstream of nifB. To investigate genetic requirements for diazotrophy, we transferred the Paenibacillus nif cluster to Escherichia coli. The minimal nif gene cluster enables synthesis of catalytically active nitrogenase in this host, when expressed either from the native nifB promoter or from the T7 promoter. Deletion analysis indicates that in addition to the core nif genes, hesA plays an important role in nitrogen fixation and is responsive to the availability of molybdenum. Whereas nif transcription in Paenibacillus is regulated in response to nitrogen availability and by the external oxygen concentration, transcription from the nifB promoter is constitutive in E. coli, indicating that negative regulation of nif transcription is bypassed in the heterologous host. This study demonstrates the potential for engineering nitrogen fixation in a non-nitrogen fixing organism with a minimum set of nine nif genes. PMID:24146630

  9. Clustering by soft-constraint affinity propagation: applications to gene-expression data.

    PubMed

    Leone, Michele; Sumedha; Weigt, Martin

    2007-10-15

    Similarity-measure-based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP) based on message-passing techniques was proposed by Frey and Dueck (2007a). In AP, each cluster is identified by a common exemplar to which all other data points of the same cluster refer, and exemplars have to refer to themselves. Despite its proven power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters, and leads to suboptimal performance, e.g. in analyzing gene expression data. This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows interpolation between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) becomes more informative and accurate, and leads to more stable clustering. Even though a new a priori free parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Furthermore, it allows the extraction of sparse gene expression signatures for each cluster.
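    For orientation, the message updates of the original hard-constraint AP of Frey and Dueck can be written as below. These are the standard AP updates, not the SCAP variant proposed in this record, whose modified messages are not given in the abstract. Here $s(i,k)$ is the similarity of point $i$ to candidate exemplar $k$, $r$ the responsibility and $a$ the availability.

```latex
% Responsibility: how well suited k is as exemplar for i, relative to competitors.
r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \bigl[ a(i,k') + s(i,k') \bigr]

% Availability: accumulated evidence from other points that k is a good exemplar.
a(i,k) \leftarrow \min\Bigl\{ 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\bigl(0, r(i',k)\bigr) \Bigr\},
\qquad i \neq k

% Self-availability of a candidate exemplar.
a(k,k) \leftarrow \sum_{i' \neq k} \max\bigl(0, r(i',k)\bigr)

% Exemplar chosen by point i once the messages have converged.
\hat{k}(i) = \arg\max_{k} \bigl[ a(i,k) + r(i,k) \bigr]
```

    The requirement that an exemplar refers to itself enters through $r(k,k)$ and $a(k,k)$; this is the hard constraint that the abstract describes as being relaxed in SCAP.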

  10. Comprehensive analysis and discovery of drought-related NAC transcription factors in common bean.

    PubMed

    Wu, Jing; Wang, Lanfen; Wang, Shumin

    2016-09-07

    Common bean (Phaseolus vulgaris L.) is an important warm-season food legume. Drought is the most important environmental stress factor affecting large areas of common bean via plant death or reduced global production. The NAM, ATAF1/2 and CUC2 (NAC) domain protein family are classic transcription factors (TFs) involved in a variety of abiotic stresses, particularly drought stress. However, the NAC TFs in common bean have not been characterized. In the present study, 86 putative NAC TF proteins were identified from the common bean genome database and located on 11 common bean chromosomes. The proteins were phylogenetically clustered into 8 distinct subfamilies. The gene structure and motif composition of common bean NACs were similar in each subfamily. These results suggest that NACs in the same subfamily may possess conserved functions. The expression patterns of common bean NAC genes were also characterized. The majority of NACs exhibited specific temporal and spatial expression patterns. We identified 22 drought-related NAC TFs based on transcriptome data for drought-tolerant and drought-sensitive genotypes. Quantitative real-time PCR (qRT-PCR) was performed to confirm the expression patterns of the 20 drought-related NAC genes. Based on the common bean genome sequence, we analyzed the structural characteristics, genome distribution, and expression profiles of NAC gene family members and analyzed drought-responsive NAC genes. Our results provide useful information for the functional characterization of common bean NAC genes and rich resources and opportunities for understanding common bean drought stress tolerance mechanisms.

  11. Transcriptome analysis of salinity stress responses in common wheat using a 22k oligo-DNA microarray.

    PubMed

    Kawaura, Kanako; Mochida, Keiichi; Yamazaki, Yukiko; Ogihara, Yasunari

    2006-04-01

    In this study, we constructed a 22k wheat oligo-DNA microarray. A total of 148,676 expressed sequence tags of common wheat were collected from the database of the Wheat Genomics Consortium of Japan. These were grouped into 34,064 contigs, which were then used to design an oligonucleotide DNA microarray. Following a multistep selection of the sense strand, 21,939 60-mer oligo-DNA probes were selected for attachment on the microarray slide. This 22k oligo-DNA microarray was used to examine the transcriptional response of wheat to salt stress. More than 95% of the probes gave reproducible hybridization signals when targeted with RNAs extracted from salt-treated wheat shoots and roots. With the microarray, we identified 1,811 genes whose expressions changed more than 2-fold in response to salt. These included genes known to mediate response to salt, as well as unknown genes, and they were classified into 12 major groups by hierarchical clustering. These gene expression patterns were also confirmed by real-time reverse transcription-PCR. Many of the genes with unknown function were clustered together with genes known to be involved in response to salt stress. Thus, analysis of gene expression patterns combined with gene ontology should help identify the function of the unknown genes. Also, functional analysis of these wheat genes should provide new insight into the response to salt stress. Finally, these results indicate that the 22k oligo-DNA microarray is a reliable method for monitoring global gene expression patterns in wheat.
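    The selection of genes whose expression changed more than 2-fold can be sketched as below. This is a minimal illustration only: the probe names and signal intensities are invented, the function names are hypothetical, and the record does not state how the authors normalized the data or computed the fold change.

```python
import math


def log2_fold_change(treated, control):
    """log2 ratio of treated to control signal; positive means up-regulated."""
    return math.log2(treated / control)


def changed_genes(expression, min_fold=2.0):
    """Keep probes whose change exceeds min_fold in either direction."""
    threshold = math.log2(min_fold)
    selected = {}
    for probe, (control, treated) in expression.items():
        lfc = log2_fold_change(treated, control)
        if abs(lfc) > threshold:   # strictly more than min_fold
            selected[probe] = lfc
    return selected


# Invented intensities: (control, salt-treated).
toy_expression = {
    "probe_1": (100.0, 450.0),   # 4.5-fold up
    "probe_2": (200.0, 90.0),    # about 2.2-fold down
    "probe_3": (150.0, 180.0),   # 1.2-fold, below threshold
    "probe_4": (80.0, 160.0),    # exactly 2-fold, not "more than" 2-fold
}
print(sorted(changed_genes(toy_expression)))
```

    Working on the log2 scale makes up- and down-regulation symmetric, so a single absolute-value threshold covers both directions.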

  12. Resistance to Colletotrichum lindemuthianum in Phaseolus vulgaris: a case study for mapping two independent genes.

    PubMed

    Geffroy, Valérie; Sévignac, Mireille; Billant, Paul; Dron, Michel; Langin, Thierry

    2008-02-01

    Anthracnose, caused by the hemibiotrophic fungal pathogen Colletotrichum lindemuthianum, is a devastating disease of common bean. Resistant cultivars are an economical means of defense against this pathogen. In the present study, we mapped resistance specificities against 7 C. lindemuthianum strains of various geographical origins revealing differential reactions on BAT93 and JaloEEP558, two parents of a recombinant inbred lines (RILs) population, of Meso-american and Andean origin, respectively. Six strains revealed the segregation of two independent resistance genes. A specific numerical code calculating the LOD score in the case of two independent segregating genes (i.e. genes with duplicate effects) in a RILs population was developed in order to provide a recombination value (r) between each of the two resistance genes and the tested marker. We mapped two closely linked Andean resistance genes (Co-x, Co-w) at the end of linkage group (LG) B1 and mapped one Meso-american resistance gene (Co-u) at the end of LG B2. We also confirmed the complexity of the previously identified B4 resistance gene cluster, because four of the seven tested strains revealed a resistance specificity near Co-y from JaloEEP558 and two strains identified a resistance specificity near Co-9 from BAT93. Resistance genes found within the same cluster confer resistance to different strains of a single pathogen such as the two anthracnose specificities Co-x and Co-w clustered at the end of LG B1. Clustering of resistance specificities to multiple pathogens such as fungi (Co-u) and viruses (I) was also observed at the end of LG B2.

  13. Transcription of two adjacent carbohydrate utilization gene clusters in Bifidobacterium breve UCC2003 is controlled by LacI- and repressor open reading frame kinase (ROK)-type regulators.

    PubMed

    O'Connell, Kerry Joan; Motherway, Mary O'Connell; Liedtke, Andrea; Fitzgerald, Gerald F; Paul Ross, R; Stanton, Catherine; Zomer, Aldert; van Sinderen, Douwe

    2014-06-01

    Members of the genus Bifidobacterium are commonly found in the gastrointestinal tracts of mammals, including humans, where their growth is presumed to be dependent on various diet- and/or host-derived carbohydrates. To understand transcriptional control of bifidobacterial carbohydrate metabolism, we investigated two genetic carbohydrate utilization clusters dedicated to the metabolism of raffinose-type sugars and melezitose. Transcriptomic and gene inactivation approaches revealed that the raffinose utilization system is positively regulated by an activator protein, designated RafR. The gene cluster associated with melezitose metabolism was shown to be subject to direct negative control by a LacI-type transcriptional regulator, designated MelR1, in addition to apparent indirect negative control by means of a second LacI-type regulator, MelR2. In silico analysis, DNA-protein interaction, and primer extension studies revealed the MelR1 and MelR2 operator sequences, each of which is positioned just upstream of or overlapping the correspondingly regulated promoter sequences. Similar analyses identified the RafR binding operator sequence located upstream of the rafB promoter. This study indicates that transcriptional control of gene clusters involved in carbohydrate metabolism in bifidobacteria is subject to conserved regulatory systems, representing either positive or negative control.

  14. Diversity of nonribosomal peptide synthetase and polyketide synthase gene clusters among taxonomically close Streptomyces strains.

    PubMed

    Komaki, Hisayuki; Sakurai, Kenta; Hosoyama, Akira; Kimura, Akane; Igarashi, Yasuhiro; Tamura, Tomohiko

    2018-05-02

    To identify the species of butyrolactol-producing Streptomyces strain TP-A0882, whole genome-sequencing of three type strains in a close taxonomic relationship was performed. In silico DNA-DNA hybridization using the genome sequences suggested that Streptomyces sp. TP-A0882 is classified as Streptomyces diastaticus subsp. ardesiacus. Strain TP-A0882, S. diastaticus subsp. ardesiacus NBRC 15402T, Streptomyces coelicoflavus NBRC 15399T, and Streptomyces rubrogriseus NBRC 15455T harbor at least 14, 14, 10, and 12 biosynthetic gene clusters (BGCs), respectively, coding for nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs). All 14 gene clusters were shared by S. diastaticus subsp. ardesiacus strains TP-A0882 and NBRC 15402T, while only four gene clusters were shared by the three distinct species. Although BGCs for bacteriocin, ectoine, indole, melanin, siderophores such as deferrioxamine, and terpenes such as albaflavenone, hopene, carotenoid and geosmin are shared by the three species, many BGCs for secondary metabolites such as butyrolactone, lantipeptides, oligosaccharide and some terpenes are species-specific. These results indicate the possibility that strains belonging to the same species possess the same set of secondary metabolite-biosynthetic pathways, whereas strains belonging to distinct species have species-specific pathways, in addition to some common pathways, even if the strains are taxonomically close.

  15. The genetic epidemiology of personality disorders

    PubMed Central

    Reichborn-Kjennerud, Ted

    2010-01-01

    Genetic epidemiologic studies indicate that all ten personality disorders (PDs) classified on the DSM-IV axis II are modestly to moderately heritable. Shared environmental and nonadditive genetic factors are of minor or no importance. No sex differences have been identified. Multivariate studies suggest that the extensive comorbidity between the PDs can be explained by three common genetic and environmental risk factors. The genetic factors do not reflect the DSM-IV cluster structure, but rather: i) broad vulnerability to PD pathology or negative emotionality; ii) high impulsivity/low agreeableness; and iii) introversion. Common genetic and environmental liability factors contribute to comorbidity between pairs or clusters of axis I and axis II disorders. Molecular genetic studies of PDs, mostly candidate gene association studies, indicate that genes linked to neurotransmitter pathways, especially in the serotonergic and dopaminergic systems, are involved. Future studies, using newer methods like genome-wide association, might take advantage of the use of endophenotypes. PMID:20373672

  16. Metabolism of Four α-Glycosidic Linkage-Containing Oligosaccharides by Bifidobacterium breve UCC2003

    PubMed Central

    O'Connell, Kerry Joan; O'Connell Motherway, Mary; O'Callaghan, John; Fitzgerald, Gerald F.; Ross, R. Paul; Ventura, Marco; Stanton, Catherine

    2013-01-01

    Members of the genus Bifidobacterium are common inhabitants of the gastrointestinal tracts of humans and other mammals, where they ferment many diet-derived carbohydrates that cannot be digested by their hosts. To extend our understanding of bifidobacterial carbohydrate utilization, we investigated the molecular mechanisms by which 11 strains of Bifidobacterium breve metabolize four distinct α-glucose- and/or α-galactose-containing oligosaccharides, namely, raffinose, stachyose, melibiose, and melezitose. Here we demonstrate that all B. breve strains examined possess the ability to utilize raffinose, stachyose, and melibiose. However, the ability to metabolize melezitose was not common to all B. breve strains tested. Transcriptomic and functional genomic approaches identified a gene cluster dedicated to the metabolism of α-galactose-containing carbohydrates, while an adjacent gene cluster, dedicated to the metabolism of α-glucose-containing melezitose, was identified in strains that are able to use this carbohydrate. PMID:23913435

  17. Metabolism of four α-glycosidic linkage-containing oligosaccharides by Bifidobacterium breve UCC2003.

    PubMed

    O'Connell, Kerry Joan; O'Connell Motherway, Mary; O'Callaghan, John; Fitzgerald, Gerald F; Ross, R Paul; Ventura, Marco; Stanton, Catherine; van Sinderen, Douwe

    2013-10-01

    Members of the genus Bifidobacterium are common inhabitants of the gastrointestinal tracts of humans and other mammals, where they ferment many diet-derived carbohydrates that cannot be digested by their hosts. To extend our understanding of bifidobacterial carbohydrate utilization, we investigated the molecular mechanisms by which 11 strains of Bifidobacterium breve metabolize four distinct α-glucose- and/or α-galactose-containing oligosaccharides, namely, raffinose, stachyose, melibiose, and melezitose. Here we demonstrate that all B. breve strains examined possess the ability to utilize raffinose, stachyose, and melibiose. However, the ability to metabolize melezitose was not common to all B. breve strains tested. Transcriptomic and functional genomic approaches identified a gene cluster dedicated to the metabolism of α-galactose-containing carbohydrates, while an adjacent gene cluster, dedicated to the metabolism of α-glucose-containing melezitose, was identified in strains that are able to use this carbohydrate.

  18. Limitations of cytochrome oxidase I for the barcoding of Neritidae (Mollusca: Gastropoda) as revealed by Bayesian analysis.

    PubMed

    Chee, S Y

    2015-05-25

    The mitochondrial DNA (mtDNA) cytochrome oxidase I (COI) gene has been universally and successfully utilized as a barcoding gene, mainly because it can be amplified easily, applied across a wide range of taxa, and results can be obtained cheaply and quickly. However, in rare cases, the gene can fail to distinguish between species, particularly when exposed to highly sensitive methods of data analysis, such as the Bayesian method, or when taxa have undergone introgressive hybridization, over-splitting, or incomplete lineage sorting. Such cases require the use of alternative markers, and nuclear DNA markers are commonly used. In this study, a dendrogram produced by Bayesian analysis of an mtDNA COI dataset was compared with that of a nuclear DNA ATPS-α dataset, in order to evaluate the efficiency of COI in barcoding Malaysian nerites (Neritidae). In the COI dendrogram, most of the species were in individual clusters, except for two species: Nerita chamaeleon and N. histrio. These two species were placed in the same subcluster, whereas in the ATPS-α dendrogram they were in their own subclusters. Analysis of the ATPS-α gene also placed the two genera of nerites (Nerita and Neritina) in separate clusters, whereas COI gene analysis placed both genera in the same cluster. Therefore, in the case of the Neritidae, the ATPS-α gene is a better barcoding gene than the COI gene.

  19. A combination of PhP typing and β-d-glucuronidase gene sequence variation analysis for differentiation of Escherichia coli from humans and animals.

    PubMed

    Masters, N; Christie, M; Katouli, M; Stratton, H

    2015-06-01

    We investigated the usefulness of the β-d-glucuronidase gene variance in Escherichia coli as a microbial source tracking tool using a novel algorithm for comparison of sequences from a prescreened set of host-specific isolates using a high-resolution PhP typing method. A total of 65 common biochemical phenotypes belonging to 318 E. coli strains isolated from humans and domestic and wild animals were analysed for nucleotide variations at 10 loci along a 518 bp fragment of the 1812 bp β-d-glucuronidase gene. Neighbour-joining analysis of loci variations revealed 86 (76.8%) human isolates and 91.2% of animal isolates were correctly identified. Pairwise hierarchical clustering improved assignment; where 92 (82.1%) human and 204 (99%) animal strains were assigned to their respective cluster. Our data show that initial typing of isolates and selection of common types from different hosts prior to analysis of the β-d-glucuronidase gene sequence improves source identification. We also concluded that numerical profiling of the nucleotide variations can be used as a valuable approach to differentiate human from animal E. coli. This study signifies the usefulness of the β-d-glucuronidase gene as a marker for differentiating human faecal pollution from animal sources.

  20. A formal concept analysis approach to consensus clustering of multi-experiment expression data

    PubMed Central

    2014-01-01

    Background Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results We propose a novel generic consensus clustering technique that applies Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from multi-experiment study examining the global cell-cycle control of fission yeast. 
The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological signals. Conclusions The proposed FCA-enhanced consensus clustering technique is a general approach to the combination of clustering algorithms with FCA for deriving clustering solutions from multiple gene expression matrices. The experimental results presented herein demonstrate that it is a robust data integration technique able to produce good quality clustering solution that is representative for the whole set of expression matrices. PMID:24885407
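    The FCA step can be illustrated with a small sketch that enumerates the formal concepts of a binary context in which genes are the objects and cluster labels from different experiment groups are the attributes. This is an assumption-laden toy, not the authors' implementation: the gene names, the labels such as `G1:c1`, and the function name are hypothetical.

```python
def formal_concepts(context):
    """Return all (extent, intent) pairs of a context {object: set_of_attributes}."""
    all_attributes = frozenset().union(*context.values())
    # Every concept intent is an intersection of object attribute sets,
    # or the full attribute set (whose extent may be empty).
    intents = {all_attributes}
    for attrs in context.values():
        attrs = frozenset(attrs)
        intents |= {attrs & known for known in intents}
    concepts = []
    for intent in intents:
        # The extent is every object that carries all attributes of the intent.
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    # Largest extents first, then alphabetical by intent, for stable output.
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1])))


# Hypothetical context: cluster label of each gene in two experiment groups.
toy_context = {
    "g1": {"G1:c1", "G2:c1"},
    "g2": {"G1:c1", "G2:c1"},
    "g3": {"G1:c1", "G2:c2"},
    "g4": {"G1:c2", "G2:c2"},
}
for extent, intent in formal_concepts(toy_context):
    print(sorted(extent), sorted(intent))
```

    Concepts whose intent contains one label from every experiment group, such as the pair g1 and g2 here, correspond to genes that are clustered together in all groups; concepts with smaller intents show partial agreement.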

  1. Marker-Assisted Molecular Profiling, Deletion Mutant Analysis, and RNA-Seq Reveal a Disease Resistance Cluster Associated with Uromyces appendiculatus Infection in Common Bean Phaseolus vulgaris L.

    PubMed

    Todd, Antonette R; Donofrio, Nicole; Sripathi, Venkateswara R; McClean, Phillip E; Lee, Rian K; Pastor-Corrales, Marcial; Kalavacharla, Venu Kal

    2017-05-23

    Common bean (Phaseolus vulgaris L.) is an important legume, useful for its high protein and dietary fiber. The fungal pathogen Uromyces appendiculatus (Pers.) Unger can cause major loss in susceptible varieties of the common bean. The Ur-3 locus provides race specific resistance to virulent strains or races of the bean rust pathogen along with Crg, (Complements resistance gene), which is required for Ur-3-mediated rust resistance. In this study, we inoculated two common bean genotypes (resistant "Sierra" and susceptible crg) with rust race 53 of U. appendiculatus, isolated leaf RNA at specific time points, and sequenced their transcriptomes. First, molecular markers were used to locate and identify a 250 kb deletion on chromosome 10 in mutant crg (which carries a deletion at the Crg locus). Next, we identified differential expression of several disease resistance genes between Mock Inoculated (MI) and Inoculated (I) samples of "Sierra" leaf RNA within the 250 kb delineated region. Both marker assisted molecular profiling and RNA-seq were used to identify possible transcriptomic locations of interest regarding the resistance in the common bean to race 53. Identification of differential expression among samples in disease resistance clusters in the bean genome may elucidate significant genes underlying rust resistance. Along with preserving favorable traits in the crop, the current research may also aid in global sustainability of food stocks necessary for many populations.

  2. Marker-Assisted Molecular Profiling, Deletion Mutant Analysis, and RNA-Seq Reveal a Disease Resistance Cluster Associated with Uromyces appendiculatus Infection in Common Bean Phaseolus vulgaris L.

    PubMed Central

    Todd, Antonette R.; Donofrio, Nicole; Sripathi, Venkateswara R.; McClean, Phillip E.; Lee, Rian K.; Pastor-Corrales, Marcial; Kalavacharla, Venu (Kal)

    2017-01-01

    Common bean (Phaseolus vulgaris L.) is an important legume, useful for its high protein and dietary fiber. The fungal pathogen Uromyces appendiculatus (Pers.) Unger can cause major loss in susceptible varieties of the common bean. The Ur-3 locus provides race specific resistance to virulent strains or races of the bean rust pathogen along with Crg, (Complements resistance gene), which is required for Ur-3-mediated rust resistance. In this study, we inoculated two common bean genotypes (resistant “Sierra” and susceptible crg) with rust race 53 of U. appendiculatus, isolated leaf RNA at specific time points, and sequenced their transcriptomes. First, molecular markers were used to locate and identify a 250 kb deletion on chromosome 10 in mutant crg (which carries a deletion at the Crg locus). Next, we identified differential expression of several disease resistance genes between Mock Inoculated (MI) and Inoculated (I) samples of “Sierra” leaf RNA within the 250 kb delineated region. Both marker assisted molecular profiling and RNA-seq were used to identify possible transcriptomic locations of interest regarding the resistance in the common bean to race 53. Identification of differential expression among samples in disease resistance clusters in the bean genome may elucidate significant genes underlying rust resistance. Along with preserving favorable traits in the crop, the current research may also aid in global sustainability of food stocks necessary for many populations. PMID:28545258

  3. COGNAT: a web server for comparative analysis of genomic neighborhoods.

    PubMed

    Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y

    2017-11-22

    In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. Building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), we aimed to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionarily related genes, as well as a corresponding web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.

  4. A candidate gene study in low HDL-cholesterol families provides evidence for the involvement of the APOA2 gene and the APOA1C3A4 gene cluster.

    PubMed

    Lilja, Heidi E; Soro, Aino; Ylitalo, Kati; Nuotio, Ilpo; Viikari, Jorma S A; Salomaa, Veikko; Vartiainen, Erkki; Taskinen, Marja-Riitta; Peltonen, Leena; Pajukanta, Päivi

    2002-09-01

    In patients with premature coronary heart disease, the most common lipoprotein abnormality is high-density lipoprotein (HDL) deficiency. To assess the genetic background of the low HDL-cholesterol trait, we performed a candidate gene study in 25 families with low HDL, collected from the genetically isolated population of Finland. We studied 21 genes encoding essential proteins involved in the HDL metabolism by genotyping intragenic and flanking markers for these genes. We found suggestive evidence for linkage in two candidate regions: Marker D1S2844, in the apolipoprotein A-II (APOA2) region, yielded a LOD score of 2.14 and marker D11S939 flanking the apolipoprotein A-I/C-III/A-IV gene cluster (APOA1C3A4) produced a LOD score of 1.69. Interestingly, we identified potential shared haplotypes in these two regions in a subset of low HDL families. These families also contributed to the obtained positive LOD scores, whereas the rest of the families produced negative LOD scores. None of the remaining candidate regions provided any evidence for linkage. Since only a limited number of loci were tested in this candidate gene study, these LOD scores suggest significant involvement of the APOA2 gene and the APOA1C3A4 gene cluster, or loci in their immediate vicinity, in the pathogenesis of low HDL.
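    For readers unfamiliar with LOD scores, the simplest textbook case can be sketched as below. This is an illustrative simplification, not the family-based analysis used in the study: it assumes phase-known, fully informative meioses, and the function name and the counts in the example are invented.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    Compares the likelihood of the data at recombination fraction theta
    with the likelihood under free recombination (theta = 0.5).
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Invented example: 1 recombinant in 10 informative meioses, evaluated at theta = 0.1.
print(round(lod_score(1, 10, 0.1), 2))
```

    A LOD score is a base-10 logarithm of a likelihood ratio, so each additional unit corresponds to a tenfold increase in the odds favouring linkage at the chosen recombination fraction.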

  5. Highly Variable Streptococcus oralis Strains Are Common among Viridans Streptococci Isolated from Primates.

    PubMed

    Denapaite, Dalia; Rieger, Martin; Köndgen, Sophie; Brückner, Reinhold; Ochigava, Irma; Kappeler, Peter; Mätz-Rensing, Kerstin; Leendertz, Fabian; Hakenbeck, Regine

    2016-01-01

    Viridans streptococci were obtained from primates (great apes, rhesus monkeys, and ring-tailed lemurs) held in captivity, as well as from free-living animals (chimpanzees and lemurs) for whom contact with humans is highly restricted. Isolates represented a variety of viridans streptococci, including unknown species. Streptococcus oralis was frequently isolated from samples from great apes. Genotypic methods revealed that most of the strains clustered on separate lineages outside the main cluster of human S. oralis strains. This suggests that S. oralis is part of the commensal flora in higher primates and evolved prior to humans. Many genes described as virulence factors in Streptococcus pneumoniae were present also in other viridans streptococcal genomes. Unlike in S. pneumoniae, clustered regularly interspaced short palindromic repeat (CRISPR)-CRISPR-associated protein (Cas) gene clusters were common among viridans streptococci, and many S. oralis strains were type PI-2 (pilus islet 2) variants. S. oralis displayed a remarkable diversity of genes involved in the biosynthesis of peptidoglycan (penicillin-binding proteins and MurMN) and choline-containing teichoic acid. The small noncoding cia-dependent small RNAs (csRNAs) controlled by the response regulator CiaR might contribute to the genomic diversity, since we observed novel genomic islands between duplicated csRNAs, variably present in some isolates. All S. oralis genomes contained a β-N-acetyl-hexosaminidase gene absent in S. pneumoniae, which in contrast frequently harbors the neuraminidases NanB/C, which are absent in S. oralis. The identification of S. oralis-specific genes will help us to understand their adaptation to diverse habitats. IMPORTANCE Streptococcus pneumoniae is a rare example of a human-pathogenic bacterium among viridans streptococci, which consist of commensal symbionts, such as the close relatives Streptococcus mitis and S. oralis. We have shown that S. 
oralis can frequently be isolated from primates and a variety of other viridans streptococci as well. Genes and genomic islands which are known pneumococcal virulence factors are present in S. oralis and S. mitis, documenting the widespread occurrence of these compounds, which encode surface and secreted proteins. The frequent occurrence of CRISPR-Cas gene clusters and a surprising variation of a set of small noncoding RNAs are factors to be considered in future research to further our understanding of mechanisms involved in the genomic diversity driven by horizontal gene transfer among viridans streptococci.

  6. Highly Variable Streptococcus oralis Strains Are Common among Viridans Streptococci Isolated from Primates

    PubMed Central

    Denapaite, Dalia; Rieger, Martin; Köndgen, Sophie; Brückner, Reinhold; Ochigava, Irma; Kappeler, Peter; Mätz-Rensing, Kerstin; Leendertz, Fabian

    2016-01-01

    ABSTRACT Viridans streptococci were obtained from primates (great apes, rhesus monkeys, and ring-tailed lemurs) held in captivity, as well as from free-living animals (chimpanzees and lemurs) for whom contact with humans is highly restricted. Isolates represented a variety of viridans streptococci, including unknown species. Streptococcus oralis was frequently isolated from samples from great apes. Genotypic methods revealed that most of the strains clustered on separate lineages outside the main cluster of human S. oralis strains. This suggests that S. oralis is part of the commensal flora in higher primates and evolved prior to humans. Many genes described as virulence factors in Streptococcus pneumoniae were present also in other viridans streptococcal genomes. Unlike in S. pneumoniae, clustered regularly interspaced short palindromic repeat (CRISPR)–CRISPR-associated protein (Cas) gene clusters were common among viridans streptococci, and many S. oralis strains were type PI-2 (pilus islet 2) variants. S. oralis displayed a remarkable diversity of genes involved in the biosynthesis of peptidoglycan (penicillin-binding proteins and MurMN) and choline-containing teichoic acid. The small noncoding cia-dependent small RNAs (csRNAs) controlled by the response regulator CiaR might contribute to the genomic diversity, since we observed novel genomic islands between duplicated csRNAs, variably present in some isolates. All S. oralis genomes contained a β-N-acetyl-hexosaminidase gene absent in S. pneumoniae, which in contrast frequently harbors the neuraminidases NanB/C, which are absent in S. oralis. The identification of S. oralis-specific genes will help us to understand their adaptation to diverse habitats. IMPORTANCE Streptococcus pneumoniae is a rare example of a human-pathogenic bacterium among viridans streptococci, which consist of commensal symbionts, such as the close relatives Streptococcus mitis and S. oralis. We have shown that S. 
oralis can frequently be isolated from primates and a variety of other viridans streptococci as well. Genes and genomic islands which are known pneumococcal virulence factors are present in S. oralis and S. mitis, documenting the widespread occurrence of these compounds, which encode surface and secreted proteins. The frequent occurrence of CRISPR-Cas gene clusters and a surprising variation of a set of small noncoding RNAs are factors to be considered in future research to further our understanding of mechanisms involved in the genomic diversity driven by horizontal gene transfer among viridans streptococci. PMID:27303717

  7. Sequence Similarity of Clostridium difficile Strains by Analysis of Conserved Genes and Genome Content Is Reflected by Their Ribotype Affiliation

    PubMed Central

    Kurka, Hedwig; Ehrenreich, Armin; Ludwig, Wolfgang; Monot, Marc; Rupnik, Maja; Barbut, Frederic; Indra, Alexander; Dupuy, Bruno; Liebl, Wolfgang

    2014-01-01

    PCR-ribotyping is a broadly used method for the classification of isolates of Clostridium difficile, an emerging intestinal pathogen, causing infections with increased disease severity and incidence in several European and North American countries. We have now carried out clustering analysis with selected genes of numerous C. difficile strains as well as gene content comparisons of their genomes in order to broaden our view of the relatedness of strains assigned to different ribotypes. We analyzed the genomic content of 48 C. difficile strains representing 21 different ribotypes. The calculation of distance matrix-based dendrograms using the neighbor joining method for 14 conserved genes (standard phylogenetic marker genes) from the genomes of the C. difficile strains demonstrated that the genes from strains with the same ribotype generally clustered together. Further, certain ribotypes always clustered together and formed ribotype groups, i.e. ribotypes 078, 033 and 126, as well as ribotypes 002 and 017, indicating their relatedness. Comparisons of the gene contents of the genomes of ribotypes that clustered according to the conserved gene analysis revealed that the number of common genes of the ribotypes belonging to each of these three ribotype groups were very similar for the 078/033/126 group (at most 69 specific genes between the different strains with the same ribotype) but less similar for the 002/017 group (86 genes difference). It appears that the ribotype is indicative not only of a specific pattern of the amplified 16S–23S rRNA intergenic spacer but also reflects specific differences in the nucleotide sequences of the conserved genes studied here. It can be anticipated that the sequence deviations of more genes of C. difficile strains are correlated with their PCR-ribotype. In conclusion, the results of this study corroborate and extend the concept of clonal C. difficile lineages, which correlate with ribotypes affiliation. PMID:24482682

  8. RNA-Seq Analysis of Developing Pecan (Carya illinoinensis) Embryos Reveals Parallel Expression Patterns among Allergen and Lipid Metabolism Genes.

    PubMed

    Mattison, Christopher P; Rai, Ruhi; Settlage, Robert E; Hinchliffe, Doug J; Madison, Crista; Bland, John M; Brashear, Suzanne; Graham, Charles J; Tarver, Matthew R; Florane, Christopher; Bechtel, Peter J

    2017-02-22

    The pecan nut is a nutrient-rich part of a healthy diet full of beneficial fatty acids and antioxidants, but can also cause allergic reactions in people suffering from food allergy to the nuts. The transcriptome of a developing pecan nut was characterized to identify the gene expression occurring during the process of nut development and to highlight those genes involved in fatty acid metabolism and those that commonly act as food allergens. Pecan samples were collected at several time points during the embryo development process including the water, gel, dough, and mature nut stages. Library preparation and sequencing were performed using Illumina-based mRNA HiSeq with RNA from four time points during the growing season during August and September 2012. Sequence analysis with Trinotate software following the Trinity protocol identified 133,000 unigenes with 52,267 named transcripts and 45,882 annotated genes. A total of 27,312 genes were defined by GO annotation. Gene expression clustering analysis identified 12 different gene expression profiles, each containing a number of genes. Three pecan seed storage proteins that commonly act as allergens, Car i 1, Car i 2, and Car i 4, were significantly up-regulated during the time course. Up-regulated fatty acid metabolism genes that were identified included acyl-[ACP] desaturase and omega-6 desaturase genes involved in oleic and linoleic acid metabolism. Notably, a few of the up-regulated acyl-[ACP] desaturase and omega-6 desaturase genes that were identified have expression patterns similar to the allergen genes based upon gene expression clustering and qPCR analysis. These findings suggest the possibility of coordinated accumulation of lipids and allergens during pecan nut embryogenesis.

  9. Genetic dissection of the resistance to nine anthracnose races in the common bean differential cultivars MDRK and TU.

    PubMed

    Campa, Ana; Giraldez, Ramón; Ferreira, Juan José

    2009-06-01

    Resistance to nine races of the pathogenic fungus Colletotrichum lindemuthianum, causal agent of anthracnose, was evaluated in F(3) families derived from the cross between the anthracnose differential bean cultivars TU (resistant to races 3, 6, 7, 31, 38, 39, 102, and 449) and MDRK (resistant to races 449 and 1545). Molecular marker analyses were carried out in the F(2) individuals in order to map and characterize the anthracnose resistance genes or gene clusters present in these two differential cultivars. The results of the combined segregation indicate that at least three independent loci conferring resistance to anthracnose are present in TU. One of them, corresponding to the previously described anthracnose resistance locus Co-5, is located in linkage group B7, and is formed by a cluster of different genes conferring specific resistance to races 3, 6, 7, 31, 38, 39, 102, and 449. Evidence of intra-cluster recombination between these specific resistance genes was found. The second locus present in TU confers specific resistance to races 31 and 102, and the third locus confers specific resistance to race 102; the location of these two loci remains unknown. The resistance to race 1545 present in MDRK is due to two independent dominant genes. The results of the combined segregation of two F(4) families showing monogenic segregation for resistance to race 1545 indicate that one of these two genes is linked to marker OF10(530), located in linkage group B1, and corresponds to the previously described anthracnose resistance locus Co-1. The second gene conferring resistance to race 1545 in MDRK is linked to marker Pv-ctt001, located in linkage group B4, and corresponds to the Co-3/Co-9 cluster. The resistance to race 449 present in MDRK is conferred by a single gene, located in linkage group B4, probably included in the same Co-3/Co-9 cluster.

  10. Clustering cancer gene expression data by projective clustering ensemble

    PubMed Central

    Yu, Xianxue; Yu, Guoxian

    2017-01-01

    Gene expression data analysis has paramount implications for gene treatments, cancer diagnosis and other domains. Clustering is an important and promising tool to analyze gene expression data. Gene expression data is often characterized by a large number of genes but limited samples; thus various projective clustering techniques and ensemble techniques have been suggested to combat these challenges. However, it is rather challenging to combine these two kinds of techniques so as to avoid the curse of dimensionality and to boost the performance of gene expression data clustering. In this paper, we employ a projective clustering ensemble (PCE) to integrate the advantages of projective clustering and ensemble clustering, and to avoid the dilemma of combining multiple projective clusterings. Our experimental results on publicly available cancer gene expression data show that PCE can improve the quality of clustering gene expression data by at least 4.5% (on average) over other related techniques, including dimensionality reduction based single clustering and ensemble approaches. The empirical study demonstrates that, to further boost the performance of clustering cancer gene expression data, it is necessary and promising to combine projective clustering with ensemble clustering. PCE can serve as an effective alternative technique for clustering gene expression data. PMID:28234920

  11. Consensus properties and their large-scale applications for the gene duplication problem.

    PubMed

    Moon, Jucheol; Lin, Harris T; Eulenstein, Oliver

    2016-06-01

    Solving the gene duplication problem is a classical approach for species tree inference from gene trees that are confounded by gene duplications. This problem takes a collection of gene trees and seeks a species tree that implies the minimum number of gene duplications. Wilkinson et al. posed the conjecture that the gene duplication problem satisfies the desirable Pareto property for clusters. That is, for every instance of the problem, all clusters that are commonly present in the input gene trees of this instance, called strict consensus, will also be found in every solution to this instance. We prove that this conjecture does not generally hold. Despite this negative result we show that the gene duplication problem satisfies a weaker version of the Pareto property where the strict consensus is found in at least one solution (rather than all solutions). This weaker property contributes to our design of an efficient scalable algorithm for the gene duplication problem. We demonstrate the performance of our algorithm in analyzing large-scale empirical datasets. Finally, we utilize the algorithm to evaluate the accuracy of standard heuristics for the gene duplication problem using simulated datasets.
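
    The cluster notions used in this abstract can be made concrete with a small sketch. It assumes, for illustration only, that each rooted tree is written as nested Python tuples with species names at the leaves and that all input trees share the same leaf set. A cluster is then the leaf set below an internal node, and the strict consensus is the set of clusters present in every input tree. The function names are hypothetical, and this is not the authors' algorithm for the gene duplication problem.

```python
def clusters(tree):
    """Collect the clusters (leaf sets below internal nodes) of a rooted tree.

    Assumption: a tree is a nested tuple; leaves are strings (species names).
    """
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf contributes only itself
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)                  # record the cluster of this internal node
        return leaves

    walk(tree)
    return found


def strict_consensus_clusters(trees):
    """Return the clusters that occur in every input tree."""
    return set.intersection(*(clusters(t) for t in trees))


if __name__ == "__main__":
    t1 = ((("a", "b"), "c"), "d")
    t2 = ((("a", "b"), "d"), "c")
    for cluster in sorted(strict_consensus_clusters([t1, t2]), key=len):
        print(sorted(cluster))
```

    The Pareto property discussed in the abstract asks whether every such shared cluster also appears in every optimal species tree; the sketch only computes the shared clusters themselves.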

  12. Leptokurtic pollen-flow, non-leptokurtic gene-flow in a wind-pollinated herb, Plantago lanceolata L.

    PubMed

    Tonsor, Stephen J

    1985-10-01

    The purpose of this study was to simultaneously measure pollen dispersal distance and actual pollen-mediated gene-flow distance in a wind-pollinated herb, Plantago lanceolata. The pollen dispersal distribution, measured as pollen deposition in a wind tunnel, is leptokurtic, as expected from previous studies of wind-pollinated plants. Gene-flow, measured as seeds produced on rows of male-sterile inflorescences in the wind tunnel, is non-leptokurtic, peaking at an intermediate distance. The difference between the two distributions results from the tendency of the pollen grains to cluster. These pollen clusters are the units of gene dispersal, with clusters of intermediate and large size contributing disproportionately to gene-flow. Since many wind-pollinated species show pollen clustering (see text), the common assumption for wind-pollinated plants that gene-flow is leptokurtic requires re-examination. Gene-flow was also measured in an artificial outdoor population of male-steriles, containing a single pollen source plant in the center of the array. The gene flow distribution is significantly platykurtic, and has the same general properties outdoors, where wind speed and turbulence are uncontrolled, as it does in the wind tunnel. I estimated genetic neighborhood size based on my measure of gene-flow in the outdoor population. The estimate shows that populations of Plantago lanceolata will vary in effective number from a few tens of plants to more than five hundred plants, depending on the density of the population in question. Thus, the measured pollen-mediated gene-flow distribution and population density will interact to produce effective population sizes ranging from those in which there is no random genetic drift to those in which random genetic drift plays an important role in determining gene frequencies within and among populations.
Despite the platykurtosis in the distribution, pollen-mediated gene dispersal distances are still quite limited, and considerable within and among-population genetic differentiation is to be expected in this species.
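
    As a reminder of the terms used in this abstract, a distribution is leptokurtic when its excess kurtosis is positive and platykurtic when it is negative. The sketch below computes excess kurtosis from a sample using population (biased) moments; the function name and the toy numbers are illustrative assumptions and are not data from the study.

```python
def excess_kurtosis(values):
    """Excess kurtosis m4 / m2**2 - 3 using population moments.

    Positive: leptokurtic (peaked, heavy tails). Negative: platykurtic (flat).
    Assumes at least two distinct values, so the variance is non-zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return m4 / (m2 ** 2) - 3.0


if __name__ == "__main__":
    flat = [1, 2, 3, 4, 5]                 # evenly spread toy sample
    peaked = [0] * 8 + [-10, 10]           # mass at the centre plus two far values
    print(excess_kurtosis(flat), excess_kurtosis(peaked))
```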

  13. Alpha-globin gene haplotypes in South American Indians.

    PubMed

    Zago, M A; Melo Santos, E J; Clegg, J B; Guerreiro, J F; Martinson, J J; Norwich, J; Figueiredo, M S

    1995-08-01

    The haplotypes of the alpha-globin gene cluster were determined for 99 Indians from the Brazilian Amazon region who belong to 5 tribes: Wayampí, Wayana-Apalaí, Kayapó, Arára, and Yanomámi. Three predominant haplotypes were identified: Ia (present in 38.9% of chromosomes), IIIa (25.8%), and IIe (22.1%). The only alpha-globin gene rearrangement detected was alpha alpha alpha 3.7 I gene triplication associated with haplotype IIIa, found in high frequencies (5.6% and 10.6%) in two tribes and absent in the others. alpha-Globin gene deletions that cause alpha-thalassemia were not seen, supporting the argument that malaria was absent in these populations until recently. The heterogeneous distribution of alpha-globin gene haplotypes and rearrangements among the different tribes differs markedly from the homogeneous distribution of beta-globin gene cluster haplotypes and reflects the action of various genetic mechanisms (genetic drift, founder effect, consanguinity) on small isolated population groups with a complicated history of divergence-fusion events. The alpha-globin gene haplotype distribution has some similarities to distributions observed in Southeast Asian and Pacific Island populations, indicating that these populations have considerable genetic affinities. However, the absence of several features of the alpha-globin gene cluster that are consistently present among the Pacific Islanders suggests that the similarity of haplotypes between Brazilian Indians and people from Polynesia, Micronesia, and Melanesia is more likely to result from ancient common ancestry than from recent direct genetic contribution through immigration.

  14. The drug target genes show higher evolutionary conservation than non-target genes.

    PubMed

    Lv, Wenhua; Xu, Yongdeng; Guo, Yiying; Yu, Ziqi; Feng, Guanglong; Liu, Panpan; Luan, Meiwei; Zhu, Hongjie; Liu, Guiyou; Zhang, Mingming; Lv, Hongchao; Duan, Lian; Shang, Zhenwei; Li, Jin; Jiang, Yongshuai; Zhang, Ruijie

    2016-01-26

    Although evidence indicates that drug target genes share some common evolutionary features, there have been few studies analyzing evolutionary features of drug targets at an overall level. Therefore, we conducted an analysis which aimed to investigate the evolutionary characteristics of drug target genes. We compared the evolutionary conservation between human drug target genes and non-target genes by combining both the evolutionary features and network topological properties in the human protein-protein interaction network. The evolution rate, conservation score and the percentage of orthologous genes of 21 species were included in our study. Meanwhile, four topological features including the average shortest path length, betweenness centrality, clustering coefficient and degree were considered for comparison analysis. We obtained four results. Compared with non-drug target genes, 1) drug target genes had lower evolutionary rates; 2) drug target genes had higher conservation scores; 3) drug target genes had higher percentages of orthologous genes and 4) drug target genes had a tighter network structure including higher degrees, betweenness centrality, clustering coefficients and lower average shortest path lengths. These results demonstrate that drug target genes are more evolutionarily conserved than non-drug target genes. We hope that our study will provide valuable information for other researchers who are interested in evolutionary conservation of drug targets.
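
    Two of the topological features named in this abstract, degree and clustering coefficient, can be illustrated on a toy undirected graph. The sketch below is a minimal, self-contained example; the node labels and function names are hypothetical and the edge list is not taken from any real protein-protein interaction network.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def degree(adjacency, node):
    """Number of distinct interaction partners of a node."""
    return len(adjacency[node])


def clustering_coefficient(adjacency, node):
    """Fraction of neighbour pairs that are themselves connected.

    By convention here, nodes with fewer than two neighbours get 0.0.
    """
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    # Toy network: a triangle A-B-C with one extra partner D attached to C.
    adj = build_adjacency([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
    for node in ["A", "B", "C", "D"]:
        print(node, degree(adj, node), clustering_coefficient(adj, node))
```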

  15. Biological mechanism analysis of acute renal allograft rejection: integrated of mRNA and microRNA expression profiles.

    PubMed

    Huang, Shi-Ming; Zhao, Xia; Zhao, Xue-Mei; Wang, Xiao-Ying; Li, Shan-Shan; Zhu, Yu-Hui

    2014-01-01

    Renal transplantation is the preferred method for most patients with end-stage renal disease; however, acute renal allograft rejection is still a major risk factor for recipients, leading to renal injury. To improve the early diagnosis and treatment of acute rejection, study of its molecular mechanism is urgent. MicroRNA (miRNA) and mRNA expression profiles of acute renal allograft rejection and well-functioning allografts, downloaded from the ArrayExpress database, were used to identify differentially expressed (DE) miRNAs and DE mRNAs. DE miRNA targets were predicted by combining five algorithms. By overlapping the DE mRNAs and DE miRNA targets, common genes were obtained. Differentially co-expressed genes (DCGs) were identified by the differential co-expression profile (DCp) and differential co-expression enrichment (DCe) methods in the Differentially Co-expressed Genes and Links (DCGL) package. Then, a co-expression network of DCGs was built and cluster analysis was performed. Functional enrichment analysis for DCGs was carried out. A total of 1270 miRNA targets were predicted and 698 DE mRNAs were obtained. Overlapping the miRNA targets and DE mRNAs yielded 59 common genes. We obtained 103 DCGs and 5 transcription factors (TFs) based on regulatory impact factors (RIF), then built the regulation network of miRNA targets and DE mRNAs. By clustering the co-expression network, 5 modules were obtained. Among them, module 1 had the highest degree and module 2 contained the largest number of DCGs and common genes. TF CEBPB and several common genes, such as RXRA, BASP1 and AKAP10, were mapped on the co-expression network. C1R showed the highest degree in the network. These genes might be associated with human acute renal allograft rejection.
We conducted biological analysis on integration of DE mRNA and DE miRNA in acute renal allograft rejection, displayed gene expression patterns and screened out genes and TFs that may be related to acute renal allograft rejection.

  16. Biological mechanism analysis of acute renal allograft rejection: integrated of mRNA and microRNA expression profiles

    PubMed Central

    Huang, Shi-Ming; Zhao, Xia; Zhao, Xue-Mei; Wang, Xiao-Ying; Li, Shan-Shan; Zhu, Yu-Hui

    2014-01-01

    Objectives: Renal transplantation is the preferred method for most patients with end-stage renal disease; however, acute renal allograft rejection is still a major risk factor for recipients, leading to renal injury. To improve the early diagnosis and treatment of acute rejection, study of its molecular mechanism is urgent. Methods: MicroRNA (miRNA) and mRNA expression profiles of acute renal allograft rejection and well-functioning allografts, downloaded from the ArrayExpress database, were used to identify differentially expressed (DE) miRNAs and DE mRNAs. DE miRNA targets were predicted by combining five algorithms. By overlapping the DE mRNAs and DE miRNA targets, common genes were obtained. Differentially co-expressed genes (DCGs) were identified by the differential co-expression profile (DCp) and differential co-expression enrichment (DCe) methods in the Differentially Co-expressed Genes and Links (DCGL) package. Then, a co-expression network of DCGs was built and cluster analysis was performed. Functional enrichment analysis for DCGs was carried out. Results: A total of 1270 miRNA targets were predicted and 698 DE mRNAs were obtained. Overlapping the miRNA targets and DE mRNAs yielded 59 common genes. We obtained 103 DCGs and 5 transcription factors (TFs) based on regulatory impact factors (RIF), then built the regulation network of miRNA targets and DE mRNAs. By clustering the co-expression network, 5 modules were obtained. Among them, module 1 had the highest degree and module 2 contained the largest number of DCGs and common genes. TF CEBPB and several common genes, such as RXRA, BASP1 and AKAP10, were mapped on the co-expression network. C1R showed the highest degree in the network. These genes might be associated with human acute renal allograft rejection.
Conclusions: We conducted biological analysis on integration of DE mRNA and DE miRNA in acute renal allograft rejection, displayed gene expression patterns and screened out genes and TFs that may be related to acute renal allograft rejection. PMID:25664019

  17. Identifying and Assessing Interesting Subgroups in a Heterogeneous Population.

    PubMed

    Lee, Woojoo; Alexeyenko, Andrey; Pernemalm, Maria; Guegan, Justine; Dessen, Philippe; Lazar, Vladimir; Lehtiö, Janne; Pawitan, Yudi

    2015-01-01

    Biological heterogeneity is common in many diseases and it is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability--the basis of cluster generation--is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm and to assess significance we use a false discovery rate- (FDR-) based measure. Under the heterogeneity condition the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided.
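
    For readers unfamiliar with false discovery rate control, the sketch below shows the standard Benjamini-Hochberg step-up procedure on a list of p-values. It is offered only as background on FDR-based significance; it is not the improved FDR estimator described in this abstract, and the function name and example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Standard step-up rule: sort the p-values, find the largest rank k with
    p_(k) <= k * alpha / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices by ascending p
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank                            # keep the largest passing rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_hochberg([0.001, 0.02, 0.04, 0.5]))
```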

  18. A statistically inferred microRNA network identifies breast cancer target miR-940 as an actin cytoskeleton regulator

    NASA Astrophysics Data System (ADS)

    Bhajun, Ricky; Guyon, Laurent; Pitaval, Amandine; Sulpice, Eric; Combe, Stéphanie; Obeid, Patricia; Haguet, Vincent; Ghorbel, Itebeddine; Lajaunie, Christian; Gidrol, Xavier

    2015-02-01

    MiRNAs are key regulators of gene expression. By binding to many genes, they create a complex network of gene co-regulation. Here, using a network-based approach, we identified miRNA hub groups by their close connections and common targets. In one cluster containing three miRNAs, miR-612, miR-661 and miR-940, the annotated functions of the co-regulated genes suggested a role in small GTPase signalling. Although the three members of this cluster targeted the same subset of predicted genes, we showed that their overexpression impacted cell fates differently. miR-661 demonstrated enhanced phosphorylation of myosin II and an increase in cell invasion, indicating a possible oncogenic miRNA. On the contrary, miR-612 and miR-940 inhibit phosphorylation of myosin II and cell invasion. Finally, expression profiling in human breast tissues showed that miR-940 was consistently downregulated in breast cancer tissues.

  19. The structure of a gene co-expression network reveals biological functions underlying eQTLs.

    PubMed

    Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

    2013-01-01

    What are the commonalities between genes whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.

  20. Prediction of epigenetically regulated genes in breast cancer cell lines.

    PubMed

    Loss, Leandro A; Sadanandam, Anguraj; Durinck, Steffen; Nautiyal, Shivani; Flaucher, Diane; Carlton, Victoria E H; Moorhead, Martin; Lu, Yontao; Gray, Joe W; Faham, Malek; Spellman, Paul; Parvin, Bahram

    2010-06-04

    Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. Our two-step dimensionality reduction compressed 90% of the original data, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. 
Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
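
    The idea of quantifying significance through permutation analysis can be sketched as follows. This example swaps in a plain Pearson correlation as the association statistic, whereas the abstract describes ranking associations with a logistic regression, so it should be read as an illustration of permutation testing only. The function names, the permutation count, and the toy methylation and expression values are all assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """One-sided permutation p-value for a negative correlation.

    Shuffles y to break any pairing with x and counts how often the shuffled
    correlation is at least as negative as the observed one.
    """
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(x, shuffled) <= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself, avoiding p = 0.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    methylation = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]   # toy values per cell line
    expression = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]    # toy values per cell line
    r, p = permutation_p_value(methylation, expression)
    print(r, p)
```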

  1. Mucopolysaccharidosis type IVA: Common double deletion in the N-Acetylgalactosamine-6-sulfatase gene (GALNS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hori, Toshinori; Tomatsu, Shunji; Fukuda, Seiji

    1995-04-10

    Mucopolysaccharidosis IVA (MPS IVA) is an autosomal recessive disorder caused by a deficiency in N-acetylgalactosamine-6-sulfatase (GALNS). We found two separate deletions of nearly 8.0 and 6.0 kb in the GALNS gene, including some exons. There are Alu repetitive elements near the breakpoints of the 8.0-kb deletion, and this deletion resulted from an Alu-Alu recombination. The other 6.0-kb deletion involved illegitimate recombinational events between incomplete short direct repeats of 8 bp at deletion breakpoints. The same rearrangement has been observed in a heteroallelic state in four unrelated patients. This is the first documentation of a common double deletion in a gene that is not a member of a gene cluster. 39 refs., 5 figs.

  2. Deciphering the Anti-Aflatoxinogenic Properties of Eugenol Using a Large-Scale q-PCR Approach

    PubMed Central

    Caceres, Isaura; El Khoury, Rhoda; Medina, Ángel; Lippi, Yannick; Naylies, Claire; Atoui, Ali; El Khoury, André; Oswald, Isabelle P.; Bailly, Jean-Denis; Puel, Olivier

    2016-01-01

    Produced by several species of Aspergillus, Aflatoxin B1 (AFB1) is a carcinogenic mycotoxin contaminating many crops worldwide. The utilization of fungicides is currently one of the most common methods; nevertheless, their use is not environmentally or economically sound. Thus, the use of natural compounds able to block aflatoxinogenesis could represent an alternative strategy to limit food and feed contamination. For instance, eugenol, a 4-allyl-2-methoxyphenol present in many essential oils, has been identified as an anti-aflatoxin molecule. However, its precise mechanism of action has yet to be clarified. The production of AFB1 is associated with the expression of a 70 kb cluster, and no fewer than 21 enzymatic reactions are necessary for its production. Based on former empirical data, a molecular tool composed of 60 genes, targeting 27 genes of the aflatoxin B1 cluster and 33 genes encoding the main regulatory factors potentially involved in its production, was developed. We showed that AFB1 inhibition in Aspergillus flavus following eugenol addition at 0.5 mM in a Malt Extract Agar (MEA) medium resulted in a complete inhibition of the expression of all but one gene of the AFB1 biosynthesis cluster. This transcriptomic effect followed a down-regulation of the complex composed by the two internal regulatory factors, AflR and AflS. This phenomenon was also influenced by an over-expression of veA and mtfA, two genes that are directly linked to AFB1 cluster regulation. PMID:27128940

  3. Polycistronic gene expression in Aspergillus niger.

    PubMed

    Schuetze, Tabea; Meyer, Vera

    2017-09-25

    Genome mining approaches predict dozens of biosynthetic gene clusters in each of the filamentous fungal genomes sequenced so far. However, the majority of these gene clusters still remain cryptic because they are not expressed in their natural host. Simultaneous expression of all genes belonging to a biosynthetic pathway in a heterologous host is one approach to activate biosynthetic gene clusters and to screen the metabolites produced for bioactivities. Polycistronic expression of all pathway genes under control of a single and tunable promoter would be the method of choice, as this not only simplifies cloning procedures but also offers control over the timing and strength of expression. However, polycistronic gene expression is a feature not commonly found in eukaryotic host systems, such as Aspergillus niger. In this study, we tested the suitability of the viral P2A peptide for co-expression of three genes in A. niger. Two genes descend from Fusarium oxysporum and are essential to produce the secondary metabolite enniatin (esyn1, ekivR). The third gene (luc) encodes the reporter luciferase, which was included to study position effects. Expression of the polycistronic gene cassette was put under control of the Tet-On system to ensure tunable gene expression in A. niger. In total, three polycistronic expression cassettes which differed in the position of luc were constructed and targeted to the pyrG locus in A. niger. This allowed direct comparison of the luciferase activity based on the position of the luciferase gene. Doxycycline-mediated induction of the Tet-On expression cassettes resulted in the production of one long polycistronic mRNA, as proven by Northern analyses, and ensured comparable production of enniatin in all three strains. Notably, gene position within the polycistronic expression cassette matters, as luciferase activity was lowest at position one and was comparable at positions two and three. 
The P2A peptide can be used to express at least three genes polycistronically in A. niger. This approach can now be applied to heterologously express entire secondary metabolite gene clusters polycistronically or to co-express any genes of interest in equimolar amounts.

  4. Phylogenetic comparisons of a coastal bacterioplankton community with its counterparts in open ocean and freshwater systems.

    PubMed

    Rappé; Vergin; Giovannoni

    2000-09-01

    In order to extend previous comparisons between coastal marine bacterioplankton communities and their open ocean and freshwater counterparts, here we summarize and provide new data on a clone library of 105 SSU rRNA genes recovered from seawater collected over the western continental shelf of the USA in the Pacific Ocean. Comparisons to previously published data revealed that this coastal bacterioplankton clone library was dominated by SSU rRNA gene phylotypes originally described from surface waters of the open ocean, but also revealed unique SSU rRNA gene lineages of beta Proteobacteria related to those found in clone libraries from freshwater habitats. beta Proteobacteria lineages common to coastal and freshwater samples included members of a clade of obligately methylotrophic bacteria, SSU rRNA genes affiliated with Xylophilus ampelinus, and a clade related to the genus Duganella. In addition, SSU rRNA genes were recovered from such previously recognized marine bacterioplankton SSU rRNA gene clone clusters as the SAR86, SAR11, and SAR116 clusters within the class Proteobacteria, the Roseobacter clade of the alpha subclass of the Proteobacteria, the marine group A/SAR406 cluster, and the marine Actinobacteria clade. Overall, these results support and extend previous observations concerning the global distribution of several marine planktonic prokaryote SSU rRNA gene phylotypes, but also show that coastal bacterioplankton communities contain SSU rRNA gene lineages (and presumably bacterioplankton) shown previously to be prevalent in freshwater habitats.

  5. Analysis of FOXF1 and the FOX gene cluster in patients with VACTERL association

    PubMed Central

    Agochukwu, Nneamaka B.; Pineda-Alvarez, Daniel E.; Keaton, Amelia A.; Warren-Mora, Nicole; Raam, Manu S.; Kamat, Aparna; Chandrasekharappa, Settara C.; Solomon, Benjamin D.

    2011-01-01

    VACTERL association, a relatively common condition with an incidence of approximately 1 in 20,000 – 35,000 births, is a non-random association of birth defects that includes vertebral defects (V), anal atresia (A), cardiac defects (C), tracheo-esophageal fistula (TE), renal anomalies (R) and limb malformations (L). Although the etiology is unknown in the majority of patients, there is evidence that it is causally heterogeneous. Several studies have shown evidence for inheritance in VACTERL, implying a role for genetic loci. Recently, patients with component features of VACTERL and a lethal developmental pulmonary disorder, alveolar capillary dysplasia with misalignment of pulmonary veins (ACD/MPV), were found to harbor deletions or mutations affecting FOXF1 and the FOX gene cluster on chromosome 16q24. We investigated this gene through direct sequencing and high-density SNP microarray in 12 patients with VACTERL association but without ACD/MPV. Our mutational analysis of FOXF1 showed normal sequences and no genomic imbalances affecting the FOX gene cluster on chromosome 16q24 in the studied patients. Possible explanations for these results include the etiologic and clinical heterogeneity of VACTERL association, the possibility that mutations affecting this gene may occur only in more severely affected individuals, and insufficient study sample size. PMID:21315191

  6. Highly preserved consensus gene modules in human papilloma virus 16 positive cervical cancer and head and neck cancers.

    PubMed

    Zhang, Xianglan; Cha, In-Ho; Kim, Ki-Yeol

    2017-12-26

    In this study, we investigated the consensus gene modules in head and neck cancer (HNC) and cervical cancer (CC). We used a publicly available gene expression dataset, GSE6791, which included 42 HNC, 14 normal head and neck, 20 CC and 8 normal cervical tissue samples. To exclude bias because of different human papilloma virus (HPV) types, we analyzed HPV16-positive samples only. We identified 3824 genes common to HNC and CC samples. Among these, 977 genes showed high connectivity and were used to construct consensus modules. We demonstrated eight consensus gene modules for HNC and CC using the dissimilarity measure and average linkage hierarchical clustering methods. These consensus modules included genes with significant biological functions, including ATP binding and extracellular exosome. Eigengene network analysis revealed the consensus modules were highly preserved with high connectivity. These findings demonstrate that HPV16-positive head and neck and cervical cancers share highly preserved consensus gene modules with common potentially therapeutic targets.

  7. Highly preserved consensus gene modules in human papilloma virus 16 positive cervical cancer and head and neck cancers

    PubMed Central

    Zhang, Xianglan; Cha, In-Ho; Kim, Ki-Yeol

    2017-01-01

    In this study, we investigated the consensus gene modules in head and neck cancer (HNC) and cervical cancer (CC). We used a publicly available gene expression dataset, GSE6791, which included 42 HNC, 14 normal head and neck, 20 CC and 8 normal cervical tissue samples. To exclude bias because of different human papilloma virus (HPV) types, we analyzed HPV16-positive samples only. We identified 3824 genes common to HNC and CC samples. Among these, 977 genes showed high connectivity and were used to construct consensus modules. We demonstrated eight consensus gene modules for HNC and CC using the dissimilarity measure and average linkage hierarchical clustering methods. These consensus modules included genes with significant biological functions, including ATP binding and extracellular exosome. Eigengene network analysis revealed the consensus modules were highly preserved with high connectivity. These findings demonstrate that HPV16-positive head and neck and cervical cancers share highly preserved consensus gene modules with common potentially therapeutic targets. PMID:29371966

  8. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    PubMed

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within the immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should also contribute to functional annotation efforts and provide novel insight into evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration, for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. 
Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
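
    The idea of scanning neighboring genes for co-expression can be sketched as follows. This is a simplified illustration rather than the published method: the abstract compares neighbor correlations against randomly selected gene pairs under FDR control, while this sketch uses a fixed correlation threshold. The gene identifiers, expression values, threshold, and minimum cluster size are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def neighbor_clusters(profiles, threshold=0.8, min_size=3):
    """Group consecutive genes whose adjacent-pair correlation >= threshold.

    `profiles` is a list of (gene_id, expression_values) tuples given in
    chromosomal order. Runs shorter than `min_size` are discarded.
    """
    clusters = []
    current = [profiles[0][0]]
    for (_, prev_values), (gene, values) in zip(profiles, profiles[1:]):
        if pearson(prev_values, values) >= threshold:
            current.append(gene)
        else:
            if len(current) >= min_size:
                clusters.append(current)
            current = [gene]
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Hypothetical genes in chromosomal order, each measured in five samples.
profiles = [
    ("g1", [1, 2, 3, 4, 5]),
    ("g2", [2, 4, 5, 8, 10]),
    ("g3", [1, 3, 2, 5, 6]),
    ("g4", [5, 1, 4, 2, 3]),
    ("g5", [3, 3, 1, 5, 2]),
]
clusters = neighbor_clusters(profiles)
```

    A fixed threshold is the simplest possible design choice; replacing it with a cutoff derived from the correlation distribution of random gene pairs would bring the sketch closer to the approach the abstract describes.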

  9. Transcription of Two Adjacent Carbohydrate Utilization Gene Clusters in Bifidobacterium breve UCC2003 Is Controlled by LacI- and Repressor Open Reading Frame Kinase (ROK)-Type Regulators

    PubMed Central

    O'Connell, Kerry Joan; O'Connell Motherway, Mary; Liedtke, Andrea; Fitzgerald, Gerald F.; Ross, R. Paul; Stanton, Catherine; Zomer, Aldert

    2014-01-01

    Members of the genus Bifidobacterium are commonly found in the gastrointestinal tracts of mammals, including humans, where their growth is presumed to be dependent on various diet- and/or host-derived carbohydrates. To understand transcriptional control of bifidobacterial carbohydrate metabolism, we investigated two genetic carbohydrate utilization clusters dedicated to the metabolism of raffinose-type sugars and melezitose. Transcriptomic and gene inactivation approaches revealed that the raffinose utilization system is positively regulated by an activator protein, designated RafR. The gene cluster associated with melezitose metabolism was shown to be subject to direct negative control by a LacI-type transcriptional regulator, designated MelR1, in addition to apparent indirect negative control by means of a second LacI-type regulator, MelR2. In silico analysis, DNA-protein interaction, and primer extension studies revealed the MelR1 and MelR2 operator sequences, each of which is positioned just upstream of or overlapping the correspondingly regulated promoter sequences. Similar analyses identified the RafR binding operator sequence located upstream of the rafB promoter. This study indicates that transcriptional control of gene clusters involved in carbohydrate metabolism in bifidobacteria is subject to conserved regulatory systems, representing either positive or negative control. PMID:24705323

  10. Transgenic Over Expression of Nicotinic Receptor Alpha 5, Alpha 3, and Beta 4 Subunit Genes Reduces Ethanol Intake in Mice

    PubMed Central

    Gallego, Xavier; Ruiz, Jessica; Valverde, Olga; Molas, Susanna; Robles, Noemí; Sabrià, Josefa; Crabbe, John C.; Dierssen, Mara

    2012-01-01

    Abuse of alcohol and smoking are extensively co-morbid. Some studies suggest partial commonality of action of alcohol and nicotine mediated through nicotinic acetylcholine receptors (nAChRs). We tested mice with transgenic over expression of the alpha 5, alpha 3, beta 4 receptor subunit genes, which lie in a cluster on human chromosome 15, that were previously shown to have increased nicotine self-administration, for several responses to ethanol. Transgenic and wild-type mice did not differ in sensitivity to several acute behavioral responses to ethanol. However, transgenic mice drank less ethanol than wild-type in a two-bottle (ethanol vs. water) preference test. These results suggest a complex role for this receptor subunit gene cluster in the modulation of ethanol’s as well as nicotine’s effects. PMID:22459873

  11. A cluster of culture positive gonococcal infections but with false negative cppB gene based PCR.

    PubMed

    Lum, G; Freeman, K; Nguyen, N L; Limnios, E A; Tabrizi, S N; Carter, I; Chambers, I W; Whiley, D M; Sloots, T P; Garland, S M; Tapsall, J W

    2005-10-01

    To describe the prevalence and characteristics of isolates of Neisseria gonorrhoeae grown from urine samples that produced negative results with nucleic acid amplification assays (NAA) targeting the cppB gene. An initial cluster of culture positive, but cppB gene based NAA negative, gonococcal infections was recognised. Urine samples and suspensions of gonococci isolated over 9 months in the Northern Territory of Australia were examined using cppB gene based and other non-cppB gene based NAA. The gonococcal isolates were phenotyped by determining the auxotype/serovar (A/S) class and genotyped by pulsed field gel electrophoresis (PFGE). 14 (9.8%) of 143 gonococci isolated were of A/S class Pro(-/)Brpyut, indistinguishable on PFGE and negative in cppB gene based, but not other, NAA. This cluster represents a temporal and geographic expansion of a gonococcal subtype lacking the cppB gene with consequent loss of sensitivity of NAA dependent on amplification of this target. Gonococci lacking the cppB gene have in the past been more commonly associated with the PAU-/PCU- auxotype, a gonococcal subtype hitherto infrequently encountered in Australia. NAA based on the cppB gene as a target may produce false positive as well as false negative results. This suggests that unless there is continuing comparison with culture to show their utility, cppB gene based NAA should be regarded as suboptimal for use either as a diagnostic or supplemental assay for diagnosis of gonorrhoea, and NAA with alternative amplification targets should be substituted.

  12. UNCLES: method for the identification of genes differentially consistently co-expressed in a specific subset of datasets.

    PubMed

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J; Nandi, Asoke K

    2015-06-04

    Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. 
Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies.

  13. The first report of a Pelecaniformes defensin cluster: Characterization of β-defensin genes in the crested ibis based on BAC libraries

    PubMed Central

    Lan, Hong; Chen, Hui; Chen, Li-Cheng; Wang, Bei-Bing; Sun, Li; Ma, Mei-Ying; Fang, Sheng-Guo; Wan, Qiu-Hong

    2014-01-01

    Defensins play a key role in the innate immunity of various organisms. Detailed genomic studies of the defensin cluster have only been reported in a limited number of birds. Herein, we present the first characterization of defensins in a Pelecaniformes species, the crested ibis (Nipponia nippon), which is one of the most endangered birds in the world. We constructed bacterial artificial chromosome libraries, including a 4D-PCR library and a reverse-4D library, which provide at least 40 equivalents of this rare bird's genome. A cluster including 14 β-defensin loci within 129 kb was assigned to chromosome 3 by FISH, and one gene duplication of AvBD1 was found. The ibis defensin genes are characterized by multiform gene organization ranging from two to four exons through extensive exon fusion. Splicing signal variations and alternative splice variants were also found. Comparative analysis of four bird species identified one common and multiple species-specific duplications, which might be associated with high GC content. Evolutionary analysis revealed birth-and-death mode and purifying selection for avian defensin evolution, resulting in different defensin gene numbers among bird species and functional conservation within orthologous genes, respectively. Additionally, we propose various directions for further research on genetic conservation in the crested ibis. PMID:25372018

  14. Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference

    PubMed Central

    Stone, Eric A.; Ayroles, Julien F.

    2009-01-01

    In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference, clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation. PMID:19424432
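
    The modularity objective named in this abstract can be made concrete with a short sketch. It only evaluates the standard (Newman) modularity of a given partition of a weighted undirected graph; it does not implement the edge-modulation step that distinguishes MMC, and the graph, weights, and labels below are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of a weighted undirected graph.

    `edges` maps (u, v) tuples to weights, each undirected edge listed once.
    `community` maps each node to its community label.
    Q = sum over communities of
        (internal weight / total weight) - (community strength / (2 * total weight)) ** 2
    """
    total = sum(edges.values())
    internal = defaultdict(float)
    strength = defaultdict(float)
    for (u, v), w in edges.items():
        strength[community[u]] += w
        strength[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(
        internal[c] / total - (strength[c] / (2 * total)) ** 2
        for c in strength
    )


# Hypothetical gene graph: two tight triangles joined by one weak edge.
# Edge weights could be, for example, absolute expression correlations.
edges = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 0.1,
}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

    Placing the two triangles in separate communities scores higher than merging everything into one community, which is the behavior a modularity-maximizing clustering exploits.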

  15. Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters.

    PubMed

    Hensman, James; Lawrence, Neil D; Rattray, Magnus

    2013-08-20

    Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in Python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.

  16. Serogroup, virulence, and molecular traits of Vibrio parahaemolyticus isolated from clinical and cockle sources in northeastern Thailand.

    PubMed

    Mala, Wanida; Alam, Munirul; Angkititrakul, Sunpetch; Wongwajana, Suwin; Lulitanond, Viraphong; Huttayananont, Sriwanna; Kaewkes, Wanlop; Faksri, Kiatichai; Chomvarin, Chariya

    2016-04-01

    Vibrio parahaemolyticus is responsible for seafood-borne gastroenteritis worldwide. Isolates of V. parahaemolyticus from clinical samples (n=74) and cockles (Anadara granosa) (n=74) in Thailand were analyzed by serotyping, determination of virulence and related marker genes present, response to antimicrobial agents, and genetic relatedness. Serological analysis revealed 31 different serotypes, 10 of which occurred among both clinical and cockle samples. The clinical isolates commonly included the pandemic serogroup O3:K6, while a few of the cockle isolates exhibited likely pandemic serovariants such as O3:KUT and O4:KUT, but not O3:K6. The pandemic (orf8 gene-positive) strains were more frequently found among clinical isolates (78.4%) than cockle isolates (28.4%) (p<0.001). Likewise, the virulence and related marker genes were more commonly detected among clinical than cockle isolates; i.e., tdh gene (93.2% versus 29.7%), vcrD2 (97.3% versus 23.0%), vopB2 (89.2% versus 13.5%), vopT (98.6% versus 36.5%) (all p<0.001) and trh (10.8% versus 1.4%) (p<0.05). Pulsed-field gel electrophoresis of NotI-digested genomic DNA of 41 randomly selected V. parahaemolyticus isolates representing different serotypes produced 33 pulsotypes that formed 5 different clusters (clonal complexes) (A-E) in a dendrogram. Vibrio parahaemolyticus O3:K6 and likely related pandemic serotypes were especially common among the numerous clinical isolates in cluster C, suggesting a close clonal link among many of these isolates. Most clinical and cockle isolates were resistant to ampicillin. This study indicates that O3:K6 and its likely serovariants based on the PFGE clusters, are causative agents. Seafoods such as cockles potentially serve as a source of virulent V. parahaemolyticus, but further work is required to identify possible additional sources. Copyright © 2016. Published by Elsevier B.V.
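
    The group comparisons reported in this abstract can be checked approximately with a small sketch. The abstract does not say which statistical test was used, so Fisher's exact test is an assumption here, and the counts are back-calculated from the reported percentages with n = 74 per group (for example, 58/74 is 78.4% and 21/74 is 28.4%).

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Exact probability that the top-left cell equals k.
        return Fraction(comb(row1, k) * comb(row2, col1 - k), denom)

    observed = prob(a)
    low = max(0, col1 - row2)
    high = min(row1, col1)
    p = sum(prob(k) for k in range(low, high + 1) if prob(k) <= observed)
    return float(p)


# Rows: clinical isolates, cockle isolates; columns: positive, negative.
# orf8-positive: 58/74 clinical versus 21/74 cockle (back-calculated).
p_orf8 = fisher_exact_two_sided(58, 16, 21, 53)
# trh-positive: 8/74 clinical versus 1/74 cockle (back-calculated).
p_trh = fisher_exact_two_sided(8, 66, 1, 73)
```

    Exact fractions are used so that tables tied with the observed probability are counted reliably; the result is converted to a float only at the end.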

  17. A brain-specific gene cluster isolated from the region of the mouse obesity locus is expressed in the adult hypothalamus and during mouse development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Laig-Webster, M.; Lim, M.E.; Chehab, F.F.

    1994-09-01

    The molecular defect underlying an autosomal recessive form of genetic obesity in a classical mouse model C57 BL/6J-ob/ob has not yet been elucidated. Whereas metabolic and physiological disturbances such as diabetes and hypertension are associated with obesity, the site of expression and the nature of the primary lesion responsible for this cascade of events remains elusive. Our efforts aimed at the positional cloning of the ob gene by YAC contig mapping and gene identification have resulted in the cloning of a brain-specific gene cluster from the ob critical region. The expression of this gene cluster is remarkably complex owing to the multitude of brain-specific mRNA transcripts detected on Northern blots. cDNA cloning of these transcripts suggests that they are expressed from different genes as well as by alternate splicing mechanisms. Furthermore, the genomic organization of the cluster appears to consist of at least two identical promoters displaying CpG islands characteristic of housekeeping genes, yet clearly involving tissue-specific expression. Sense and anti-sense synthetic RNA probes were derived from a common DNA sequence on 3 cDNA clones and hybridized to 8-16 days mouse embryonic stages and mouse adult brain sections. Expression in development was noticeable as of the 11th day of gestation and confined to the central nervous system mainly in the telencephalon and spinal cord. Coronal and sagittal sections of the adult mouse brain showed expression only in 3 different regions of the brain stem. In situ hybridization to mouse hypothalamus sections revealed the presence of a localized and specialized group of cells expressing high levels of mRNA, suggesting that this gene cluster may also be involved in the regulation of hypothalamic activities. 
The hypothalamus has long been hypothesized as a primary candidate tissue for the expression of the obesity gene mainly because of its well-established role in the regulation of energy metabolism and food intake.

  18. Analysis of the Nicotiana tabacum Stigma/Style Transcriptome Reveals Gene Expression Differences between Wet and Dry Stigma Species

    PubMed Central

    Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.

    2009-01-01

    The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150

  19. Fractal Clustering and Knowledge-driven Validation Assessment for Gene Expression Profiling.

    PubMed

    Wang, Lu-Yong; Balasubramanian, Ammaiappan; Chakraborty, Amit; Comaniciu, Dorin

    2005-01-01

    DNA microarray experiments generate a substantial amount of information about global gene expression. Gene expression profiles can be represented as points in multi-dimensional space. It is essential to identify relevant groups of genes in biomedical research. Clustering is helpful in pattern recognition in gene expression profiles. A number of clustering techniques have been introduced. However, these traditional methods mainly utilize shape-based assumptions or some distance metric to cluster the points in multi-dimensional linear Euclidean space. Their results show poor consistency with the functional annotation of genes in previous validation studies. From a different perspective, we propose a fractal clustering method to cluster genes using the intrinsic (fractal) dimension from modern geometry. This method clusters points in such a way that points in the same cluster are more self-affine among themselves than to the points in other clusters. We assess this method using annotation-based validation assessment for gene clusters. It shows that this method is superior to other traditional methods in identifying functionally related gene groups.

  20. An epigenetic state associated with areas of gene duplication

    PubMed Central

    Gimelbrant, Alexander A.; Chess, Andrew

    2006-01-01

    Asynchronous DNA replication is an epigenetically determined feature found in all cases of monoallelic expression, including genomic imprinting, X-inactivation, and random monoallelic expression of autosomal genes such as immunoglobulins and olfactory receptor genes. Most genes of the latter class were identified in experiments focused on genes functioning in the chemosensory and immune systems. We performed an unbiased survey of asynchronous replication in the mouse genome, excluding known asynchronously replicated genes. Fully 10% (eight of 80) of the genes tested exhibited asynchronous replication. A common feature of the newly identified asynchronously replicated areas is their proximity to areas of tandem gene duplication. Testing of other clustered areas supported the idea that such regions are enriched with asynchronously replicated genes. PMID:16687731

  1. Revealing common disease mechanisms shared by tumors of different tissues of origin through semantic representation of genomic alterations and topic modeling.

    PubMed

    Chen, Vicky; Paisley, John; Lu, Xinghua

    2017-03-14

    Cancer is a complex disease driven by somatic genomic alterations (SGAs) that perturb signaling pathways and consequently cellular function. Identifying patterns of pathway perturbations would provide insights into common disease mechanisms shared among tumors, which is important for guiding treatment and predicting outcome. However, identifying perturbed pathways is challenging, because different tumors can have the same perturbed pathways that are perturbed by different SGAs. Here, we designed novel semantic representations that capture the functional similarity of distinct SGAs perturbing a common pathway in different tumors. Combining this representation with topic modeling would allow us to identify patterns in altered signaling pathways. We represented each gene with a vector of words describing its function, and we represented the SGAs of a tumor as a text document by pooling the words representing individual SGAs. We applied the nested hierarchical Dirichlet process (nHDP) model to a collection of tumors of 5 cancer types from TCGA. We identified topics (consisting of co-occurring words) representing the common functional themes of different SGAs. Tumors were clustered based on their topic associations, such that each cluster consists of tumors sharing common functional themes. The resulting clusters contained mixtures of cancer types, which indicates that different cancer types can share disease mechanisms. Survival analysis based on the clusters revealed significant differences in survival among the tumors of the same cancer type that were assigned to different clusters. The results indicate that applying topic modeling to semantic representations of tumors identifies patterns in the combinations of altered functional pathways in cancer.
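
    The document-building step described in this abstract (each gene represented by words describing its function, each tumor represented by pooling the words of its altered genes) can be illustrated with a minimal sketch. The gene-to-word table and the function name `tumor_document` are hypothetical and chosen only for illustration; the nHDP topic model itself is not implemented here.

```python
from collections import Counter

# Hypothetical gene -> function-word annotations (illustrative only,
# not taken from the paper or from any annotation database).
GENE_WORDS = {
    "TP53": ["apoptosis", "cell_cycle", "dna_repair"],
    "PTEN": ["pi3k_signaling", "apoptosis"],
    "PIK3CA": ["pi3k_signaling", "kinase"],
}


def tumor_document(altered_genes, gene_words=GENE_WORDS):
    """Pool the words of every altered gene into one bag-of-words 'document'."""
    doc = Counter()
    for gene in altered_genes:
        # Genes without annotations contribute no words.
        doc.update(gene_words.get(gene, []))
    return doc


# Two tumors with different altered genes can still share functional words.
tumor_a = tumor_document(["TP53", "PTEN"])
tumor_b = tumor_document(["PIK3CA"])
shared_words = set(tumor_a) & set(tumor_b)
```

    In this toy case the two tumors have no altered gene in common, yet their documents overlap in one word, which is the kind of functional similarity a topic model could then pick up.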

  2. Discovering semantic features in the literature: a foundation for building functional associations

    PubMed Central

    Chagoyen, Monica; Carmona-Saez, Pedro; Shatkay, Hagit; Carazo, Jose M; Pascual-Montano, Alberto

    2006-01-01

    Background Experimental techniques such as DNA microarray, serial analysis of gene expression (SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins at different levels. As in any other experimental approach, it is necessary to analyze these data in the context of previously known information about the biological entities under study. The literature is a particularly valuable source of information for experiment validation and interpretation. Therefore, the development of automated text mining tools to assist in such interpretation is one of the main challenges in current bioinformatics research. Results We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, utilized in gene/protein classification or can be even combined with experimental measurements. Semantic features can be used by researchers to facilitate the understanding of the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis, capable of identifying local patterns that characterize a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide coherent justification for this clustering into subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes. Conclusion The presented method can create literature profiles from documents relevant to sets of genes. The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data. PMID:16438716
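
    The idea of representing genes as additive, non-negative combinations of features can be sketched with a small NMF routine. This is a generic illustration using the well-known multiplicative update rules for the squared-error objective, written in plain Python; it is not the authors' implementation, and the function names (`nmf`, `frobenius_error`) and the toy matrix are assumptions made for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (genes x terms) as W @ H.

    W (genes x rank) holds each gene's weights on the features and
    H (rank x terms) holds each feature's weights on the terms.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual V - W @ H."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw)) ** 0.5


# Toy gene-by-term count matrix with two obvious blocks of co-occurring terms.
toy_v = [[2, 2, 0, 0], [1, 1, 0, 0], [0, 0, 3, 3], [0, 0, 1, 1]]
toy_w, toy_h = nmf(toy_v, rank=2)
```

    Because both factors stay non-negative, each row of `toy_w` can be read as how strongly a gene loads on each feature, which is what makes the additive interpretation possible.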

  3. Algorithms of maximum likelihood data clustering with applications

    NASA Astrophysics Data System (ADS)

    Giada, Lorenzo; Marsili, Matteo

    2002-12-01

    We address the problem of data clustering by introducing an unsupervised, parameter-free approach based on the maximum likelihood principle. Starting from the observation that data sets belonging to the same cluster share common information, we construct an expression for the likelihood of any possible cluster structure. The likelihood in turn depends only on the Pearson correlation coefficients of the data. We discuss clustering algorithms that provide a fast and reliable approximation to maximum likelihood configurations. Compared to standard clustering methods, our approach has the advantages that (i) it is parameter free, (ii) the number of clusters need not be fixed in advance and (iii) the interpretation of the results is transparent. In order to test our approach and compare it with standard clustering algorithms, we analyze two very different data sets: time series of financial market returns and gene expression data. We find that different maximization algorithms produce similar cluster structures whereas the outcome of standard algorithms has a much wider variability.

  4. Molecular evidence of Burkholderia pseudomallei genotypes based on geographical distribution.

    PubMed

    Zulkefli, Noorfatin Jihan; Mariappan, Vanitha; Vellasamy, Kumutha Malar; Chong, Chun Wie; Thong, Kwai Lin; Ponnampalavanar, Sasheela; Vadivelu, Jamuna; Teh, Cindy Shuan Ju

    2016-01-01

    Background. Central intermediary metabolism (CIM) in bacteria is defined as a set of metabolic biochemical reactions within a cell, which is essential for the cell to survive in response to environmental perturbations. The genes associated with CIM are commonly found in both pathogenic and non-pathogenic strains. As these genes are involved in vital metabolic processes of bacteria, we explored the efficiency of the genes in genotypic characterization of Burkholderia pseudomallei isolates, compared with the established pulsed-field gel electrophoresis (PFGE) and multilocus sequence typing (MLST) schemes. Methods. Nine previously sequenced B. pseudomallei isolates from Malaysia were characterized by PFGE, MLST and CIM genes. The isolates were later compared to the other 39 B. pseudomallei strains, retrieved from GenBank using both MLST and sequence analysis of CIM genes. UniFrac and hierarchical clustering analyses were performed using the results generated by both MLST and sequence analysis of CIM genes. Results. Genetic relatedness of nine Malaysian B. pseudomallei isolates and the other 39 strains was investigated. The nine Malaysian isolates were subtyped into six PFGE profiles, four MLST profiles and five sequence types based on CIM genes alignment. All methods demonstrated the clonality of OB and CB as well as CMS and THE. However, PFGE showed less than 70% similarity between a pair of morphology variants, OS and OB. In contrast, OS was identical to the soil isolate, MARAN. To have a better understanding of the genetic diversity of B. pseudomallei worldwide, we further aligned the sequences of genes used in MLST and genes associated with CIM for the nine Malaysian isolates and 39 B. pseudomallei strains from NCBI database. Overall, based on the CIM genes, the strains were subtyped into 33 profiles where the majority of the strains from Asian countries were clustered together. On the other hand, MLST resolved the isolates into 31 profiles which formed three clusters. Hierarchical clustering using UniFrac distance suggested that the isolates from Australia were genetically distinct from the Asian isolates. Nevertheless, statistically significant differences were detected between isolates from Malaysia, Thailand and Australia. Discussion. Overall, PFGE showed higher discriminative power in clustering the nine Malaysian B. pseudomallei isolates and indicated its suitability for localized epidemiological study. Compared to MLST, CIM genes showed higher resolution in distinguishing those non-related strains and better clustering of strains from different geographical regions. A closer genetic relatedness of Malaysian isolates with all Asian strains in comparison to Australian strains was observed. This finding was supported by UniFrac analysis which resulted in geographical segregation between Australia and the Asian countries.

  5. Multiconstrained gene clustering based on generalized projections

    PubMed Central

    2010-01-01

    Background Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints for an optimal clustering solution still remains an unsolved problem. Results We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different natures without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence in more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions The POCS-based MGC method can successfully combine multiple constraints of different natures for gene clustering. Also, the proposed GLL is an effective performance measure for the soft clustering solutions. PMID:20356386
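
    The core POCS idea mentioned in this abstract, repeatedly projecting a candidate solution onto each constraint set until it lies in their intersection, can be sketched generically. The two constraint sets below (a box and a sum-to-one hyperplane, applied to one gene's soft memberships in three clusters) are assumptions chosen for illustration and are not the constraint sets used by the authors.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Project onto the box lo <= x_i <= hi by clipping each coordinate."""
    return [min(max(v, lo), hi) for v in x]


def project_sum_to_one(x):
    """Project onto the hyperplane sum(x) == 1 by shifting all coordinates equally."""
    shift = (sum(x) - 1.0) / len(x)
    return [v - shift for v in x]


def pocs(x, projections, iters=100):
    """Cycle through the projections a fixed number of times."""
    for _ in range(iters):
        for project in projections:
            x = project(x)
    return x


# Hypothetical starting memberships of one gene in three clusters; they
# violate both constraints (one value is negative and the sum is not 1).
membership = pocs([0.9, 0.8, -0.3], [project_sum_to_one, project_box])
```

    Both sets here are convex and their intersection is non-empty, which is the setting in which alternating projections are expected to settle on a point satisfying every constraint.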

  6. Identifying pathogenic processes by integrating microarray data with prior knowledge

    PubMed Central

    2014-01-01

    Background It is of great importance to identify molecular processes and pathways that are involved in disease etiology. Although there has been an extensive use of various high-throughput methods for this task, pathogenic pathways are still not completely understood. Often the set of genes or proteins identified as altered in genome-wide screens show a poor overlap with canonical disease pathways. These findings are difficult to interpret, yet crucial in order to improve the understanding of the molecular processes underlying the disease progression. We present a novel method for identifying groups of connected molecules from a set of differentially expressed genes. These groups represent functional modules sharing common cellular function and involve signaling and regulatory events. Specifically, our method makes use of Bayesian statistics to identify groups of co-regulated genes based on the microarray data, where external information about molecular interactions and connections are used as priors in the group assignments. Markov chain Monte Carlo sampling is used to search for the most reliable grouping. Results Simulation results showed that the method improved the ability of identifying correct groups compared to traditional clustering, especially for small sample sizes. Applied to a microarray heart failure dataset the method found one large cluster with several genes important for the structure of the extracellular matrix and a smaller group with many genes involved in carbohydrate metabolism. The method was also applied to a microarray dataset on melanoma cancer patients with or without metastasis, where the main cluster was dominated by genes related to keratinocyte differentiation. Conclusion Our method found clusters overlapping with known pathogenic processes, but also pointed to new connections extending beyond the classical pathways. PMID:24758699

  7. Variation in Fumonisin and Ochratoxin Production Associated with Differences in Biosynthetic Gene Content in Aspergillus niger and A. welwitschiae Isolates from Multiple Crop and Geographic Origins

    PubMed Central

    Susca, Antonia; Proctor, Robert H.; Morelli, Massimiliano; Haidukowski, Miriam; Gallo, Antonia; Logrieco, Antonio F.; Moretti, Antonio

    2016-01-01

    The fungi Aspergillus niger and A. welwitschiae are morphologically indistinguishable species used for industrial fermentation and for food and beverage production. The fungi also occur widely on food crops. Concerns about their safety have arisen with the discovery that some isolates of both species produce fumonisin (FB) and ochratoxin A (OTA) mycotoxins. Here, we examined FB and OTA production as well as the presence of genes responsible for synthesis of the mycotoxins in a collection of 92 A. niger/A. welwitschiae isolates from multiple crop and geographic origins. The results indicate that (i) isolates of both species differed in ability to produce the mycotoxins; (ii) FB-nonproducing isolates of A. niger had an intact fumonisin biosynthetic gene (fum) cluster; (iii) FB-nonproducing isolates of A. welwitschiae exhibited multiple patterns of fum gene deletion; and (iv) OTA-nonproducing isolates of both species lacked the ochratoxin A biosynthetic gene (ota) cluster. Analysis of genome sequence data revealed a single pattern of ota gene deletion in the two species. Phylogenetic analysis suggests that the simplest explanation for this is that ota cluster deletion occurred in a common ancestor of A. niger and A. welwitschiae, and subsequently both the intact and deleted cluster were retained as alternate alleles during divergence of the ancestor into descendant species. Finally, comparison of results from this and previous studies indicates that a majority of A. niger isolates and a minority of A. welwitschiae isolates can produce FBs, whereas a minority of isolates of both species produce OTA. The comparison also suggested that the relative abundance of each species and frequency of FB/OTA-producing isolates can vary with crop and/or geographic origin. PMID:27667988

  9. Effects of multiple founder populations on spatial genetic structure of reintroduced American martens.

    PubMed

    Williams, Bronwyn W; Scribner, Kim T

    2010-01-01

    Reintroductions and translocations are increasingly used to repatriate or increase probabilities of persistence for animal and plant species. Genetic and demographic characteristics of founding individuals and suitability of habitat at release sites are commonly believed to affect the success of these conservation programs. Genetic divergence among multiple source populations of American martens (Martes americana) and well documented introduction histories permitted analyses of post-introduction dispersion from release sites and development of genetic clusters in the Upper Peninsula (UP) of Michigan <50 years following release. Location and size of spatial genetic clusters and measures of individual-based autocorrelation were inferred using 11 microsatellite loci. We identified three genetic clusters in geographic proximity to original release locations. Estimated distances of effective gene flow based on spatial autocorrelation varied greatly among genetic clusters (30-90 km). Spatial contiguity of genetic clusters has been largely maintained with evidence for admixture primarily in localized regions, suggesting recent contact or locally retarded rates of gene flow. Data provide guidance for future studies of the effects of permeabilities of different land-cover and land-use features to dispersal and of other biotic and environmental factors that may contribute to the colonization process and development of spatial genetic associations.

  10. Biosynthesis of anatoxin-a and analogues (anatoxins) in cyanobacteria.

    PubMed

    Méjean, Annick; Paci, Guillaume; Gautier, Valérie; Ploux, Olivier

    2014-12-01

    Freshwater cyanobacteria produce secondary metabolites that are toxic to humans and animals, the so-called cyanotoxins. Among them, anatoxin-a and homoanatoxin-a are potent neurotoxins that are agonists of the nicotinic acetylcholine receptor. These alkaloids provoke a rapid death if ingested at low doses. Recently, the cluster of genes responsible for the biosynthesis of these toxins, the ana cluster, has been identified in Oscillatoria sp. PCC 6506, and a biosynthetic pathway was proposed. This biosynthesis was reconstituted in vitro using purified enzymes, confirming the predicted pathway. One of the enzymes, AnaB, a prolyl-acyl carrier protein oxidase, was crystallized and its three-dimensional structure solved, confirming its reaction mechanism. Three other ana clusters have now been identified and sequenced in other cyanobacteria. These clusters show similarities and some differences, suggesting a common evolutionary origin. In particular, the cluster from Cylindrospermum stagnale PCC 7417 possesses an extra gene coding for an F420-dependent oxidoreductase that is likely involved in the biosynthesis of dihydroanatoxin-a. This review summarizes all these new data and discusses them in relation to the production of anatoxins in the environment. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Motif-independent prediction of a secondary metabolism gene cluster using comparative genomics: application to sequenced genomes of Aspergillus and ten other filamentous fungal species.

    PubMed

    Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki

    2014-08-01

    Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  12. Horizontal transfer of a large and highly toxic secondary metabolic gene cluster between fungi.

    PubMed

    Slot, Jason C; Rokas, Antonis

    2011-01-25

    Genes involved in intermediary and secondary metabolism in fungi are frequently physically linked or clustered. For example, in Aspergillus nidulans the entire pathway for the production of sterigmatocystin (ST), a highly toxic secondary metabolite and a precursor to the aflatoxins (AF), is located in a ∼54 kb, 23 gene cluster. We discovered that a complete ST gene cluster in Podospora anserina was horizontally transferred from Aspergillus. Phylogenetic analysis shows that most Podospora cluster genes are adjacent to or nested within Aspergillus cluster genes, although the two genera belong to different taxonomic classes. Furthermore, the Podospora cluster is highly conserved in content, sequence, and microsynteny with the Aspergillus ST/AF clusters and its intergenic regions contain 14 putative binding sites for AflR, the transcription factor required for activation of the ST/AF biosynthetic genes. Examination of ∼52,000 Podospora expressed sequence tags identified transcripts for 14 genes in the cluster, with several expressed at multiple life cycle stages. The presence of putative AflR-binding sites and the expression evidence for several cluster genes, coupled with the recent independent discovery of ST production in Podospora [1], suggest that this HGT event probably resulted in a functional cluster. Given the abundance of metabolic gene clusters in fungi, our finding that one of the largest known metabolic gene clusters moved intact between species suggests that such transfers might have significantly contributed to fungal metabolic diversity. Copyright © 2011 Elsevier Ltd. All rights reserved.

  13. Genome-wide identification of physically clustered genes suggests chromatin-level co-regulation in male reproductive development in Arabidopsis thaliana

    PubMed Central

    Reimegård, Johan; Kundu, Snehangshu; Pendle, Ali; Irish, Vivian F.; Shaw, Peter

    2017-01-01

    Abstract Co-expression of physically linked genes occurs surprisingly frequently in eukaryotes. Such chromosomal clustering may confer a selective advantage as it enables coordinated gene regulation at the chromatin level. We studied the chromosomal organization of genes involved in male reproductive development in Arabidopsis thaliana. We developed an in-silico tool to identify physical clusters of co-regulated genes from gene expression data. We identified 17 clusters (96 genes) involved in stamen development and acting downstream of the transcriptional activator MS1 (MALE STERILITY 1), which contains a PHD domain associated with chromatin re-organization. The clusters exhibited little gene homology or promoter element similarity, and largely overlapped with reported repressive histone marks. Experiments on a subset of the clusters suggested a link between expression activation and chromatin conformation: qRT-PCR and mRNA in situ hybridization showed that the clustered genes were up-regulated within 48 h after MS1 induction; out of 14 chromatin-remodeling mutants studied, expression of clustered genes was consistently down-regulated only in hta9/hta11, previously associated with metabolic cluster activation; DNA fluorescence in situ hybridization confirmed that transcriptional activation of the clustered genes was correlated with open chromatin conformation. Stamen development thus appears to involve transcriptional activation of physically clustered genes through chromatin de-condensation. PMID:28175342

  14. The genes and enzymes for the catabolism of galactitol, D-tagatose, and related carbohydrates in Klebsiella oxytoca M5a1 and other enteric bacteria display convergent evolution.

    PubMed

    Shakeri-Garakani, A; Brinkkötter, A; Schmid, K; Turgut, S; Lengeler, J W

    2004-07-01

    Enteric bacteria (Enterobacteriaceae) carry on their single chromosome about 4000 genes that all strains have in common (referred to here as "obligatory genes"), and up to 1300 "facultative" genes that vary from strain to strain and from species to species. In closely related species, obligatory and facultative genes are orthologous genes that are found at similar loci. We have analyzed a set of facultative genes involved in the degradation of the carbohydrates galactitol, D-tagatose, D-galactosamine and N-acetyl-galactosamine in various pathogenic and non-pathogenic strains of these bacteria. The four carbohydrates are transported into the cell by phosphotransferase (PTS) uptake systems, and are metabolized by closely related or even identical catabolic enzymes via pathways that share several intermediates. In about 60% of Escherichia coli strains the genes for galactitol degradation map to a gat operon at 46.8 min. In strains of Salmonella enterica, Klebsiella pneumoniae and K. oxytoca, the corresponding gat genes, although orthologous to their E. coli counterparts, are found at 70.7 min, clustered in a regulon together with three tag genes for the degradation of D-tagatose, an isomer of D-fructose. In contrast, in all the E. coli strains tested, this chromosomal site was found to be occupied by an aga/kba gene cluster for the degradation of D-galactosamine and N-acetyl-galactosamine. The aga/kba and the tag genes were paralogous either to the gat cluster or to the fru genes for degradation of D-fructose. Finally, in more than 90% of strains of both Klebsiella species, and in about 5% of the E. coli strains, two operons were found at 46.8 min that comprise paralogous genes for catabolism of the isomers D-arabinitol (genes atl or dal) and ribitol (genes rtl or rbt). In these strains gat genes were invariably absent from this location, and they were totally absent in S. enterica. These results strongly indicate that these various gene clusters and metabolic pathways have been subject to convergent evolution among the Enterobacteriaceae. This apparently involved recent horizontal gene transfer and recombination events, as indicated by major chromosomal rearrangements found in their immediate vicinity.

  15. Identifying and Assessing Interesting Subgroups in a Heterogeneous Population

    PubMed Central

    Lee, Woojoo; Alexeyenko, Andrey; Pernemalm, Maria; Guegan, Justine; Dessen, Philippe; Lazar, Vladimir; Lehtiö, Janne; Pawitan, Yudi

    2015-01-01

    Biological heterogeneity is common in many diseases and it is often the reason for therapeutic failures. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability—the basis of cluster generation—is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm and to assess significance we use a false discovery rate- (FDR-) based measure. Under the heterogeneity condition the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided. PMID:26339613
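
    For readers unfamiliar with FDR-based significance measures, the sketch below shows one widely used standard procedure, the Benjamini-Hochberg adjustment of p-values. It is offered only as background: it is not the improved FDR estimator described in this abstract, and the function name and input values are assumptions made for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four candidate subgroup markers.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

    A marker would then be called significant at a chosen FDR level if its adjusted value falls below that level.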

  16. Transgenic over expression of nicotinic receptor alpha 5, alpha 3, and beta 4 subunit genes reduces ethanol intake in mice.

    PubMed

    Gallego, Xavier; Ruiz-Medina, Jessica; Valverde, Olga; Molas, Susanna; Robles, Noemí; Sabrià, Josefa; Crabbe, John C; Dierssen, Mara

    2012-05-01

    Abuse of alcohol and smoking are extensively co-morbid. Some studies suggest partial commonality of action of alcohol and nicotine mediated through nicotinic acetylcholine receptors (nAChRs). We tested mice with transgenic over expression of the alpha 5, alpha 3, beta 4 receptor subunit genes, which lie in a cluster on human chromosome 15, that were previously shown to have increased nicotine self-administration, for several responses to ethanol. Transgenic and wild-type mice did not differ in sensitivity to several acute behavioral responses to ethanol. However, transgenic mice drank less ethanol than wild-type in a two-bottle (ethanol vs. water) preference test. These results suggest a complex role for this receptor subunit gene cluster in the modulation of ethanol's as well as nicotine's effects. Copyright © 2012. Published by Elsevier Inc.

  17. Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

    PubMed Central

    Bushel, Pierre R; Wolfinger, Russell D; Gibson, Greg

    2007-01-01

    Background Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. Results We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples based on simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function with the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for histopathology categorical values in order to measure dissimilarity of the samples. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types. One from a heart disease study and another from acetaminophen (an analgesic) exposure in rat liver that causes centrilobular necrosis. 
Conclusion The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group and the heart disease samples into two groups (sick and buff denoting samples having pain type representative of angina and non-angina respectively) with an accuracy of 79%. This is on par with, or better than, the assignment accuracy of the heart disease samples by several well-known and successful clustering algorithms. Following modk-prototypes clustering of the acetaminophen-exposed samples, informative genes from the cluster prototypes were identified that are descriptive of, and phenotypically anchored to, levels of necrosis of the centrilobular region of the rat liver. The biological processes cell growth and/or maintenance, amine metabolism, and stress response were shown to discern between no and moderate levels of acetaminophen-induced centrilobular necrosis. The use of well-known and traditional measurements directly in the clustering provides some guarantee that the resulting clusters will be meaningfully interpretable. PMID:17408499
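
    The objective described in this record (squared Euclidean distance for numeric data, simple matching for categorical data, one weight per data domain, and a prototype built from means and modes) can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary keys `expr`, `chem` and `hist`, the function names, and the default weights are all assumptions made for illustration.

```python
from collections import Counter

def mixed_dissimilarity(sample, prototype, w_expr=1.0, w_chem=1.0, w_hist=1.0):
    """Weighted dissimilarity between one sample and a cluster prototype.

    Numeric domains (hypothetical keys "expr" and "chem") use squared
    Euclidean distance; the categorical domain ("hist") uses simple
    matching, i.e. a count of mismatching categories.
    """
    d_expr = sum((a - b) ** 2 for a, b in zip(sample["expr"], prototype["expr"]))
    d_chem = sum((a - b) ** 2 for a, b in zip(sample["chem"], prototype["chem"]))
    d_hist = sum(a != b for a, b in zip(sample["hist"], prototype["hist"]))
    return w_expr * d_expr + w_chem * d_chem + w_hist * d_hist

def prototype_of(samples):
    """Prototype = mean of each numeric feature, mode of each categorical one."""
    n = len(samples)

    def mean_cols(key):
        return [sum(col) / n for col in zip(*(s[key] for s in samples))]

    def mode_cols(key):
        return [Counter(col).most_common(1)[0][0]
                for col in zip(*(s[key] for s in samples))]

    return {"expr": mean_cols("expr"), "chem": mean_cols("chem"),
            "hist": mode_cols("hist")}

# Toy samples (invented values, not data from the study).
a = {"expr": [1.0, 2.0], "chem": [10.0], "hist": ["none"]}
b = {"expr": [3.0, 2.0], "chem": [12.0], "hist": ["none"]}
c = {"expr": [2.0, 5.0], "chem": [11.0], "hist": ["moderate"]}
proto = prototype_of([a, b, c])
```

    Raising `w_hist` relative to the other weights makes a histopathology mismatch count for more in the assignment of a sample to a cluster, which is the role the abstract gives to the separate weighting terms.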

  18. Development of phoH as a Novel Signature Gene for Assessing Marine Phage Diversity▿

    PubMed Central

    Goldsmith, Dawn B.; Crosti, Giuseppe; Dwivedi, Bhakti; McDaniel, Lauren D.; Varsani, Arvind; Suttle, Curtis A.; Weinbauer, Markus G.; Sandaa, Ruth-Anne; Breitbart, Mya

    2011-01-01

    Phages play a key role in the marine environment by regulating the transfer of energy between trophic levels and influencing global carbon and nutrient cycles. The diversity of marine phage communities remains difficult to characterize because of the lack of a signature gene common to all phages. Recent studies have demonstrated the presence of host-derived auxiliary metabolic genes in phage genomes, such as those belonging to the Pho regulon, which regulates phosphate uptake and metabolism under low-phosphate conditions. Among the completely sequenced phage genomes in GenBank, this study identified Pho regulon genes in nearly 40% of the marine phage genomes, while only 4% of nonmarine phage genomes contained these genes. While several Pho regulon genes were identified, phoH was the most prevalent, appearing in 42 out of 602 completely sequenced phage genomes. Phylogenetic analysis demonstrated that phage phoH sequences formed a cluster distinct from those of their bacterial hosts. PCR primers designed to amplify a region of the phoH gene were used to determine the diversity of phage phoH sequences throughout a depth profile in the Sargasso Sea and at six locations worldwide. phoH was present at all sites examined, and a high diversity of phoH sequences was recovered. Most phoH sequences belonged to clusters without any cultured representatives. Each depth and geographic location had a distinct phoH composition, although most phoH clusters were recovered from multiple sites. Overall, phoH is an effective signature gene for examining phage diversity in the marine environment. PMID:21926220

  19. Diametrical clustering for identifying anti-correlated gene clusters.

    PubMed

    Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

    2003-09-01

    Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
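
    The two alternating steps named in this abstract can be sketched as follows. This is a hedged illustration, not the published code: it assumes profiles are centered and scaled to unit norm, that a gene is assigned to the prototype with the largest squared dot product (so strongly correlated and strongly anti-correlated genes land together), and that the dominant singular vector is approximated by power iteration. All function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center_and_normalize(v):
    """Subtract the mean and scale to unit length (assumed preprocessing)."""
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(dot(centered, centered))
    return [x / norm for x in centered] if norm > 0 else centered

def dominant_singular_vector(rows, iters=100):
    """Approximate the dominant right singular vector by power iteration on X^T X."""
    v = [1.0 + 0.1 * i for i in range(len(rows[0]))]  # fixed, arbitrary start
    for _ in range(iters):
        proj = [dot(r, v) for r in rows]                      # X v
        w = [sum(p * r[j] for p, r in zip(proj, rows))        # X^T (X v)
             for j in range(len(v))]
        norm = math.sqrt(dot(w, w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def diametrical_clustering(profiles, init_labels, k, max_iter=50):
    data = [center_and_normalize(p) for p in profiles]
    labels = list(init_labels)
    for _ in range(max_iter):
        # Step (ii): one prototype per non-empty cluster.
        protos = {}
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:
                protos[c] = dominant_singular_vector(members)
        # Step (i): re-partition by squared similarity to each prototype.
        new_labels = []
        for x in data:
            best = max(protos, key=lambda c: dot(x, protos[c]) ** 2)
            new_labels.append(best)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Toy profiles: genes 0/1 are mirror images, as are genes 2/3.
profiles = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 1, 3], [3, 1, 3, 1]]
labels = diametrical_clustering(profiles, init_labels=[0, 1, 1, 1], k=2)
```

    Squaring the dot product is what lets a gene and its mirror image score identically against a prototype; an ordinary correlation-based assignment would separate them.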

  20. Old meets new: using interspecies interactions to detect secondary metabolite production in actinomycetes.

    PubMed

    Seyedsayamdost, Mohammad R; Traxler, Matthew F; Clardy, Jon; Kolter, Roberto

    2012-01-01

    Actinomycetes, a group of filamentous, Gram-positive bacteria, have long been a remarkable source of useful therapeutics. Recent genome sequencing and transcriptomic studies have shown that these bacteria, responsible for half of the clinically used antibiotics, also harbor a large reservoir of gene clusters, which have the potential to produce novel secreted small molecules. Yet, many of these clusters are not expressed under common culture conditions. One reason why these clusters have not been linked to a secreted small molecule lies in the way that actinomycetes have typically been studied: as pure cultures in nutrient-rich media that do not mimic the complex environments in which these bacteria evolved. New methods based on multispecies culture conditions provide an alternative approach to investigating the products of these gene clusters. We have recently implemented binary interspecies interaction assays to mine for new secondary metabolites and to study the underlying biology of interactinomycete interactions. Here, we describe the detailed biological and chemical methods comprising these studies. Copyright © 2012 Elsevier Inc. All rights reserved.

  1. Differentiating Botulinum Neurotoxin-Producing Clostridia with a Simple, Multiplex PCR Assay.

    PubMed

    Williamson, Charles H D; Vazquez, Adam J; Hill, Karen; Smith, Theresa J; Nottingham, Roxanne; Stone, Nathan E; Sobek, Colin J; Cocking, Jill H; Fernández, Rafael A; Caballero, Patricia A; Leiser, Owen P; Keim, Paul; Sahl, Jason W

    2017-09-15

    Diverse members of the genus Clostridium produce botulinum neurotoxins (BoNTs), which cause a flaccid paralysis known as botulism. While multiple species of clostridia produce BoNTs, the majority of human botulism cases have been attributed to Clostridium botulinum groups I and II. Recent comparative genomic studies have demonstrated the genomic diversity within these BoNT-producing species. This report introduces a multiplex PCR assay for differentiating members of C. botulinum group I, C. sporogenes, and two major subgroups within C. botulinum group II. Coding region sequences unique to each of the four species/subgroups were identified by in silico analyses of thousands of genome assemblies, and PCR primers were designed to amplify each marker. The resulting multiplex PCR assay correctly assigned 41 tested isolates to the appropriate species or subgroup. A separate PCR assay to determine the presence of the ntnh gene (a gene associated with the botulinum neurotoxin gene cluster) was developed and validated. The ntnh gene PCR assay provides information about the presence or absence of the botulinum neurotoxin gene cluster and the type of gene cluster present (ha positive [ha+] or orfX+). The increased availability of whole-genome sequence data and comparative genomic tools enabled the design of these assays, which provide valuable information for characterizing BoNT-producing clostridia. The PCR assays are rapid, inexpensive tests that can be applied to a variety of sample types to assign isolates to species/subgroups and to detect clostridia with botulinum neurotoxin gene (bont) clusters. IMPORTANCE Diverse clostridia produce the botulinum neurotoxin, one of the most potent known neurotoxins. In this study, a multiplex PCR assay was developed to differentiate clostridia that are most commonly isolated in connection with human botulism cases: C. botulinum group I, C. sporogenes, and two major subgroups within C. botulinum group II. 
Since BoNT-producing and nontoxigenic isolates can be found in each species, a PCR assay to determine the presence of the ntnh gene, which is a universally present component of bont gene clusters, and to provide information about the type (ha+ or orfX+) of bont gene cluster present in a sample was also developed. The PCR assays provide simple, rapid, and inexpensive tools for screening uncharacterized isolates from clinical or environmental samples. The information provided by these assays can inform epidemiological studies, aid with identifying mixtures of isolates and unknown isolates in culture collections, and confirm the presence of bacteria of interest. Copyright © 2017 Williamson et al.

  2. Breakup of a homeobox cluster after genome duplication in teleosts

    PubMed Central

    Mulley, John F.; Chiu, Chi-hua; Holland, Peter W. H.

    2006-01-01

    Several families of homeobox genes are arranged in genomic clusters in metazoan genomes, including the Hox, ParaHox, NK, Rhox, and Iroquois gene clusters. The selective pressures responsible for maintenance of these gene clusters are poorly understood. The ParaHox gene cluster is evolutionarily conserved between amphioxus and human but is fragmented in teleost fishes. We show that two basal ray-finned fish, Polypterus and Amia, each possess an intact ParaHox cluster; this implies that the selective pressure maintaining clustering was lost after whole-genome duplication in teleosts. Cluster breakup is because of gene loss, not transposition or inversion, and the total number of ParaHox genes is the same in teleosts, human, mouse, and frog. We propose that this homeobox gene cluster is held together in chordates by the existence of interdigitated control regions that could be separated after locus duplication in the teleost fish. PMID:16801555

  3. Novel species including Mycobacterium fukienense sp. is found from tuberculosis patients in Fujian Province, China, using phylogenetic analysis of Mycobacterium chelonae/abscessus complex.

    PubMed

    Zhang, Yuan Yuan; Li, Yan Bing; Huang, Ming Xiang; Zhao, Xiu Qin; Zhang, Li Shui; Liu, Wen En; Wan, Kang Lin

    2013-11-01

    To identify the novel species 'Mycobacterium fukienense' sp. nov of Mycobacterium chelonae/abscessus complex from tuberculosis patients in Fujian Province, China. Five of 27 clinical Mycobacterium isolates (Cls) were previously identified as M. chelonae/abscessus complex by sequencing the hsp65, rpoB, 16S-23S rRNA internal transcribed spacer region (its), recA and sodA house-keeping genes commonly used to describe the molecular characteristics of Mycobacterium. Clinical Mycobacterium isolates were classified according to the gene sequence using a clustering analysis program. Sequence similarity within clusters and diversity between clusters were analyzed. The 5 isolates were identified with distinct sequences exhibiting 99.8% homology in the hsp65 gene. However, a complete lack of homology was observed among the sequences of the rpoB, 16S-23S rRNA internal transcribed spacer region (its), sodA, and recA genes as compared with M. abscessus. Furthermore, no match for rpoB, sodA, and recA genes was identified among the published sequences. The novel species, Mycobacterium fukienense, is identified from tuberculosis patients in Fujian Province, China, and does not belong to any existing subspecies of M. chelonae/abscessus complex. Copyright © 2013 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.

  4. Prediction of epigenetically regulated genes in breast cancer cell lines

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loss, Leandro A; Sadanandam, Anguraj; Durinck, Steffen

    Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. Our two-step dimensionality reduction compressed 90% of the original data, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. 
Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
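
    The permutation step mentioned in this record can be illustrated with a small sketch. It deliberately substitutes a Pearson correlation for the logistic regression used by the authors, whose exact model is not given here, and tests one methylation-expression pair for a negative association. The function names, the number of permutations and the seed are assumptions.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_pvalue(meth, expr, n_perm=2000, seed=0):
    """One-sided permutation p-value for a negative correlation.

    Expression values are shuffled across cell lines to build the null
    distribution; the p-value is the (smoothed) fraction of shuffles whose
    correlation is at least as negative as the observed one.
    """
    rng = random.Random(seed)
    observed = pearson(meth, expr)
    shuffled = list(expr)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(meth, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Invented values for eight hypothetical cell lines.
meth = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
expr = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
r, p = permutation_pvalue(meth, expr)
```

    Adding one to both the numerator and the denominator keeps the estimated p-value strictly above zero, a common convention when the number of permutations is finite.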

  5. A Cluster of Cuticle Protein Genes of Drosophila Melanogaster at 65a: Sequence, Structure and Evolution

    PubMed Central

    Charles, J. P.; Chihara, C.; Nejad, S.; Riddiford, L. M.

    1997-01-01

    A 36-kb genomic DNA segment of the Drosophila melanogaster genome containing 12 clustered cuticle genes has been mapped and partially sequenced. The cluster maps at 65A 5-6 on the left arm of the third chromosome, in agreement with the previously determined location of a putative cluster encompassing the genes for the third instar larval cuticle proteins LCP5, LCP6 and LCP8. This cluster is the largest cuticle gene cluster discovered to date and shows a number of surprising features that explain in part the genetic complexity of the LCP5, LCP6 and LCP8 loci. The genes encoding LCP5 and LCP8 are multiple copy genes and the presence of extensive similarity in their coding regions gives the first evidence for gene conversion in cuticle genes. In addition, five genes in the cluster are intronless. Four of these five have arisen by retroposition. The other genes in the cluster have a single intron located at an unusual location for insect cuticle genes. PMID:9383064

  6. A cryptic pigment biosynthetic pathway uncovered by heterologous expression is essential for conidial development in Pestalotiopsis fici.

    PubMed

    Zhang, Peng; Wang, Xiuna; Fan, Aili; Zheng, Yanjing; Liu, Xingzhong; Wang, Shihua; Zou, Huixi; Oakley, Berl R; Keller, Nancy P; Yin, Wen-Bing

    2017-08-01

    Spore pigmentation is very common in the fungal kingdom. The best studied pigment in fungi is melanin which coats the surface of single cell spores. What and how pigments function in a fungal species with multiple cell conidia is poorly understood. Here, we identified and deleted a polyketide synthase (PKS) gene PfmaE and showed that it is essential for multicellular conidial pigmentation and development in a plant endophytic fungus, Pestalotiopsis fici. To further characterize the melanin pathway, we utilized an advanced Aspergillus nidulans heterologous system for the expression of the PKS PfmaE and the Pfma gene cluster. By structural elucidation of the pathway metabolite scytalone in A. nidulans, we provided chemical evidence that the Pfma cluster synthesizes DHN melanin. Combining genetic deletion and combinatorial gene expression of Pfma cluster genes, we determined that the putative reductase PfmaG and the PKS are sufficient for the synthesis of scytalone. Feeding scytalone back to the P. fici ΔPfmaE mutant restored pigmentation and multicellular adherence of the conidia. These results cement a growing understanding that pigments are essential not simply for protection of spores from biotic and abiotic stresses but also for spore structural development. © 2017 John Wiley & Sons Ltd.

  7. Isolation and purification of a new kalimantacin/batumin-related polyketide antibiotic and elucidation of its biosynthesis gene cluster.

    PubMed

    Mattheus, Wesley; Gao, Ling-Jie; Herdewijn, Piet; Landuyt, Bart; Verhaegen, Jan; Masschelein, Joleen; Volckaert, Guido; Lavigne, Rob

    2010-02-26

    Kal/bat, a polyketide, isolated to high purity (>95%) is characterized by strong and selective antibacterial activity against Staphylococcus species (minimum inhibitory concentration, 0.05 microg/mL), and no resistance was observed in strains already resistant to commonly used antibiotics. The kal/bat biosynthesis gene cluster was determined to a 62 kb genomic region of Pseudomonas fluorescens BCCM_ID9359. The kal/bat gene cluster consists of 16 open reading frames (ORF), encoding a hybrid PKS-NRPS system, extended with trans-acting tailoring functions. A full model for kal/bat biosynthesis is postulated and experimentally tested by gene inactivation, structural confirmation (using NMR spectroscopy), and complementation. The structural and microbiological study of biosynthetic kal/bat analogs revealed the importance of the carbamoyl group and 17-keto group for antibacterial activity. The mechanism of self-resistance lies within the production of an inactive intermediate, which is activated in a one-step enzymatic oxidation upon export. The genetic basis and biochemical elucidation of the biosynthesis pathway of this antibiotic will facilitate rational engineering for the design of novel structures with improved activities. This makes it a promising new therapeutic option to cope with multidrug-resistant clinical infections. Copyright 2010 Elsevier Ltd. All rights reserved.

  8. The Lepidoptera Odorant Binding Protein gene family: Gene gain and loss within the GOBP/PBP complex of moths and butterflies.

    PubMed

    Vogt, Richard G; Große-Wilde, Ewald; Zhou, Jing-Jiang

    2015-07-01

    Butterflies and moths differ significantly in their daily activities: butterflies are diurnal while moths are largely nocturnal or crepuscular. This life history difference is presumably reflected in their sensory biology, and especially the balance between the use of chemical versus visual signals. Odorant Binding Proteins (OBP) are a class of insect proteins, at least some of which are thought to orchestrate the transfer of odor molecules within an olfactory sensillum (olfactory organ), between the air and odor receptor proteins (ORs) on the olfactory neurons. A Lepidoptera specific subclass of OBPs are the GOBPs and PBPs; these were the first OBPs studied and have well documented associations with olfactory sensilla. We have used the available genomes of two moths, Manduca sexta and Bombyx mori, and two butterflies, Danaus plexippus and Heliconius melpomene, to characterize the GOBP/PBP genes, attempting to identify gene orthologs and document specific gene gain and loss. First, we identified the full repertoire of OBPs in the M. sexta genome, and compared these with the full repertoire of OBPs from the other three lepidopteran genomes, the OBPs of Drosophila melanogaster and select OBPs from other Lepidoptera. We also evaluated the tissue specific expression of the M. sexta OBPs using available RNAseq databases. In the four lepidopteran species, GOBP2 and all PBPs reside in single gene clusters; in two species GOBP1 is documented to be nearby, about 100 kb from the cluster; all GOBP/PBP genes share a common gene structure indicating a common origin. As such, the GOBP/PBP genes form a gene complex. 
Our findings suggest that (1) the lepidopteran GOBP/PBP complex is a monophyletic lineage with origins deep within Lepidoptera phylogeny, (2) within this lineage PBP gene evolution is much more dynamic than GOBP gene evolution, and (3) butterflies may have lost a PBP gene that plays an important role in moth pheromone detection, correlating with a shift from olfactory (moth) to visual (butterfly) communication, at least regarding long distance mate recognition. These findings will be clarified by additional lepidopteran genomic data, but the observation that moths and butterflies share most of the PBP/GOBP genes suggests that they also share common chemosensory-based behavioral pathways. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Tissue-specific promoter utilisation of the kallikrein-related peptidase genes, KLK5 and KLK7, and cellular localisation of the encoded proteins suggest roles in exocrine pancreatic function.

    PubMed

    Dong, Ying; Matigian, Nick; Harvey, Tracey J; Samaratunga, Hemamali; Hooper, John D; Clements, Judith A

    2008-02-01

    Abstract Tissue kallikrein (kallikrein 1) was first identified in pancreas and is the namesake of the kallikrein-related peptidase (KLK) family. KLK1 and the other 14 members of the human KLK family are encoded by 15 serine protease genes clustered at chromosome 19q13.4. Our Northern blot analysis of 19 normal human tissues for expression of KLK4 to KLK15 identified pancreas as a common expression site for the gene cluster spanning KLK5 to KLK13, as well as for KLK15 which is located adjacent to KLK1. Consistent with previous reports detailing the ability of KLK genes to generate organ- and disease-specific transcripts, detailed molecular and in silico analyses indicated that KLK5 and KLK7 generate transcripts in pancreas variant from those in skin or ovary. Consistently, we identified in the promoters of these KLK genes motifs which conform with consensus binding sites for transcription factors conferring pancreatic expression. In addition, immunohistochemical analysis revealed predominant localisation of KLK5 and KLK7 in acinar cells of the exocrine pancreas, suggesting roles for these enzymes in digestion. Our data also support expression patterns derived from gene duplication events in the human KLK cluster. These findings suggest that, in addition to KLK1, other related KLK enzymes will function in the exocrine pancreas.

  10. Sexuality generates diversity in the aflatoxin gene cluster: evidence on a global scale

    USDA-ARS's Scientific Manuscript database

    The worldwide costs associated with aflatoxin monitoring and crop losses are in the hundreds of millions of dollars. Aflatoxins also account for considerable health risks, even in countries where food contamination is regulated. Aspergillus flavus and A. parasiticus are the most common agents of af...

  11. A Systems Biology Analysis Unfolds the Molecular Pathways and Networks of Two Proteobacteria in Spaceflight and Simulated Microgravity Conditions

    NASA Astrophysics Data System (ADS)

    Roy, Raktim; Phani Shilpa, P.; Bagh, Sangram

    2016-09-01

    Bacteria are important organisms for space missions due to their increased pathogenesis in microgravity that poses risks to the health of astronauts and for projected synthetic biology applications at the space station. We understand little about the effect, at the molecular systems level, of microgravity on bacteria, despite their significant incidence. In this study, we proposed a systems biology pipeline and performed an analysis on published gene expression data sets from multiple seminal studies on Pseudomonas aeruginosa and Salmonella enterica serovar Typhimurium under spaceflight and simulated microgravity conditions. By applying gene set enrichment analysis on the global gene expression data, we directly identified a large number of new, statistically significant cellular and metabolic pathways involved in response to microgravity. Alteration of metabolic pathways in microgravity has rarely been reported before, whereas in this analysis metabolic pathways are prevalent. Several of those pathways were found to be common across studies and species, indicating a common cellular response in microgravity. We clustered genes based on their expression patterns using consensus non-negative matrix factorization. The genes from different mathematically stable clusters showed protein-protein association networks with distinct biological functions, suggesting the plausible functional or regulatory network motifs in response to microgravity. The newly identified pathways and networks showed connection with increased survival of pathogens within macrophages, virulence, and antibiotic resistance in microgravity. Our work establishes a systems biology pipeline and provides an integrated insight into the effect of microgravity at the molecular systems level.

  12. Comparative transcriptome analysis of pepper (Capsicum annuum) revealed common regulons in multiple stress conditions and hormone treatments.

    PubMed

    Lee, Sanghyeob; Choi, Doil

    2013-09-01

    Global transcriptome analysis revealed common regulons for biotic/abiotic stresses, and some of these regulons encoding signaling components in both stresses were newly identified in this study. In this study, we aimed to identify plant responses to multiple stress conditions and discover the common regulons activated under a variety of stress conditions. Global transcriptome analysis revealed that salicylic acid (SA) may affect the activation of abiotic stress-responsive genes in pepper. Our data indicate that methyl jasmonate (MeJA) and ethylene (ET)-responsive genes were primarily activated by biotic stress, while abscisic acid (ABA)-responsive genes were activated under both types of stresses. We also identified differentially expressed gene (DEG) responses to specific stress conditions. Biotic stress induces more DEGs than those induced by abiotic and hormone applications. The clustering analysis using DEGs indicates that there are common regulons for biotic or abiotic stress conditions. Although SA and MeJA have an antagonistic effect on gene expression levels, SA and MeJA show a largely common regulation as compared to the regulation at the DEG expression level induced by other hormones. We also monitored the expression profiles of DEG encoding signaling components. Twenty-two percent of these were commonly expressed in both stress conditions. The importance of this study is that several genes commonly regulated by both stress conditions may have future applications for creating broadly stress-tolerant pepper plants. This study revealed that there are complex regulons in pepper plant to both biotic and abiotic stress conditions.

  13. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets

    PubMed Central

    Wernisch, Lorenz

    2017-01-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190

  14. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.

    PubMed

    Gabasova, Evelina; Reid, John; Wernisch, Lorenz

    2017-10-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation, etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However, in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to the TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.

  15. Identification of lethal cluster of genes in the yeast transcription network

    NASA Astrophysics Data System (ADS)

    Rho, K.; Jeong, H.; Kahng, B.

    2006-05-01

    Identification of essential or lethal genes would be one of the ultimate goals in drug design. Here we introduce an in silico method to select the cluster with a high population of lethal genes, called the lethal cluster, through microarray assay. We construct a gene transcription network based on the microarray expression level. Links are added one by one in descending order of the Pearson correlation coefficients between two genes. As the link density p increases, two meaningful link densities pm and ps are observed. At pm, which is smaller than the percolation threshold, the number of disconnected clusters is maximum, and the lethal genes are highly concentrated in a certain cluster that needs to be identified. Thus the deletion of all genes in that cluster could efficiently lead to a lethal, inviable mutant. This lethal cluster can be identified by an in silico method. As p increases further beyond the percolation threshold, the power law behavior in the degree distribution of a giant cluster appears at ps. We measure the degree of each gene at ps. With the information pertaining to the degrees of each gene at ps, we return to the point pm and calculate the mean degree of genes of each cluster. We find that the lethal cluster has the largest mean degree.
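
    The link-addition procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a "cluster" means a connected component with at least two genes (the abstract does not say how clusters are counted), and the toy expression profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster_counts(profiles, min_size=2):
    """Add links in descending correlation order and, after each link,
    record the number of connected components with >= min_size genes
    (the min_size cutoff is an assumption of this sketch)."""
    n = len(profiles)
    links = sorted(
        ((pearson(profiles[i], profiles[j]), i, j)
         for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    parent = list(range(n))   # union-find forest
    size = [1] * n

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    counts, n_clusters = [], 0
    for _, i, j in links:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Both old components disappear; one merged component appears.
            n_clusters -= (size[ri] >= min_size) + (size[rj] >= min_size)
            if size[ri] < size[rj]:
                ri, rj = rj, ri
            parent[rj] = ri
            size[ri] += size[rj]
            n_clusters += size[ri] >= min_size
        counts.append(n_clusters)
    return counts


# Hypothetical toy data: genes 0 and 1 rise together, genes 2 and 3 fall together.
profiles = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [9, 6, 4, 2]]
counts = cluster_counts(profiles)
```

    The index at which `counts` peaks plays the role of pm in this sketch: beyond it, further links only merge existing clusters.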

  16. Genetic interrelations in the actinomycin biosynthetic gene clusters of Streptomyces antibioticus IMRU 3720 and Streptomyces chrysomallus ATCC11523, producers of actinomycin X and actinomycin C

    PubMed Central

    Crnovčić, Ivana; Rückert, Christian; Semsary, Siamak; Lang, Manuel; Kalinowski, Jörn; Keller, Ullrich

    2017-01-01

    Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a highly similar framework as in the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached additional extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the similarities of translation products from the extra arm were all higher to the corresponding translation products of orthologue genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from previous fusion between two one-armed acm gene clusters each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore are non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm biosynthetic gene clusters lack a kynurenine-3-monooxygenase gene necessary for biosynthesis of 3-hydroxy-4-methylanthranilic acid, the building block of the Acm chromophore, which suggests participation of a genome-encoded relevant monooxygenase during Acm biosynthesis in both S. chrysomallus and S. antibioticus. PMID:28435299

  17. Genetic interrelations in the actinomycin biosynthetic gene clusters of Streptomyces antibioticus IMRU 3720 and Streptomyces chrysomallus ATCC11523, producers of actinomycin X and actinomycin C.

    PubMed

    Crnovčić, Ivana; Rückert, Christian; Semsary, Siamak; Lang, Manuel; Kalinowski, Jörn; Keller, Ullrich

    2017-01-01

    Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a highly similar framework as in the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached additional extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the similarities of translation products from the extra arm were all higher to the corresponding translation products of orthologue genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from previous fusion between two one-armed acm gene clusters each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore are non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm biosynthetic gene clusters lack a kynurenine-3-monooxygenase gene necessary for biosynthesis of 3-hydroxy-4-methylanthranilic acid, the building block of the Acm chromophore, which suggests participation of a genome-encoded relevant monooxygenase during Acm biosynthesis in both S. chrysomallus and S. antibioticus.

  18. Detecting false positive sequence homology: a machine learning approach.

    PubMed

    Fujimoto, M Stanley; Suvorov, Anton; Jensen, Nicholas O; Clement, Mark J; Bybee, Seth M

    2016-02-24

    Accurate detection of homologous relationships of biological sequences (DNA or amino acid) amongst organisms is an important and often difficult task that is essential to various evolutionary studies, ranging from building phylogenies to predicting functional gene annotations. There are many existing heuristic tools, most commonly based on bidirectional BLAST searches that are used to identify homologous genes and combine them into two fundamentally distinct classes: orthologs and paralogs. Due to only using heuristic filtering based on significance score cutoffs and having no cluster post-processing tools available, these methods can often produce multiple clusters constituting unrelated (non-homologous) sequences. Therefore sequencing data extracted from incomplete genome/transcriptome assemblies originated from low coverage sequencing or produced by de novo processes without a reference genome are susceptible to high false positive rates of homology detection. In this paper we develop biologically informative features that can be extracted from multiple sequence alignments of putative homologous genes (orthologs and paralogs) and further utilized in the context of guided experimentation to verify false positive outcomes. We demonstrate that our machine learning method, trained on both known homology clusters obtained from OrthoDB and randomly generated sequence alignments (non-homologs), successfully determines apparent false positives inferred by heuristic algorithms, especially among proteomes recovered from low-coverage RNA-seq data. Approximately 42% and 25% of the putative homologies predicted by InParanoid and HaMStR, respectively, were classified as false positives on an experimental data set. Our process increases the quality of output from other clustering algorithms by providing a novel post-processing method that is both fast and efficient at removing low quality clusters of putative homologous genes recovered by heuristic-based approaches.

  19. High-Resolution Analysis by Whole-Genome Sequencing of an International Lineage (Sequence Type 111) of Pseudomonas aeruginosa Associated with Metallo-Carbapenemases in the United Kingdom.

    PubMed

    Turton, Jane F; Wright, Laura; Underwood, Anthony; Witney, Adam A; Chan, Yuen-Ting; Al-Shahib, Ali; Arnold, Catherine; Doumith, Michel; Patel, Bharat; Planche, Timothy D; Green, Jonathan; Holliman, Richard; Woodford, Neil

    2015-08-01

    Whole-genome sequencing (WGS) was carried out on 87 isolates of sequence type 111 (ST-111) of Pseudomonas aeruginosa collected between 2005 and 2014 from 65 patients and 12 environmental isolates from 24 hospital laboratories across the United Kingdom on an Illumina HiSeq instrument. Most isolates (73) carried VIM-2, but others carried IMP-1 or IMP-13 (5) or NDM-1 (1); one isolate had VIM-2 and IMP-18, and 7 carried no metallo-beta-lactamase (MBL) gene. Single nucleotide polymorphism analysis divided the isolates into distinct clusters; the NDM-1 isolate was an outlier, and the IMP isolates and 6/7 MBL-negative isolates clustered separately from the main set of 73 VIM-2 isolates. Within the VIM-2 set, there were at least 3 distinct clusters, including a tightly clustered set of isolates from 3 hospital laboratories consistent with an outbreak from a single introduction that was quickly brought under control and a much broader set dominated by isolates from a long-running outbreak in a London hospital likely seeded from an environmental source, requiring different control measures; isolates from 7 other hospital laboratories in London and southeast England were also included. Bayesian evolutionary analysis indicated that all the isolates shared a common ancestor dating back ∼50 years (1960s), with the main VIM-2 set separating approximately 20 to 30 years ago. Accessory gene profiling revealed blocks of genes associated with particular clusters, with some having high similarity (≥95%) to bacteriophage genes. WGS of widely found international lineages such as ST-111 provides the necessary resolution to inform epidemiological investigations and intervention policies. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  20. Fungal Community Shifts in Structure and Function across a Boreal Forest Fire Chronosequence

    PubMed Central

    Santalahti, Minna; Pumpanen, Jukka; Köster, Kajar; Berninger, Frank; Raffaello, Tommaso; Jumpponen, Ari; Asiegbu, Fred O.; Heinonsalo, Jussi

    2015-01-01

    Forest fires are a common natural disturbance in forested ecosystems and have a large impact on the microbial communities in forest soils. The response of soil fungal communities to forest fire is poorly documented. Here, we investigated fungal community structure and function across a 152-year boreal forest fire chronosequence using high-throughput sequencing of the internal transcribed spacer 2 (ITS2) region and a functional gene array (GeoChip). Our results demonstrate that the boreal forest soil fungal community was most diverse soon after a fire disturbance and declined over time. The differences in the fungal communities were explained by changes in the abundance of basidiomycetes and ascomycetes. Ectomycorrhizal (ECM) fungi contributed to the increase in basidiomycete abundance over time, with the operational taxonomic units (OTUs) representing the genera Cortinarius and Piloderma dominating in abundance. Hierarchical cluster analysis by using gene signal intensity revealed that the sites with different fire histories formed separate clusters, suggesting differences in the potential to maintain essential biogeochemical soil processes. The site with the greatest biological diversity had also the most diverse genes. The genes involved in organic matter degradation in the mature forest, in which ECM fungi were the most abundant, were as common in the youngest site, in which saprotrophic fungi had a relatively higher abundance. This study provides insight into the impact of fire disturbance on soil fungal community dynamics. PMID:26341215

  1. Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters.

    PubMed

    Lukashin, A V; Fuchs, R

    2001-05-01

    Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. We describe a simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm is guaranteed to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that serves to evaluate quantitatively the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to organize the search for the optimal number of clusters simultaneously with the optimization of the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. The employment of this statistically rigorous test has shown that our algorithm places greater than 90% of genes into correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.
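
    A minimal sketch of clustering by simulated annealing is given below. It assumes the quantity being minimised is the within-cluster sum of squared distances to the centroid, and it uses a fixed number of clusters; the abstract specifies neither the cost function nor the cooling schedule, so the function names, parameters and toy profiles are all hypothetical.

```python
import math
import random


def energy(profiles, labels, k):
    """Sum of squared distances from each profile to its cluster centroid."""
    dim = len(profiles[0])
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(profiles, labels) if lab == c]
        if not members:
            continue
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


def anneal(profiles, k, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Move one gene at a time to a random cluster; always accept improvements,
    accept worse states with probability exp(-delta / T), and cool T geometrically.
    Returns the best assignment seen and its energy."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in profiles]
    current = energy(profiles, labels, k)
    best_labels, best = labels[:], current
    t = t0
    for _ in range(steps):
        i = rng.randrange(len(profiles))
        old = labels[i]
        labels[i] = rng.randrange(k)
        candidate = energy(profiles, labels, k)
        delta = candidate - current
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = candidate
            if current < best:
                best, best_labels = current, labels[:]
        else:
            labels[i] = old   # reject the move
        t *= cooling
    return best_labels, best


# Hypothetical temporal profiles: three rising genes and three falling genes.
profiles = [
    [0.0, 1.0, 2.0], [0.1, 1.1, 2.1], [0.0, 0.9, 2.0],
    [2.0, 1.0, 0.0], [2.1, 1.0, 0.1], [1.9, 1.1, 0.0],
]
labels, best = anneal(profiles, k=2)
```

    A finite run such as this one carries no optimality guarantee; choosing the number of clusters, which the abstract handles with an iterative statistical scheme, is left out of the sketch.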

  2. Heterologous expression of pikromycin biosynthetic gene cluster using Streptomyces artificial chromosome system.

    PubMed

    Pyeon, Hye-Rim; Nah, Hee-Ju; Kang, Seung-Hoon; Choi, Si-Sun; Kim, Eung-Soo

    2017-05-31

    Heterologous expression of biosynthetic gene clusters of natural microbial products has become an essential strategy for titer improvement and pathway engineering of various potentially-valuable natural products. A Streptomyces artificial chromosomal conjugation vector, pSBAC, was previously successfully applied for precise cloning and tandem integration of a large polyketide tautomycetin (TMC) biosynthetic gene cluster (Nah et al. in Microb Cell Fact 14(1):1, 2015), implying that this strategy could be employed to develop a custom overexpression scheme of natural product pathway clusters present in actinomycetes. To validate the pSBAC system as a generally-applicable heterologous overexpression system for a large-sized polyketide biosynthetic gene cluster in Streptomyces, another model polyketide compound, the pikromycin biosynthetic gene cluster, was precisely cloned and heterologously expressed using the pSBAC system. A unique HindIII restriction site was precisely inserted at one of the border regions of the pikromycin biosynthetic gene cluster within the chromosome of Streptomyces venezuelae, followed by site-specific recombination of pSBAC into the flanking region of the pikromycin gene cluster. Unlike the previous cloning process, one HindIII site integration step was skipped through pSBAC modification. pPik001, a pSBAC containing the pikromycin biosynthetic gene cluster, was directly introduced into two heterologous hosts, Streptomyces lividans and Streptomyces coelicolor, resulting in the production of 10-deoxymethynolide, a major pikromycin derivative. When two entire pikromycin biosynthetic gene clusters were tandemly introduced into the S. lividans chromosome, overproduction of 10-deoxymethynolide and the presence of pikromycin, which was previously not detected, were both confirmed. Moreover, comparative qRT-PCR results confirmed that the transcription of pikromycin biosynthetic genes was significantly upregulated in S. lividans containing tandem clusters of pikromycin biosynthetic gene clusters. The 60 kb pikromycin biosynthetic gene cluster was isolated in a single integration pSBAC vector. Introduction of the pikromycin biosynthetic gene cluster into the pikromycin non-producing strains resulted in higher pikromycin production. The utility of the pSBAC system as a precise cloning tool for large-sized biosynthetic gene clusters was verified through heterologous expression of the pikromycin biosynthetic gene cluster. Moreover, this pSBAC-driven heterologous expression strategy was confirmed to be an ideal approach for production of low and inconsistent natural products such as pikromycin in S. venezuelae, implying that this strategy could be employed for development of a custom overexpression scheme of natural product biosynthetic gene clusters in actinomycetes.

  3. Assembly and features of secondary metabolite biosynthetic gene clusters in Streptomyces ansochromogenes.

    PubMed

    Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong

    2013-07-01

    A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in either or all of the three different media tested, whereas the other 15 gene clusters were silent in all three different media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.

  4. Genomic insight into pathogenicity of dematiaceous fungus Corynespora cassiicola

    PubMed Central

    Looi, Hong Keat; Toh, Yue Fen; Yew, Su Mei; Na, Shiang Ling; Tan, Yung-Chie; Chong, Pei-Sin; Khoo, Jia-Shiun; Yee, Wai-Yan; Ng, Kee Peng

    2017-01-01

    Corynespora cassiicola is a common plant pathogen that causes leaf spot disease in a broad range of crops, and it heavily affects rubber trees in Malaysia (Hsueh, 2011; Nghia et al., 2008). The isolation of UM 591 from a patient’s contact lens indicates the pathogenic potential of this dematiaceous fungus in humans. However, the underlying factors that contribute to the opportunistic cross-infection have not been fully studied. We employed genome sequencing and gene homology annotations in an attempt to identify these factors in UM 591 using data obtained from publicly available bioinformatics databases. The assembly size of the UM 591 genome is 41.8 Mbp, and a total of 13,531 (≥99 bp) genes have been predicted. UM 591 is enriched with genes that encode glycoside hydrolases, carbohydrate esterases, auxiliary activity enzymes and cell wall degrading enzymes. Virulence genes comprising CAZymes, peptidases, and hypervirulence-associated cutinases were found to be present in the fungal genome. Comparative analysis shows that UM 591 possesses a higher number of carbohydrate esterase family 10 (CE10) CAZymes compared to the other species of fungi in this study, and these enzymes hydrolyse a wide range of carbohydrate and non-carbohydrate substrates. Putative melanin, siderophore, ent-kaurene, and lycopene biosynthesis gene clusters are predicted, and these gene clusters denote that UM 591 is capable of protecting itself from UV and chemical stresses, allowing it to adapt to different environments. Putative sterigmatocystin, HC-toxin, cercosporin, and gliotoxin biosynthesis gene clusters are predicted. This finding highlights the necrotrophic and invasive nature of UM 591. PMID:28149676

  5. The Core and Accessory Genomes of Burkholderia pseudomallei: Implications for Human Melioidosis

    PubMed Central

    Lin, Chi Ho; Karuturi, R. Krishna M.; Wuthiekanun, Vanaporn; Tuanyok, Apichai; Chua, Hui Hoon; Ong, Catherine; Paramalingam, Sivalingam Suppiah; Tan, Gladys; Tang, Lynn; Lau, Gary; Ooi, Eng Eong; Woods, Donald; Feil, Edward; Peacock, Sharon J.; Tan, Patrick

    2008-01-01

    Natural isolates of Burkholderia pseudomallei (Bp), the causative agent of melioidosis, can exhibit significant ecological flexibility that is likely reflective of a dynamic genome. Using whole-genome Bp microarrays, we examined patterns of gene presence and absence across 94 South East Asian strains isolated from a variety of clinical, environmental, or animal sources. 86% of the Bp K96243 reference genome was common to all the strains, representing the Bp “core genome”, comprising genes largely involved in essential functions (e.g., amino acid metabolism, protein translation). In contrast, 14% of the K96243 genome was variably present across the isolates. This Bp accessory genome encompassed multiple genomic islands (GIs), paralogous genes, and insertions/deletions, including three distinct lipopolysaccharide (LPS)-related gene clusters. Strikingly, strains recovered from cases of human melioidosis clustered on a tree based on accessory gene content, and were significantly more likely to harbor certain GIs compared to animal and environmental isolates. Consistent with the inference that the GIs may contribute to pathogenesis, experimental mutation of BPSS2053, a GI gene, reduced microbial adherence to human epithelial cells. Our results suggest that the Bp accessory genome is likely to play an important role in microbial adaptation and virulence. PMID:18927621
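
    The core/accessory split described above amounts to a simple rule on a gene presence/absence table: a gene is "core" if it is present in every strain and "accessory" otherwise. The sketch below illustrates that rule only; the gene names and calls are hypothetical, and it does not model how presence is called from microarray signal.

```python
def split_core_accessory(presence):
    """presence maps a gene name to a list of booleans, one per strain.

    Genes present in every strain are 'core'; all others are 'accessory'.
    """
    core = sorted(g for g, calls in presence.items() if all(calls))
    accessory = sorted(g for g, calls in presence.items() if not all(calls))
    return core, accessory


# Hypothetical presence/absence calls for four genes across three strains.
presence = {
    "geneA": [True, True, True],
    "geneB": [True, False, True],
    "geneC": [True, True, True],
    "geneD": [False, False, True],
}
core, accessory = split_core_accessory(presence)
core_fraction = len(core) / len(presence)
```

    In the abstract the analogous core fraction is 86% of the K96243 reference genome; in this toy table it is one half.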

  6. Genome wide in silico characterization of Dof gene families of pigeonpea (Cajanus cajan (L) Millsp.).

    PubMed

    Malviya, N; Gupta, S; Singh, V K; Yadav, M K; Bisht, N C; Sarangi, B K; Yadav, D

    2015-02-01

    The DNA binding with One Finger (Dof) protein is a plant specific transcription factor involved in the regulation of a wide range of processes. The analysis of the whole genome sequence of pigeonpea has identified 38 putative Dof genes (CcDof) distributed on 8 chromosomes. A total of 17 out of 38 CcDof genes were found to be intronless. A comprehensive in silico characterization of the CcDof gene family including the gene structure, chromosome location, protein motif, phylogeny, gene duplication and functional divergence has been attempted. The phylogenetic analysis resulted in 3 major clusters, and closely related members in the phylogenetic tree revealed a common motif distribution. The in silico cis-regulatory element analysis revealed functional diversity with predominance of light responsive and stress responsive elements, indicating the possibility of these CcDof genes being associated with photoperiodic control and biotic and abiotic stress. The duplication pattern showed that tandem duplication is predominant over segmental duplication events. The comparative phylogenetic analysis of these Dof proteins along with 78 soybean, 36 Arabidopsis and 30 rice Dof proteins revealed 7 major clusters. Several groups of orthologs and paralogs were identified based on the phylogenetic tree constructed. Our study provides useful information for functional characterization of CcDof genes.

  7. Ancient origin of placental expression in the growth hormone genes of anthropoid primates

    PubMed Central

    Papper, Zack; Jameson, Natalie M.; Romero, Roberto; Weckle, Amy L.; Mittal, Pooja; Benirschke, Kurt; Santolaya-Forgas, Joaquin; Uddin, Monica; Haig, David; Goodman, Morris; Wildman, Derek E.

    2009-01-01

    In anthropoid primates, growth hormone (GH) genes have undergone at least 2 independent locus expansions, one in platyrrhines (New World monkeys) and another in catarrhines (Old World monkeys and apes). In catarrhines, the GH cluster has a pituitary-expressed gene called GH1; the remaining GH genes include placental GHs and placental lactogens. Here, we provide cDNA sequence evidence that the platyrrhine GH cluster also includes at least 3 placenta expressed genes and phylogenetic evidence that placenta expressed anthropoid GH genes have undergone strong adaptive evolution, whereas pituitary-expressed GH genes have faced strict functional constraint. Our phylogenetic evidence also points to lineage-specific gene gain and loss in early placental mammalian evolution, with at least three copies of the GH gene present at the time of the last common ancestor (LCA) of primates, rodents, and laurasiatherians. Anthropoid primates and laurasiatherians share gene descendants of one of these three copies, whereas rodents and strepsirrhine primates each maintain a separate copy. Eight of the amino-acid replacements that occurred on the lineage leading to the LCA of extant anthropoids have been implicated in GH signaling at the maternal-fetal interface. Thus, placental expression of GH may have preceded the separate series of GH gene duplications that occurred in catarrhines and platyrrhines (i.e., the roles played by placenta-expressed GHs in human pregnancy may have a longer evolutionary history than previously appreciated). PMID:19805162

  8. Ancient origin of placental expression in the growth hormone genes of anthropoid primates.

    PubMed

    Papper, Zack; Jameson, Natalie M; Romero, Roberto; Weckle, Amy L; Mittal, Pooja; Benirschke, Kurt; Santolaya-Forgas, Joaquin; Uddin, Monica; Haig, David; Goodman, Morris; Wildman, Derek E

    2009-10-06

    In anthropoid primates, growth hormone (GH) genes have undergone at least 2 independent locus expansions, one in platyrrhines (New World monkeys) and another in catarrhines (Old World monkeys and apes). In catarrhines, the GH cluster has a pituitary-expressed gene called GH1; the remaining GH genes include placental GHs and placental lactogens. Here, we provide cDNA sequence evidence that the platyrrhine GH cluster also includes at least 3 placenta expressed genes and phylogenetic evidence that placenta expressed anthropoid GH genes have undergone strong adaptive evolution, whereas pituitary-expressed GH genes have faced strict functional constraint. Our phylogenetic evidence also points to lineage-specific gene gain and loss in early placental mammalian evolution, with at least three copies of the GH gene present at the time of the last common ancestor (LCA) of primates, rodents, and laurasiatherians. Anthropoid primates and laurasiatherians share gene descendants of one of these three copies, whereas rodents and strepsirrhine primates each maintain a separate copy. Eight of the amino-acid replacements that occurred on the lineage leading to the LCA of extant anthropoids have been implicated in GH signaling at the maternal-fetal interface. Thus, placental expression of GH may have preceded the separate series of GH gene duplications that occurred in catarrhines and platyrrhines (i.e., the roles played by placenta-expressed GHs in human pregnancy may have a longer evolutionary history than previously appreciated).

  9. Lineage-specific evolution of cnidarian Wnt ligands.

    PubMed

    Hensel, Katrin; Lotan, Tamar; Sanders, Steve M; Cartwright, Paulyn; Frank, Uri

    2014-09-01

    We have studied the evolution of Wnt genes in cnidarians and the expression pattern of all Wnt ligands in the hydrozoan Hydractinia echinata. Current views favor a scenario in which 12 Wnt sub-families were jointly inherited by cnidarians and bilaterians from their last common ancestor. Our phylogenetic analyses clustered all medusozoan genes in distinct, well-supported clades, but many orthologous relationships between medusozoan Wnts and anthozoan and bilaterian Wnt genes were poorly supported. Only seven anthozoan genes, Wnt2, Wnt4, Wnt5, Wnt6, Wnt10, Wnt11, and Wnt16 were recovered with strong support with bilaterian genes and of those, only the Wnt2, Wnt5, Wnt11, and Wnt16 clades also included medusozoan genes. Although medusozoan Wnt8 genes clustered with anthozoan and bilaterian genes, this was not well supported. In situ hybridization studies revealed poor conservation of expression patterns of putative Wnt orthologs within Cnidaria. In polyps, only Wnt1, Wnt3, and Wnt7 were expressed at the same position in the studied cnidarian models Hydra, Hydractinia, and Nematostella. Different expression patterns are consistent with divergent functions. Our data do not fully support previous assertions regarding Wnt gene homology, and suggest a more complex history of Wnt family genes than previously suggested. This includes high rates of sequence divergence and lineage-specific duplications of Wnt genes within medusozoans, followed by functional divergence over evolutionary time scales. © 2014 Wiley Periodicals, Inc.

  10. Supervised group Lasso with applications to microarray data analysis

    PubMed Central

    Ma, Shuangge; Song, Xiao; Huang, Jian

    2007-01-01

    Background A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure. Results We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
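
    The difference between the two steps (gene-level selection with the Lasso, cluster-level selection with the group Lasso) can be seen in the standard shrinkage operators associated with each penalty. The sketch below shows only those textbook operators, under the assumption of an orthonormal-design setting; it is not the fitting procedure used in the paper, and the function names and numbers are hypothetical.

```python
import math


def soft_threshold(b, lam):
    """Lasso shrinkage: each coefficient is shrunk on its own,
    so individual genes can be set to zero."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def group_soft_threshold(beta, lam):
    """Group Lasso shrinkage: the whole coefficient vector of a cluster is
    scaled by max(0, 1 - lam / ||beta||), so a cluster is kept or dropped as a unit."""
    norm = math.sqrt(sum(b * b for b in beta))
    if norm <= lam:
        return [0.0] * len(beta)
    scale = 1.0 - lam / norm
    return [scale * b for b in beta]


# Hypothetical coefficients for a strong cluster and a weak cluster.
strong = group_soft_threshold([3.0, 4.0], lam=1.0)   # kept, shrunk by a factor of 0.8
weak = group_soft_threshold([0.3, 0.4], lam=1.0)     # dropped entirely
```

    With the same penalty level, the strong cluster survives with both genes retained, while the weak cluster is removed as a whole; the gene-wise operator would instead zero each small coefficient independently.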

  11. Modularity of Plant Metabolic Gene Clusters: A Trio of Linked Genes That Are Collectively Required for Acylation of Triterpenes in Oat

    PubMed Central

    Mugford, Sam T.; Louveau, Thomas; Melton, Rachel; Qi, Xiaoquan; Bakht, Saleha; Hill, Lionel; Tsurushima, Tetsu; Honkanen, Suvi; Rosser, Susan J.; Lomonossoff, George P.; Osbourn, Anne

    2013-01-01

    Operon-like gene clusters are an emerging phenomenon in the field of plant natural products. The genes encoding some of the best-characterized plant secondary metabolite biosynthetic pathways are scattered across plant genomes. However, an increasing number of gene clusters encoding the synthesis of diverse natural products have recently been reported in plant genomes. These clusters have arisen through the neo-functionalization and relocation of existing genes within the genome, and not by horizontal gene transfer from microbes. The reasons for clustering are not yet clear, although this form of gene organization is likely to facilitate co-inheritance and co-regulation. Oats (Avena spp) synthesize antimicrobial triterpenoids (avenacins) that provide protection against disease. The synthesis of these compounds is encoded by a gene cluster. Here we show that a module of three adjacent genes within the wider biosynthetic gene cluster is required for avenacin acylation. Through the characterization of these genes and their encoded proteins we present a model of the subcellular organization of triterpenoid biosynthesis. PMID:23532069

  12. Prokaryotic Gene Clusters: A Rich Toolbox for Synthetic Biology

    PubMed Central

    Fischbach, Michael; Voigt, Christopher A.

    2014-01-01

    Bacteria construct elaborate nanostructures, obtain nutrients and energy from diverse sources, synthesize complex molecules, and implement signal processing to react to their environment. These complex phenotypes require the coordinated action of multiple genes, which are often encoded in a contiguous region of the genome, referred to as a gene cluster. Gene clusters sometimes contain all of the genes necessary and sufficient for a particular function. As an evolutionary mechanism, gene clusters facilitate the horizontal transfer of the complete function between species. Here, we review recent work on a number of clusters whose functions are relevant to biotechnology. Engineering these clusters has been hindered by their regulatory complexity, the need to balance the expression of many genes, and a lack of tools to design and manipulate DNA at this scale. Advances in synthetic biology will enable the large-scale bottom-up engineering of the clusters to optimize their functions, wake up cryptic clusters, or to transfer them between organisms. Understanding and manipulating gene clusters will move towards an era of genome engineering, where multiple functions can be “mixed-and-matched” to create a designer organism. PMID:21154668

  13. GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.

    PubMed

    Schulz, Tizian; Stoye, Jens; Doerr, Daniel

    2018-05-08

    Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.

  14. MPIGeneNet: Parallel Calculation of Gene Co-Expression Networks on Multicore Clusters.

    PubMed

    Gonzalez-Dominguez, Jorge; Martin, Maria J

    2017-10-10

    In this work we present MPIGeneNet, a parallel tool that applies Pearson's correlation and Random Matrix Theory to construct gene co-expression networks. It is based on the state-of-the-art sequential tool RMTGeneNet, which provides networks with high robustness and sensitivity at the expense of relatively long runtimes for large scale input datasets. MPIGeneNet returns the same results as RMTGeneNet but improves the memory management, reduces the I/O cost, and accelerates the two most computationally demanding steps of co-expression network construction by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on two different systems using three typical input datasets shows that MPIGeneNet is significantly faster than RMTGeneNet. As an example, our tool is up to 175.41 times faster on a cluster with eight nodes, each one containing two 12-core Intel Haswell processors. Source code of MPIGeneNet, as well as a reference manual, are available at https://sourceforge.net/projects/mpigenenet/.
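
    The correlation step described in this record can be illustrated with a minimal, single-process sketch. It is not MPIGeneNet's or RMTGeneNet's code: the function names, the toy expression values, and the fixed cutoff of 0.9 are assumptions made for illustration, and the Random Matrix Theory procedure that the tools use to choose the cutoff is not reproduced here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)

def coexpression_edges(expr, threshold):
    """Return (gene, gene, r) for every gene pair with |r| >= threshold.

    `threshold` is a hypothetical fixed cutoff; the tools in the record
    derive it with Random Matrix Theory instead.
    """
    genes = sorted(expr)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expr[g], expr[h])
            if abs(r) >= threshold:
                edges.append((g, h, r))
    return edges

# Toy expression matrix (hypothetical values): gene -> samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 3.0],
}
edges = coexpression_edges(expr, 0.9)
for g, h, r in edges:
    print(g, h, round(r, 3))
```

    The all-pairs loop is quadratic in the number of genes, which is presumably why the correlation step is among those worth distributing across cluster nodes.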

  15. Finding approximate gene clusters with Gecko 3.

    PubMed

    Winter, Sascha; Jahn, Katharina; Wehner, Stefanie; Kuchenbecker, Leon; Marz, Manja; Stoye, Jens; Böcker, Sebastian

    2016-11-16

    Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Molecular insight into the association between cartilage regeneration and ear wound healing in genetic mouse models: targeting new genes in regeneration.

    PubMed

    Rai, Muhammad Farooq; Schmidt, Eric J; McAlinden, Audrey; Cheverud, James M; Sandell, Linda J

    2013-11-06

    Tissue regeneration is a complex trait with few genetic models available. Mouse strains LG/J and MRL are exceptional healers. Using recombinant inbred strains from a large (LG/J, healer) and small (SM/J, nonhealer) intercross, we have previously shown a positive genetic correlation between ear wound healing, knee cartilage regeneration, and protection from osteoarthritis. We hypothesize that a common set of genes operates in tissue healing and articular cartilage regeneration. Taking advantage of archived histological sections from recombinant inbred strains, we analyzed expression of candidate genes through branched-chain DNA technology directly from tissue lysates. We determined broad-sense heritability of candidates, Pearson correlation of candidates with healing phenotypes, and Ward minimum variance cluster analysis for strains. A bioinformatic assessment of allelic polymorphisms within and near candidate genes was also performed. The expression of several candidates was significantly heritable among strains. Although several genes correlated with both ear wound healing and cartilage healing at a marginal level, the expression of four genes representing DNA repair (Xrcc2, Pcna) and Wnt signaling (Axin2, Wnt16) pathways was significantly positively correlated with both phenotypes. Cluster analysis accurately classified healers and nonhealers for seven out of eight strains based on gene expression. Specific sequence differences between LG/J and SM/J were identified as potential causal polymorphisms. Our study suggests a common genetic basis between tissue healing and osteoarthritis susceptibility. Mapping genetic variations causing differences in diverse healing responses in multiple tissues may reveal generic healing processes in pursuit of new therapeutic targets designed to induce or enhance regeneration and, potentially, protection from osteoarthritis.

  17. Molecular Epidemiology of Vancomycin-Resistant Enterococcus faecium: a Prospective, Multicenter Study in South American Hospitals

    PubMed Central

    Panesso, Diana; Reyes, Jinnethe; Rincón, Sandra; Díaz, Lorena; Galloway-Peña, Jessica; Zurita, Jeannete; Carrillo, Carlos; Merentes, Altagracia; Guzmán, Manuel; Adachi, Javier A.; Murray, Barbara E.; Arias, Cesar A.

    2010-01-01

    Enterococcus faecium has emerged as an important nosocomial pathogen worldwide, and this trend has been associated with the dissemination of a genetic lineage designated clonal cluster 17 (CC17). Enterococcal isolates were collected prospectively (2006 to 2008) from 32 hospitals in Colombia, Ecuador, Perú, and Venezuela and subjected to antimicrobial susceptibility testing. Genotyping was performed with all vancomycin-resistant E. faecium (VREfm) isolates by pulsed-field gel electrophoresis (PFGE) and multilocus sequence typing. All VREfm isolates were evaluated for the presence of 16 putative virulence genes (14 fms genes, the esp gene of E. faecium [espEfm], and the hyl gene of E. faecium [hylEfm]) and plasmids carrying the fms20-fms21 (pilA), hylEfm, and vanA genes. Of 723 enterococcal isolates recovered, E. faecalis was the most common (78%). Vancomycin resistance was detected in 6% of the isolates (74% of which were E. faecium). Eleven distinct PFGE types were found among the VREfm isolates, with most belonging to sequence types 412 and 18. The ebpAEfm-ebpBEfm-ebpCEfm (pilB) and fms11-fms19-fms16 clusters were detected in all VREfm isolates from the region, whereas espEfm and hylEfm were detected in 69% and 23% of the isolates, respectively. The fms20-fms21 (pilA) cluster, which encodes a putative pilus-like protein, was found on plasmids from almost all VREfm isolates and was sometimes found to coexist with hylEfm and the vanA gene cluster. The population genetics of VREfm in South America appear to resemble those of such strains in the United States in the early years of the CC17 epidemic. The overwhelming presence of plasmids encoding putative virulence factors and vanA genes suggests that E. faecium from the CC17 genogroup may disseminate in the region in the coming years. PMID:20220167

  18. Discrimination of multilocus sequence typing-based Campylobacter jejuni subgroups by MALDI-TOF mass spectrometry.

    PubMed

    Zautner, Andreas Erich; Masanta, Wycliffe Omurwa; Tareen, Abdul Malik; Weig, Michael; Lugert, Raimond; Groß, Uwe; Bader, Oliver

    2013-11-07

    Campylobacter jejuni, the most common bacterial pathogen causing gastroenteritis, shows a wide genetic diversity. Previously, we demonstrated by the combination of multi locus sequence typing (MLST)-based UPGMA-clustering and analysis of 16 genetic markers that twelve different C. jejuni subgroups can be distinguished. Among these are two prominent subgroups. The first subgroup contains the majority of hyperinvasive strains and is characterized by a dimeric form of the chemotaxis-receptor Tlp7(m+c). The second has an extended amino acid metabolism and is characterized by the presence of a periplasmic asparaginase (ansB) and gamma-glutamyl-transpeptidase (ggt). Phyloproteomic principal component analysis (PCA) hierarchical clustering of MALDI-TOF based intact cell mass spectrometry (ICMS) spectra was able to group particular C. jejuni subgroups of phylogenetically related isolates in distinct clusters. In particular, the aforementioned Tlp7(m+c)(+) and ansB+/ ggt+ subgroups could be discriminated by PCA. Overlay of ICMS spectra of all isolates led to the identification of characteristic biomarker ions for these specific C. jejuni subgroups. Thus, mass peak shifts can be used to identify the C. jejuni subgroup with an extended amino acid metabolism. Although the PCA hierarchical clustering of ICMS-spectra groups the tested isolates in a different order than MLST-based UPGMA-clustering, the isolates of the indicator-groups form predominantly coherent clusters. These clusters reflect phenotypic aspects better than phylogenetic clustering, indicating that the genes corresponding to the biomarker ions are phylogenetically coupled to the tested marker genes. Thus, PCA clustering could be an additional tool for analyzing the relatedness of bacterial isolates.

  19. Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses.

    PubMed

    van Passel, Mark Wj

    2011-09-01

    Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since still only a small fraction of the total genetic diversity is thought to be uncovered. Alternatively, approaches based on similarities in the genome specific relative oligonucleotide frequencies do not require alignments. Even though the exact origins of horizontally transferred genes may still not be established using these compositional analyses, they do suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.
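
    An alignment-free comparison based on relative oligonucleotide frequencies can be sketched as follows. The record does not say which oligonucleotide length or distance measure was used, so this sketch assumes the simplest case: a dinucleotide signature (observed dinucleotide frequency divided by the product of the two base frequencies) and a mean absolute difference between signatures. The function names and toy sequences are hypothetical, and only one strand is counted for brevity.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def dinucleotide_signature(seq):
    """Relative dinucleotide abundances f(xy) / (f(x) * f(y)) for one strand."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    total_mono = sum(mono.values())
    total_di = sum(di.values())
    if total_di == 0:
        raise ValueError("sequence too short to compute a signature")
    signature = {}
    for x, y in product(BASES, repeat=2):
        fx = mono[x] / total_mono
        fy = mono[y] / total_mono
        fxy = di[x + y] / total_di
        # Dinucleotides containing an absent base get a neutral value of 0.
        signature[x + y] = fxy / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return signature

def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)

# Toy regions (hypothetical): two compositionally identical, one different.
island_1 = "ACGT" * 20
island_2 = "ACGT" * 20
island_3 = "AATT" * 20
print(signature_distance(island_1, island_2))
print(signature_distance(island_1, island_3))
```

    A small distance between two acquired regions is compatible with a shared compositional background, though, as the record notes, it does not by itself identify the donor.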

  20. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

    PubMed

    Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

    2015-10-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  1. Subgrouping of ESBL-producing Escherichia coli from animal and human sources: an approach to quantify the distribution of ESBL types between different reservoirs.

    PubMed

    Valentin, Lars; Sharp, Hannah; Hille, Katja; Seibt, Uwe; Fischer, Jennie; Pfeifer, Yvonne; Michael, Geovana Brenner; Nickel, Silke; Schmiedel, Judith; Falgenhauer, Linda; Friese, Anika; Bauerfeind, Rolf; Roesler, Uwe; Imirzalioglu, Can; Chakraborty, Trinad; Helmuth, Reiner; Valenza, Giuseppe; Werner, Guido; Schwarz, Stefan; Guerra, Beatriz; Appel, Bernd; Kreienbrock, Lothar; Käsbohrer, Annemarie

    2014-10-01

    Escherichia (E.) coli producing extended-spectrum beta-lactamases (ESBLs) are an increasing problem for public health. The success of ESBLs may be due to spread of ESBL-producing bacterial clones, transfer of ESBL gene-carrying plasmids or exchange of ESBL encoding genes on mobile elements. This makes it difficult to identify transmission routes and sources for ESBL-producing bacteria. The objectives of this study were to compare the distribution of genotypic and phenotypic properties of E. coli isolates from different animal and human sources collected in studies in the scope of the national research project RESET. ESBL-producing E. coli from two longitudinal and four cross-sectional studies in broiler, swine and cattle farms, a cross-sectional and a case-control study in humans and diagnostic isolates from humans and animals were used. In the RESET consortium, all laboratories followed harmonized methodologies for antimicrobial susceptibility testing, confirmation of the ESBL phenotype, specific PCR assays for the detection of bla(TEM), bla(CTX), and bla(SHV) genes and sequence analysis of the complete ESBL gene as well as a multiplex PCR for the detection of the four major phylogenetic groups of E. coli. Most ESBL genes were found in both human and non-human populations but quantitative differences for distinct ESBL-types were detectable. The enzymes CTX-M-1 (63.3% of all animal isolates, 29.3% of all human isolates), CTX-M-15 (17.7% vs. 48.0%) and CTX-M-14 (5.3% vs. 8.7%) were the most common ones. More than 70% of the animal isolates and more than 50% of the human isolates contained the broadly distributed ESBL genes bla(CTX-M-1), bla(CTX-M-15), or the combinations bla(SHV-12)+bla(TEM) or bla(CTX-M-1)+bla(TEM). While the majority of animal isolates carried bla(CTX-M-1) (37.5%) or the combination bla(CTX-M-1)+bla(TEM) (25.8%), this was the case for only 16.7% and 12.6%, respectively, of the human isolates. In contrast, 28.2% of the human isolates carried bla(CTX-M-15) compared to 10.8% of the animal isolates. When grouping data by ESBL types and phylogroups, bla(CTX-M-1) genes, mostly combined with phylogroup A or B1, were detected frequently in all settings. In contrast, bla(CTX-M-15) genes common in human and animal populations were mainly combined with phylogroup A, but not with the more virulent phylogroup B2 with the exception of companion animals, where a few isolates were detectable. When E. coli subtype definition included ESBL types, phylogenetic grouping and antimicrobial susceptibility data, the proportion of isolates allocated to common clusters was markedly reduced. Nevertheless, relevant proportions of same subtypes were detected in isolates from the human and livestock and companion animal populations included in this study, suggesting exchange of bacteria or bacterial genes between these populations or a common reservoir. In addition, these results clearly showed that there is some similarity between ESBL genes, and bacterial properties in isolates from the different populations. Finally, our current approach provides good insight into common and population-specific clusters, which can be used as a basis for the selection of ESBL-producing isolates from interesting clusters for further detailed characterizations, e.g. by whole genome sequencing. Copyright © 2014 The Authors. Published by Elsevier GmbH. All rights reserved.

  2. Differentiation of Trichophyton rubrum clinical isolates from Japanese and Chinese patients by randomly amplified polymorphic DNA and DNA sequence analysis of the non-transcribed spacer region of the rRNA gene.

    PubMed

    Yang, Xiumin; Sugita, Takashi; Takashima, Masako; Hiruma, Masataro; Li, Ruoyu; Sudo, Hajime; Ogawa, Hideoki; Ikeda, Shigaku

    2009-04-01

    Trichophyton rubrum is the most common pathogen causing dermatophytosis worldwide. Recent genetic investigations showed that the microorganism originated in Africa and then spread to Europe and North America via Asia. We investigated the intraspecific diversity of T. rubrum isolated from two closely located Asian countries, Japan and China. A total of 150 clinical isolates of T. rubrum obtained from Japanese and Chinese patients were analyzed by randomly amplified polymorphic DNA (RAPD) and DNA sequence analysis of the non-transcribed spacer (NTS) region in the rRNA gene. RAPD analysis divided the 150 strains into two major clusters, A and B. Of the Japanese isolates, 30% belonged to cluster A and 70% belonged to cluster B, whereas 91% of the Chinese isolates were in cluster A. The NTS region of the rRNA gene was divided into four major groups (I-IV) based on DNA sequencing. The majority of Japanese isolates were type IV (51%), and the majority of Chinese isolates were type III (75%). These results suggest that although Japan and China are neighboring countries, the origins of T. rubrum isolates from these countries may not be identical. These findings provide information useful for tracing the global transmission routes of T. rubrum.

  3. Weighted gene co-expression network analysis of gene modules for the prognosis of esophageal cancer.

    PubMed

    Zhang, Cong; Sun, Qian

    2017-06-01

    Esophageal cancer is a common malignant tumor, whose pathogenesis and prognostic factors are not fully understood. This study aimed to discover the gene clusters that have similar functions and can be used to predict the prognosis of esophageal cancer. The matched microarray and RNA sequencing data of 185 patients with esophageal cancer were downloaded from The Cancer Genome Atlas (TCGA), and gene co-expression networks were built without distinguishing between squamous carcinoma and adenocarcinoma. The result showed that 12 modules were associated with one or more survival data such as recurrence status, recurrence time, vital status or vital time. Furthermore, survival analysis showed that 5 out of the 12 modules were related to progression-free survival (PFS) or overall survival (OS). As the most important module, the midnight blue module with 82 genes was related to PFS, apart from the patient age, tumor grade, primary treatment success, and duration of smoking and tumor histological type. Gene ontology enrichment analysis revealed that "glycoprotein binding" was the top enriched function of midnight blue module genes. Additionally, the blue module was the only gene cluster related to OS. Platelet activating factor receptor (PTAFR) and feline Gardner-Rasheed (FGR) were the top hub genes in both modeling datasets and the STRING protein interaction database. In conclusion, our study provides novel insights into the prognosis-associated genes and screens out candidate biomarkers for esophageal cancer.

  4. Candidatus Frankia Datiscae Dg1, the Actinobacterial Microsymbiont of Datisca glomerata, Expresses the Canonical nod Genes nodABC in Symbiosis with Its Host Plant

    PubMed Central

    Persson, Tomas; Battenberg, Kai; Demina, Irina V.; Vigil-Stenman, Theoden; Vanden Heuvel, Brian; Pujic, Petar; Facciotti, Marc T.; Wilbanks, Elizabeth G.; O'Brien, Anna; Fournier, Pascale; Cruz Hernandez, Maria Antonia; Mendoza Herrera, Alberto; Médigue, Claudine; Normand, Philippe; Pawlowski, Katharina; Berry, Alison M.

    2015-01-01

    Frankia strains are nitrogen-fixing soil actinobacteria that can form root symbioses with actinorhizal plants. Phylogenetically, symbiotic frankiae can be divided into three clusters, and this division also corresponds to host specificity groups. The strains of cluster II which form symbioses with actinorhizal Rosales and Cucurbitales, thus displaying a broad host range, show surprisingly low genetic diversity and to date cannot be cultured. The genome of the first representative of this cluster, Candidatus Frankia datiscae Dg1 (Dg1), a microsymbiont of Datisca glomerata, was recently sequenced. A phylogenetic analysis of 50 different housekeeping genes of Dg1 and three published Frankia genomes showed that cluster II is basal among the symbiotic Frankia clusters. Detailed analysis showed that nodules of D. glomerata, independent of the origin of the inoculum, contain several closely related cluster II Frankia operational taxonomic units. Actinorhizal plants and legumes both belong to the nitrogen-fixing plant clade, and bacterial signaling in both groups involves the common symbiotic pathway also used by arbuscular mycorrhizal fungi. However, so far, no molecules resembling rhizobial Nod factors could be isolated from Frankia cultures. Alone among Frankia genomes available to date, the genome of Dg1 contains the canonical nod genes nodA, nodB and nodC known from rhizobia, and these genes are arranged in two operons which are expressed in D. glomerata nodules. Furthermore, Frankia Dg1 nodC was able to partially complement a Rhizobium leguminosarum A34 nodC::Tn5 mutant. Phylogenetic analysis showed that Dg1 Nod proteins are positioned at the root of both α- and β-rhizobial NodABC proteins. NodA-like acyl transferases were found across the phylum Actinobacteria, but among Proteobacteria only in nodulators. Taken together, our evidence indicates an Actinobacterial origin of rhizobial Nod factors. PMID:26020781

  5. Drivers of genetic diversity in secondary metabolic gene clusters within a fungal species

    PubMed Central

    Lind, Abigail L.; Wisecaver, Jennifer H.; Lameiras, Catarina; Wiemann, Philipp; Palmer, Jonathan M.; Keller, Nancy P.; Rodrigues, Fernando; Goldman, Gustavo H.

    2017-01-01

    Filamentous fungi produce a diverse array of secondary metabolites (SMs) critical for defense, virulence, and communication. The metabolic pathways that produce SMs are found in contiguous gene clusters in fungal genomes, an atypical arrangement for metabolic pathways in other eukaryotes. Comparative studies of filamentous fungal species have shown that SM gene clusters are often either highly divergent or uniquely present in one or a handful of species, hampering efforts to determine the genetic basis and evolutionary drivers of SM gene cluster divergence. Here, we examined SM variation in 66 cosmopolitan strains of a single species, the opportunistic human pathogen Aspergillus fumigatus. Investigation of genome-wide within-species variation revealed 5 general types of variation in SM gene clusters: nonfunctional gene polymorphisms; gene gain and loss polymorphisms; whole cluster gain and loss polymorphisms; allelic polymorphisms, in which different alleles corresponded to distinct, nonhomologous clusters; and location polymorphisms, in which a cluster was found to differ in its genomic location across strains. These polymorphisms affect the function of representative A. fumigatus SM gene clusters, such as those involved in the production of gliotoxin, fumigaclavine, and helvolic acid as well as the function of clusters with undefined products. In addition to enabling the identification of polymorphisms, the detection of which requires extensive genome-wide synteny conservation (e.g., mobile gene clusters and nonhomologous cluster alleles), our approach also implicated multiple underlying genetic drivers, including point mutations, recombination, and genomic deletion and insertion events as well as horizontal gene transfer from distant fungi. Finally, most of the variants that we uncover within A. fumigatus have been previously hypothesized to contribute to SM gene cluster diversity across entire fungal classes and phyla. We suggest that the drivers of genetic diversity operating within a fungal species shown here are sufficient to explain SM cluster macroevolutionary patterns. PMID:29149178

  6. Comparison of two schemes for automatic keyword extraction from MEDLINE for functional gene clustering.

    PubMed

    Liu, Ying; Ciliax, Brian J; Borges, Karin; Dasigi, Venu; Ram, Ashwin; Navathe, Shamkant B; Dingledine, Ray

    2004-01-01

    One of the key challenges of microarray studies is to derive biological insights from the unprecedented quantities of data on gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the nature of the functional links among genes within the derived clusters. However, the quality of the keyword lists extracted from biomedical literature for each gene significantly affects the clustering results. We extracted keywords from MEDLINE that describe the most prominent functions of the genes, and used the resulting weights of the keywords as feature vectors for gene clustering. By analyzing the resulting cluster quality, we compared two keyword weighting schemes: normalized z-score and term frequency-inverse document frequency (TFIDF). The best combination of background comparison set, stop list and stemming algorithm was selected based on precision and recall metrics. In a test set of four known gene groups, a hierarchical algorithm correctly assigned 25 of 26 genes to the appropriate clusters based on keywords extracted by the TFIDF weighting scheme, but only 23 of 26 with the z-score method. To evaluate the effectiveness of the weighting schemes for keyword extraction for gene clusters from microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle were used as a second test set. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords had higher purity, lower entropy, and higher mutual information than those produced from normalized z-score weighted keywords. The optimized algorithms should be useful for sorting genes from microarray lists into functionally discrete clusters.
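
    The TFIDF weighting named in this record can be illustrated with a short sketch. It uses one common textbook variant (term frequency normalized by document length, multiplied by the natural log of the inverse document frequency); the exact formula, background set, stop list, and stemming used in the study are not given in the abstract and are not reproduced. The function name and the toy keyword lists are hypothetical.

```python
from collections import Counter
from math import log

def tfidf_weights(keywords_by_gene):
    """Map each gene to {keyword: TF-IDF weight}.

    TF  = count of keyword in the gene's list / length of that list
    IDF = ln(number of genes / number of genes whose list has the keyword)
    """
    n_genes = len(keywords_by_gene)
    doc_freq = Counter()
    for words in keywords_by_gene.values():
        doc_freq.update(set(words))  # count each keyword once per gene
    weights = {}
    for gene, words in keywords_by_gene.items():
        term_counts = Counter(words)
        total = len(words)
        weights[gene] = {
            w: (c / total) * log(n_genes / doc_freq[w])
            for w, c in term_counts.items()
        }
    return weights

# Toy keyword lists (hypothetical), standing in for terms pulled from abstracts.
keywords_by_gene = {
    "gene1": ["kinase", "signal", "kinase", "binding"],
    "gene2": ["kinase", "cycle", "binding"],
    "gene3": ["transport", "membrane", "binding"],
}
weights = tfidf_weights(keywords_by_gene)
for gene, vec in weights.items():
    print(gene, {w: round(v, 3) for w, v in sorted(vec.items())})
```

    A keyword that appears for every gene ("binding" in the toy data) receives weight zero, so it cannot pull unrelated genes into the same cluster when the weight vectors are used as clustering features.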

  7. Characterization of polyploid wheat genomic diversity using a high-density 90 000 single nucleotide polymorphism array

    PubMed Central

    Wang, Shichen; Wong, Debbie; Forrest, Kerrie; Allen, Alexandra; Chao, Shiaoman; Huang, Bevan E; Maccaferri, Marco; Salvi, Silvio; Milner, Sara G; Cattivelli, Luigi; Mastrangelo, Anna M; Whan, Alex; Stephen, Stuart; Barker, Gary; Wieseke, Ralf; Plieske, Joerg; International Wheat Genome Sequencing Consortium; Lillemo, Morten; Mather, Diane; Appels, Rudi; Dolferus, Rudy; Brown-Guedira, Gina; Korol, Abraham; Akhunova, Alina R; Feuillet, Catherine; Salse, Jerome; Morgante, Michele; Pozniak, Curtis; Luo, Ming-Cheng; Dvorak, Jan; Morell, Matthew; Dubcovsky, Jorge; Ganal, Martin; Tuberosa, Roberto; Lawley, Cindy; Mikoulitch, Ivan; Cavanagh, Colin; Edwards, Keith J; Hayden, Matthew; Akhunov, Eduard

    2014-01-01

    High-density single nucleotide polymorphism (SNP) genotyping arrays are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships between individuals in populations and studying marker–trait associations in mapping experiments. We developed a genotyping array including about 90 000 gene-associated SNPs and used it to characterize genetic variation in allohexaploid and allotetraploid wheat populations. The array includes a significant fraction of common genome-wide distributed SNPs that are represented in populations of diverse geographical origin. We used density-based spatial clustering algorithms to enable high-throughput genotype calling in complex data sets obtained for polyploid wheat. We show that these model-free clustering algorithms provide accurate genotype calling in the presence of multiple clusters including clusters with low signal intensity resulting from significant sequence divergence at the target SNP site or gene deletions. Assays that detect low-intensity clusters can provide insight into the distribution of presence–absence variation (PAV) in wheat populations. A total of 46 977 SNPs from the wheat 90K array were genetically mapped using a combination of eight mapping populations. The developed array and cluster identification algorithms provide an opportunity to infer detailed haplotype structure in polyploid wheat and will serve as an invaluable resource for diversity studies and investigating the genetic basis of trait variation in wheat. PMID:24646323
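    To make "density-based spatial clustering" concrete, here is a minimal sketch of DBSCAN, one well-known algorithm of that family. Whether it matches the algorithms used for the wheat 90K array is an assumption, and the `eps` and `min_pts` parameters and the 2-D intensity points are purely illustrative.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster index (0, 1, ...) or -1 for noise.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps; clusters grow outward from core points.
    """
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query using squared Euclidean distance.
        return [
            j
            for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2
        ]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels
```

    Because the number of clusters is not fixed in advance, an approach of this kind can report additional groups, such as a low-intensity cluster, alongside the expected genotype clusters.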

  8. Transcriptional Coupling of Neighboring Genes and Gene Expression Noise: Evidence that Gene Orientation and Noncoding Transcripts Are Modulators of Noise

    PubMed Central

    Wang, Guang-Zhong; Lercher, Martin J.; Hurst, Laurence D.

    2011-01-01

    How is noise in gene expression modulated? Do mechanisms of noise control impact genome organization? In yeast, the expression of one gene can affect that of a very close neighbor. As the effect is highly regionalized, we hypothesize that genes in different orientations will have differing degrees of coupled expression and, in turn, different noise levels. Divergently organized gene pairs, in particular those with bidirectional promoters, have close promoters, maximizing the likelihood that expression of one gene affects the neighbor. With more distant promoters, the same is less likely to hold for gene pairs in nondivergent orientation. Stochastic models suggest that coupled chromatin dynamics will typically result in low abundance-corrected noise (ACN). Transcription of noncoding RNA (ncRNA) from a bidirectional promoter, we thus hypothesize to be a noise-reduction, expression-priming, mechanism. The hypothesis correctly predicts that protein-coding genes with a bidirectional promoter, including those with a ncRNA partner, have lower ACN than other genes and divergent gene pairs uniquely have correlated ACN. Moreover, as predicted, ACN increases with the distance between promoters. The model also correctly predicts ncRNA transcripts to be often divergently transcribed from genes that a priori would be under selection for low noise (essential genes, protein complex genes) and that the latter genes should commonly reside in divergent orientation. Likewise, that genes with bidirectional promoters are rare subtelomerically, cluster together, and are enriched in essential gene clusters is expected and observed. We conclude that gene orientation and transcription of ncRNAs are candidate modulators of noise. PMID:21402863

  9. A novel polyketide biosynthesis gene cluster is involved in fruiting body morphogenesis in the filamentous fungi Sordaria macrospora and Neurospora crassa.

    PubMed

    Nowrousian, Minou

    2009-04-01

    During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.

  10. Up-regulation of HOXB cluster genes are epigenetically regulated in tamoxifen-resistant MCF7 breast cancer cells.

    PubMed

    Yang, Seoyeon; Lee, Ji-Yeon; Hur, Ho; Oh, Ji Hoon; Kim, Myoung Hee

    2018-05-28

    Tamoxifen (TAM) is commonly used to treat estrogen receptor (ER)-positive breast cancer. Despite the remarkable benefits, resistance to TAM presents a serious therapeutic challenge. Since several HOX transcription factors have been proposed as strong candidates in the development of resistance to TAM therapy in breast cancer, we generated an in vitro model of acquired TAM resistance using ER-positive MCF7 breast cancer cells (MCF7-TAMR), and analyzed the expression pattern and epigenetic states of HOX genes. HOXB cluster genes were uniquely up-regulated in MCF7-TAMR cells. Survival analysis of in silico data showed the correlation of high expression of HOXB genes with poor response to TAM in ER-positive breast cancer patients treated with TAM. Gain- and loss-of-function experiments showed that the overexpression of multiple HOXB genes in MCF7 renders cancer cells more resistant to TAM, whereas the knockdown restores TAM sensitivity. Furthermore, activation of HOXB genes in MCF7-TAMR was associated with histone modifications, particularly the gain of H3K9ac. These findings imply that the activation of HOXB genes mediates the development of TAM resistance, and represents a target for development of new strategies to prevent or reverse TAM resistance.

  11. Barriers to Gene Flow in the Marine Environment: Insights from Two Common Intertidal Limpet Species of the Atlantic and Mediterranean

    PubMed Central

    Sá-Pinto, Alexandra; Branco, Madalena S.; Alexandrino, Paulo B.; Fontaine, Michaël C.; Baird, Stuart J. E.

    2012-01-01

    Knowledge of the scale of dispersal and the mechanisms governing gene flow in marine environments remains fragmentary despite being essential for understanding evolution of marine biota and to design management plans. We use the limpets Patella ulyssiponensis and Patella rustica as models for identifying factors affecting gene flow in marine organisms across the North-East Atlantic and the Mediterranean Sea. A set of allozyme loci and a fragment of the mitochondrial gene cytochrome C oxidase subunit I were screened for genetic variation through starch gel electrophoresis and DNA sequencing, respectively. An approach combining clustering algorithms with clinal analyses was used to test for the existence of barriers to gene flow and estimate their geographic location and abruptness. Sharp breaks in the genetic composition of individuals were observed in the transitions between the Atlantic and the Mediterranean and across southern Italian shores. An additional break within the Atlantic cluster separates samples from the Alboran Sea and Atlantic African shores from those of the Iberian Atlantic shores. The geographic congruence of the genetic breaks detected in these two limpet species strongly supports the existence of transpecific barriers to gene flow in the Mediterranean Sea and Northeastern Atlantic. This leads to testable hypotheses regarding factors restricting gene flow across the study area. PMID:23239977

  12. A Meta-Analysis: Identification of Common Mir-145 Target Genes that have Similar Behavior in Different GEO Datasets.

    PubMed

    Pashaei, Elnaz; Guzel, Esra; Ozgurses, Mete Emir; Demirel, Goksun; Aydin, Nizamettin; Ozen, Mustafa

    MicroRNAs, which are small regulatory RNAs, post-transcriptionally regulate gene expression by binding 3'-UTR of their mRNA targets. Their deregulation has been shown to cause increased proliferation, migration, invasion, and apoptosis. miR-145, an important tumor suppressor microRNA, has been shown to be downregulated in many cancer types and has crucial roles in tumor initiation, progression, metastasis, invasion, recurrence, and chemo-radioresistance. Our aim is to investigate potential common target genes of miR-145, and to help in understanding the underlying molecular pathways of tumor pathogenesis in association with those common target genes. Eight published microarray datasets, where targets of mir-145 were investigated in cell lines upon mir-145 over expression, were included in this study for meta-analysis. Inter group variabilities were assessed by box-plot analysis. Microarray datasets were analyzed using GEOquery package in Bioconductor 3.2 with R version 3.2.2 and two-way Hierarchical Clustering was used for gene expression data analysis. Meta-analysis of different GEO datasets showed that UNG, FUCA2, DERA, GMFB, TF, and SNX2 were commonly downregulated genes, whereas MYL9 and TAGLN were found to be commonly upregulated upon mir-145 over expression in prostate, breast, esophageal, bladder cancer, and head and neck squamous cell carcinoma. Biological process, molecular function, and pathway analysis of these potential targets of mir-145 through functional enrichments in PPI network demonstrated that those genes are significantly involved in telomere maintenance, DNA binding and repair mechanisms. In conclusion, our results indicated that mir-145, through targeting its common potential targets, may significantly contribute to tumor pathogenesis in distinct cancer types and might serve as an important target for cancer therapy.

  13. Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression

    PubMed Central

    Poole, William; Leinonen, Kalle; Shmulevich, Ilya

    2017-01-01

    Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390

  14. Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression.

    PubMed

    Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady

    2017-02-01

    Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C.

  15. Analysis of multiplex gene expression maps obtained by voxelation.

    PubMed

    An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios

    2009-04-29

    Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
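    The query step described in this abstract, ranking genes by the Euclidean distance between feature vectors, can be sketched as follows. The feature vectors are taken as given (the wavelet extraction is not reproduced), and the function names and the cutoff `k` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_similar(query, features, k=3):
    """Return the k genes closest to `query` as (gene, distance) pairs.

    features maps gene name -> feature vector (assumed to be precomputed,
    e.g. wavelet coefficients of the averaged expression map).
    """
    q = features[query]
    ranked = sorted(
        (euclidean(q, vec), gene) for gene, vec in features.items() if gene != query
    )
    return [(gene, dist) for dist, gene in ranked[:k]]
```

    The genes returned for a query could then be compared by their gene ontology annotations, which is the kind of check the abstract describes.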

  16. Evolution of Streptococcus pneumoniae and Its Close Commensal Relatives

    PubMed Central

    Kilian, Mogens; Poulsen, Knud; Blomqvist, Trinelise; Håvarstein, Leiv S.; Bek-Thomsen, Malene; Tettelin, Hervé; Sørensen, Uffe B. S.

    2008-01-01

    Streptococcus pneumoniae is a member of the Mitis group of streptococci which, according to 16S rRNA-sequence based phylogenetic reconstruction, includes 12 species. While other species of this group are considered prototypes of commensal bacteria, S. pneumoniae is among the most frequent microbial killers worldwide. Population genetic analysis of 118 strains, supported by demonstration of a distinct cell wall carbohydrate structure and competence pheromone sequence signature, shows that S. pneumoniae is one of several hundred evolutionary lineages forming a cluster separate from Streptococcus oralis and Streptococcus infantis. The remaining lineages of this distinct cluster are commensals previously collectively referred to as Streptococcus mitis and each represent separate species by traditional taxonomic standard. Virulence genes including the operon for capsule polysaccharide synthesis and genes encoding IgA1 protease, pneumolysin, and autolysin were randomly distributed among S. mitis lineages. Estimates of the evolutionary age of the lineages, the identical location of remnants of virulence genes in the genomes of commensal strains, the pattern of genome reductions, and the proportion of unique genes and their origin support the model that the entire cluster of S. pneumoniae, S. pseudopneumoniae, and S. mitis lineages evolved from pneumococcus-like bacteria presumably pathogenic to the common immediate ancestor of hominoids. During their adaptation to a commensal life style, most of the lineages gradually lost the majority of genes determining virulence and became genetically distinct due to sexual isolation in their respective hosts. PMID:18628950

  17. The ergot alkaloid gene cluster in Claviceps purpurea: extension of the cluster sequence and intra species evolution.

    PubMed

    Haarmann, Thomas; Machado, Caroline; Lübbe, Yvonne; Correia, Telmo; Schardl, Christopher L; Panaccione, Daniel G; Tudzynski, Paul

    2005-06-01

    The genomic region of Claviceps purpurea strain P1 containing the ergot alkaloid gene cluster [Tudzynski, P., Hölter, K., Correia, T., Arntz, C., Grammel, N., Keller, U., 1999. Evidence for an ergot alkaloid gene cluster in Claviceps purpurea. Mol. Gen. Genet. 261, 133-141] was explored by chromosome walking, and additional genes probably involved in the ergot alkaloid biosynthesis have been identified. The putative cluster sequence (extending over 68.5kb) contains 4 different nonribosomal peptide synthetase (NRPS) genes and several putative oxidases. Northern analysis showed that most of the genes were co-regulated (repressed by high phosphate), and identified probable flanking genes by lack of co-regulation. Comparison of the cluster sequences of strain P1, an ergotamine producer, with that of strain ECC93, an ergocristine producer, showed high conservation of most of the cluster genes, but significant variation in the NRPS modules, strongly suggesting that evolution of these chemical races of C. purpurea is determined by evolution of NRPS module specificity.

  18. Platypus globin genes and flanking loci suggest a new insertional model for beta-globin evolution in birds and mammals.

    PubMed

    Patel, Vidushi S; Cooper, Steven J B; Deakin, Janine E; Fulton, Bob; Graves, Tina; Warren, Wesley C; Wilson, Richard K; Graves, Jennifer A M

    2008-07-25

    Vertebrate alpha (alpha)- and beta (beta)-globin gene families exemplify the way in which genomes evolve to produce functional complexity. From tandem duplication of a single globin locus, the alpha- and beta-globin clusters expanded, and then were separated onto different chromosomes. The previous finding of a fossil beta-globin gene (omega) in the marsupial alpha-cluster, however, suggested that duplication of the alpha-beta cluster onto two chromosomes, followed by lineage-specific gene loss and duplication, produced paralogous alpha- and beta-globin clusters in birds and mammals. Here we analyse genomic data from an egg-laying monotreme mammal, the platypus (Ornithorhynchus anatinus), to explore haemoglobin evolution at the stem of the mammalian radiation. The platypus alpha-globin cluster (chromosome 21) contains embryonic and adult alpha-globin genes, a beta-like omega-globin gene, and the GBY globin gene with homology to cytoglobin, arranged as 5'-zeta-zeta'-alphaD-alpha3-alpha2-alpha1-omega-GBY-3'. The platypus beta-globin cluster (chromosome 2) contains single embryonic and adult globin genes arranged as 5'-epsilon-beta-3'. Surprisingly, all of these globin genes were expressed in some adult tissues. Comparison of flanking sequences revealed that all jawed vertebrate alpha-globin clusters are flanked by MPG-C16orf35 and LUC7L, whereas all bird and mammal beta-globin clusters are embedded in olfactory genes. Thus, the mammalian alpha- and beta-globin clusters are orthologous to the bird alpha- and beta-globin clusters respectively. We propose that alpha- and beta-globin clusters evolved from an ancient MPG-C16orf35-alpha-beta-GBY-LUC7L arrangement 410 million years ago. A copy of the original beta (represented by omega in marsupials and monotremes) was inserted into an array of olfactory genes before the amniote radiation (>315 million years ago), then duplicated and diverged to form orthologous clusters of beta-globin genes with different expression profiles in different lineages.

  19. Deregulation upon DNA damage revealed by joint analysis of context-specific perturbation data

    PubMed Central

    2011-01-01

    Background Deregulation between two different cell populations manifests itself in changing gene expression patterns and changing regulatory interactions. Accumulating knowledge about biological networks creates an opportunity to study these changes in their cellular context. Results We analyze re-wiring of regulatory networks based on cell population-specific perturbation data and knowledge about signaling pathways and their target genes. We quantify deregulation by merging regulatory signal from the two cell populations into one score. This joint approach, called JODA, proves advantageous over separate analysis of the cell populations and analysis without incorporation of knowledge. JODA is implemented and freely available in a Bioconductor package 'joda'. Conclusions Using JODA, we show wide-spread re-wiring of gene regulatory networks upon neocarzinostatin-induced DNA damage in Human cells. We recover 645 deregulated genes in thirteen functional clusters performing the rich program of response to damage. We find that the clusters contain many previously characterized neocarzinostatin target genes. We investigate connectivity between those genes, explaining their cooperation in performing the common functions. We review genes with the most extreme deregulation scores, reporting their involvement in response to DNA damage. Finally, we investigate the indirect impact of the ATM pathway on the deregulated genes, and build a hypothetical hierarchy of direct regulation. These results prove that JODA is a step forward to a systems level, mechanistic understanding of changes in gene regulation between different cell populations. PMID:21693013

  20. Deregulation upon DNA damage revealed by joint analysis of context-specific perturbation data.

    PubMed

    Szczurek, Ewa; Markowetz, Florian; Gat-Viks, Irit; Biecek, Przemysław; Tiuryn, Jerzy; Vingron, Martin

    2011-06-21

    Deregulation between two different cell populations manifests itself in changing gene expression patterns and changing regulatory interactions. Accumulating knowledge about biological networks creates an opportunity to study these changes in their cellular context. We analyze re-wiring of regulatory networks based on cell population-specific perturbation data and knowledge about signaling pathways and their target genes. We quantify deregulation by merging regulatory signal from the two cell populations into one score. This joint approach, called JODA, proves advantageous over separate analysis of the cell populations and analysis without incorporation of knowledge. JODA is implemented and freely available in a Bioconductor package 'joda'. Using JODA, we show wide-spread re-wiring of gene regulatory networks upon neocarzinostatin-induced DNA damage in Human cells. We recover 645 deregulated genes in thirteen functional clusters performing the rich program of response to damage. We find that the clusters contain many previously characterized neocarzinostatin target genes. We investigate connectivity between those genes, explaining their cooperation in performing the common functions. We review genes with the most extreme deregulation scores, reporting their involvement in response to DNA damage. Finally, we investigate the indirect impact of the ATM pathway on the deregulated genes, and build a hypothetical hierarchy of direct regulation. These results prove that JODA is a step forward to a systems level, mechanistic understanding of changes in gene regulation between different cell populations.

  1. De novo intrachromosomal gene conversion from OPN1MW to OPN1LW in the male germline results in Blue Cone Monochromacy

    PubMed Central

    Buena-Atienza, Elena; Rüther, Klaus; Baumann, Britta; Bergholz, Richard; Birch, David; De Baere, Elfride; Dollfus, Helene; Greally, Marie T.; Gustavsson, Peter; Hamel, Christian P.; Heckenlively, John R.; Leroy, Bart P.; Plomp, Astrid S.; Pott, Jan Willem R.; Rose, Katherine; Rosenberg, Thomas; Stark, Zornitza; Verheij, Joke B. G. M.; Weleber, Richard; Zobor, Ditta; Weisschuh, Nicole; Kohl, Susanne; Wissinger, Bernd

    2016-01-01

    X-linked cone dysfunction disorders such as Blue Cone Monochromacy and X-linked Cone Dystrophy are characterized by complete loss (of) or reduced L- and M- cone function due to defects in the OPN1LW/OPN1MW gene cluster. Here we investigated 24 affected males from 16 families with either a structurally intact gene cluster or at least one intact single (hybrid) gene but harbouring rare combinations of common SNPs in exon 3 in single or multiple OPN1LW and OPN1MW gene copies. We assessed twelve different OPN1LW/MW exon 3 haplotypes by semi-quantitative minigene splicing assay. Nine haplotypes resulted in aberrant splicing of ≥20% of transcripts including the known pathogenic haplotypes (i.e. ‘LIAVA’, ‘LVAVA’) with absent or minute amounts of correctly spliced transcripts, respectively. De novo formation of the ‘LIAVA’ haplotype derived from an ancestral less deleterious ‘LIAVS’ haplotype was observed in one family with strikingly different phenotypes among affected family members. We could establish intrachromosomal gene conversion in the male germline as underlying mechanism. Gene conversion in the OPN1LW/OPN1MW genes has been postulated, however, we are first to demonstrate a de novo gene conversion within the lineage of a pedigree. PMID:27339364

  2. Are There Genetic Paths Common to Obesity, Cardiovascular Disease Outcomes, and Cardiovascular Risk Factors?

    PubMed Central

    Rankinen, Tuomo; Sarzynski, Mark A.; Ghosh, Sujoy; Bouchard, Claude

    2015-01-01

    Clustering of obesity, coronary artery disease, and cardiovascular disease risk factors is observed in epidemiological studies and clinical settings. Twin and family studies have provided some supporting evidence for the clustering hypothesis. Loci nearest a lead single nucleotide polymorphism (SNP) showing genome-wide significant associations with coronary artery disease, body mass index, C-reactive protein, blood pressure, lipids, and type 2 diabetes mellitus were selected for pathway and network analyses. Eighty-seven autosomal regions (181 SNPs), mapping to 56 genes, were found to be pleiotropic. Most pleiotropic regions contained genes associated with coronary artery disease and plasma lipids, whereas some exhibited coaggregation between obesity and cardiovascular disease risk factors. We observed enrichment for liver X receptor (LXR)/retinoid X receptor (RXR) and farnesoid X receptor/RXR nuclear receptor signaling among pleiotropic genes and for signatures of coronary artery disease and hepatic steatosis. In the search for functionally interacting networks, we found that 43 pleiotropic genes were interacting in a network with an additional 24 linker genes. ENCODE (Encyclopedia of DNA Elements) data were queried for distribution of pleiotropic SNPs among regulatory elements and coding sequence variations. Of the 181 SNPs, 136 were annotated to ≥1 regulatory feature. An enrichment analysis found over-representation of enhancers and DNAse hypersensitive regions when compared against all SNPs of the 1000 Genomes pilot project. In summary, there are genomic regions exerting pleiotropic effects on cardiovascular disease risk factors, although only a few included obesity. Further studies are needed to resolve the clustering in terms of DNA variants, genes, pathways, and actionable targets. PMID:25722444

  3. CORM: An R Package Implementing the Clustering of Regression Models Method for Gene Clustering

    PubMed Central

    Shi, Jiejun; Qin, Li-Xuan

    2014-01-01

    We report a new R package implementing the clustering of regression models (CORM) method for clustering genes using gene expression data and provide data examples illustrating each clustering function in the package. The CORM package is freely available at CRAN from http://cran.r-project.org. PMID:25452684

  4. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    PubMed

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
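    As an illustration of the user-selectable parameters mentioned in this abstract (number of clusters, initial values), here is a minimal sketch of K-means in its standard Lloyd's-algorithm form. The initial centroids are passed in explicitly so that their influence is visible; the squared Euclidean dissimilarity and the toy data are assumptions and are not taken from the review.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with caller-supplied initial centroids.

    points: list of equal-length numeric tuples (e.g. expression profiles).
    centroids: initial cluster centers; their count fixes the number of clusters.
    Returns (assignment, centroids), where assignment[i] is the cluster of points[i].
    """
    centroids = [tuple(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(
                range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids
```

    Running the function from several different starting centroids and comparing the partitions is a simple way to see the sensitivity to initial values that the review warns about.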

  5. Identification and Functional Analysis of the Nocardithiocin Gene Cluster in Nocardia pseudobrasiliensis

    PubMed Central

    Sakai, Kanae; Komaki, Hisayuki; Gonoi, Tohru

    2015-01-01

    Nocardithiocin is a thiopeptide compound isolated from the opportunistic pathogen Nocardia pseudobrasiliensis. It shows a strong activity against acid-fast bacteria and is also active against rifampicin-resistant Mycobacterium tuberculosis. Here, we report the identification of the nocardithiocin gene cluster in N. pseudobrasiliensis IFM 0761 based on conserved thiopeptide biosynthesis gene sequence and the whole genome sequence. The predicted gene cluster was confirmed by gene disruption and complementation. As expected, strains containing the disrupted gene did not produce nocardithiocin while gene complementation restored nocardithiocin production in these strains. The predicted cluster was further analyzed using RNA-seq which showed that the nocardithiocin gene cluster contains 12 genes within a 15.2-kb region. This finding will promote the improvement of nocardithiocin productivity and its derivatives production. PMID:26588225

  6. The diversity of Klebsiella pneumoniae surface polysaccharides.

    PubMed

    Follador, Rainer; Heinz, Eva; Wyres, Kelly L; Ellington, Matthew J; Kowarik, Michael; Holt, Kathryn E; Thomson, Nicholas R

    2016-08-01

    Klebsiella pneumoniae is considered an urgent health concern due to the emergence of multi-drug-resistant strains for which vaccination offers a potential remedy. Vaccines based on surface polysaccharides are highly promising but need to address the high diversity of surface-exposed polysaccharides, synthesized as O-antigens (lipopolysaccharide, LPS) and K-antigens (capsule polysaccharide, CPS), present in K. pneumoniae. We present a comprehensive and clinically relevant study of the diversity of O- and K-antigen biosynthesis gene clusters across a global collection of over 500 K. pneumoniae whole-genome sequences and the seroepidemiology of human isolates from different infection types. Our study defines the genetic diversity of O- and K-antigen biosynthesis cluster sequences across this collection, identifying sequences for known serotypes as well as identifying novel LPS and CPS gene clusters found in circulating contemporary isolates. Serotypes O1, O2 and O3 were most prevalent in our sample set, accounting for approximately 80 % of all infections. In contrast, K serotypes showed an order of magnitude higher diversity and differ among infection types. In addition we investigated a potential association of O or K serotypes with phylogenetic lineage, infection type and the presence of known virulence genes. K1 and K2 serotypes, which are associated with hypervirulent K. pneumoniae, were associated with a higher abundance of virulence genes and more diverse O serotypes compared to other common K serotypes.

  7. The diversity of Klebsiella pneumoniae surface polysaccharides

    PubMed Central

    Heinz, Eva; Wyres, Kelly L.; Ellington, Matthew J.; Kowarik, Michael; Holt, Kathryn E.; Thomson, Nicholas R.

    2016-01-01

    Klebsiella pneumoniae is considered an urgent health concern due to the emergence of multi-drug-resistant strains for which vaccination offers a potential remedy. Vaccines based on surface polysaccharides are highly promising but need to address the high diversity of surface-exposed polysaccharides, synthesized as O-antigens (lipopolysaccharide, LPS) and K-antigens (capsule polysaccharide, CPS), present in K. pneumoniae. We present a comprehensive and clinically relevant study of the diversity of O- and K-antigen biosynthesis gene clusters across a global collection of over 500 K. pneumoniae whole-genome sequences and the seroepidemiology of human isolates from different infection types. Our study defines the genetic diversity of O- and K-antigen biosynthesis cluster sequences across this collection, identifying sequences for known serotypes as well as identifying novel LPS and CPS gene clusters found in circulating contemporary isolates. Serotypes O1, O2 and O3 were most prevalent in our sample set, accounting for approximately 80 % of all infections. In contrast, K serotypes showed an order of magnitude higher diversity and differ among infection types. In addition we investigated a potential association of O or K serotypes with phylogenetic lineage, infection type and the presence of known virulence genes. K1 and K2 serotypes, which are associated with hypervirulent K. pneumoniae, were associated with a higher abundance of virulence genes and more diverse O serotypes compared to other common K serotypes. PMID:28348868

  8. Improved efficiency in amplification of Escherichia coli o-antigen gene clusters using genome-wide sequence comparison

    USDA-ARS's Scientific Manuscript database

    Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...

  9. Application of DNA probes for rRNA and vanA genes to investigation of a nosocomial cluster of vancomycin-resistant enterococci.

    PubMed Central

    Woodford, N; Morrison, D; Johnson, A P; Briant, V; George, R C; Cookson, B

    1993-01-01

    DNA probes specific for genes encoding rRNA and the glycopeptide resistance gene vanA were used to investigate a cluster of vancomycin-resistant (MICs, > 512 mg/liter) Enterococcus faecalis and Enterococcus faecium isolated from separate patients in a renal unit in a London hospital. When digested with BamHI, 12 of 13 vancomycin-resistant E. faecalis isolates exhibited a common restriction fragment length polymorphism pattern of rRNA genes (ribotype). A vanA probe hybridized with chromosomal DNA in these 12 isolates. The other isolate of vancomycin-resistant E. faecalis had a different ribotype and the vanA gene was located on plasmid DNA. These data suggest that cross-infection with a single strain of vancomycin-resistant E. faecalis occurred in most instances. In contrast, 23 vancomycin-resistant E. faecium isolates showed greater heterogeneity, comprising 8 ribotypes, suggesting that multiple strains were present in the unit. Twenty-one of these 23 isolates harbored a 24-MDa plasmid which hybridized with the vanA probe, implying that interstrain dissemination of a vancomycin resistance plasmid may have occurred among E. faecium isolates in the renal unit. PMID:8096216

  10. DNA methylation alterations in response to pesticide exposure in vitro

    PubMed Central

    Zhang, Xiao; Wallace, Andrew D.; Du, Pan; Kibbe, Warren A.; Jafari, Nadereh; Xie, Hehuang; Lin, Simon; Baccarelli, Andrea; Soares, Marcelo Bento; Hou, Lifang

    2013-01-01

    Although pesticides are subject to extensive carcinogenicity testing before regulatory approval, pesticide exposure has repeatedly been associated with various cancers. This suggests that pesticides may cause cancer via non-mutagenicity mechanisms. The present study provides evidence to support the hypothesis that pesticide-induced cancer may be mediated in part by epigenetic mechanisms. We examined whether exposure to 7 commonly used pesticides (i.e., fonofos, parathion, terbufos, chlorpyrifos, diazinon, malathion, and phorate) induces DNA methylation alterations in vitro. We conducted genome-wide DNA methylation analyses on DNA samples obtained from the human hematopoietic K562 cell line exposed to ethanol (control) and several OPs using the Illumina Infinium HumanMethylation27 BeadChip. Bayesian-adjusted t-tests were used to identify differentially methylated gene promoter CpG sites. In this report, we present our results on three pesticides (fonofos, parathion, and terbufos) that clustered together based on principal component analysis and hierarchical clustering. These three pesticides induced similar methylation changes in the promoter regions of 712 genes, while also exhibiting their own OP-specific methylation alterations. Functional analysis of methylation changes specific to each OP, or common to all three OPs, revealed that differential methylation was associated with numerous genes that are involved in carcinogenesis-related processes. Our results provide experimental evidence that pesticides may modify gene promoter DNA methylation levels, suggesting that epigenetic mechanisms may contribute to pesticide-induced carcinogenesis. Further studies in other cell types and human samples are required, as well as determining the impact of these methylation changes on gene expression. PMID:22847954

  11. A Systems Biology Analysis Unfolds the Molecular Pathways and Networks of Two Proteobacteria in Spaceflight and Simulated Microgravity Conditions.

    PubMed

    Roy, Raktim; Shilpa, P Phani; Bagh, Sangram

    2016-09-01

    Bacteria are important organisms for space missions due to their increased pathogenesis in microgravity that poses risks to the health of astronauts and for projected synthetic biology applications at the space station. We understand little about the effect, at the molecular systems level, of microgravity on bacteria, despite their significant incidence. In this study, we proposed a systems biology pipeline and performed an analysis on published gene expression data sets from multiple seminal studies on Pseudomonas aeruginosa and Salmonella enterica serovar Typhimurium under spaceflight and simulated microgravity conditions. By applying gene set enrichment analysis on the global gene expression data, we directly identified a large number of new, statistically significant cellular and metabolic pathways involved in response to microgravity. Alteration of metabolic pathways in microgravity has rarely been reported before, whereas in this analysis metabolic pathways are prevalent. Several of those pathways were found to be common across studies and species, indicating a common cellular response in microgravity. We clustered genes based on their expression patterns using consensus non-negative matrix factorization. The genes from different mathematically stable clusters showed protein-protein association networks with distinct biological functions, suggesting the plausible functional or regulatory network motifs in response to microgravity. The newly identified pathways and networks showed connection with increased survival of pathogens within macrophages, virulence, and antibiotic resistance in microgravity. Our work establishes a systems biology pipeline and provides an integrated insight into the effect of microgravity at the molecular systems level. Systems biology-Microgravity-Pathways and networks-Bacteria. Astrobiology 16, 677-689.

  12. Type 2 diabetes mellitus: distribution of genetic markers in Kazakh population.

    PubMed

    Sikhayeva, Nurgul; Talzhanov, Yerkebulan; Iskakova, Aisha; Dzharmukhanov, Jarkyn; Nugmanova, Raushan; Zholdybaeva, Elena; Ramanculov, Erlan

    2018-01-01

    Ethnic differences exist in the frequencies of genetic variations that contribute to the risk of common disease. This study aimed to analyse the distribution of several genes, previously associated with susceptibility to type 2 diabetes and obesity-related phenotypes, in a Kazakh population. A total of 966 individuals belonging to the Kazakh ethnicity were recruited from an outpatient clinic. We genotyped 41 common single nucleotide polymorphisms (SNPs) previously associated with type 2 diabetes in other ethnic groups and 31 of these were in Hardy-Weinberg equilibrium. The obtained allele frequencies were further compared to publicly available data from other ethnic populations. Allele frequencies for other (compared) populations were pooled from the haplotype map (HapMap) database. Principal component analysis (PCA), cluster analysis, and multidimensional scaling (MDS) were used for the analysis of genetic relationship between the populations. Comparative analysis of allele frequencies of the studied SNPs showed significant differentiation among the studied populations. The Kazakh population was grouped with Asian populations according to the cluster analysis and with the Caucasian populations according to PCA. According to MDS, results of the current study show that the Kazakh population holds an intermediate position between Caucasian and Asian populations. A high percentage of population differentiation was observed between Kazakh and world populations. The Kazakh population was clustered with Caucasian populations, and this result may indicate a significant Caucasian component in the Kazakh gene pool.
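
    The abstract reports that 31 of the 41 genotyped SNPs were in Hardy-Weinberg equilibrium but does not say which test was used. As a hedged illustration only, the sketch below applies the textbook chi-square goodness-of-fit test (1 degree of freedom) to genotype counts of a biallelic SNP. The function names and the genotype counts are hypothetical and are not taken from the study.

```python
# Critical value of the chi-square distribution with 1 degree of freedom
# at the 0.05 significance level.
CHI2_CRIT_1DF_05 = 3.841


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with the
    counts expected under Hardy-Weinberg proportions (p^2, 2pq, q^2)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def in_hwe(n_aa, n_ab, n_bb):
    """True if the test does not reject equilibrium at the 0.05 level."""
    return hwe_chi_square(n_aa, n_ab, n_bb) < CHI2_CRIT_1DF_05


# Hypothetical counts: one SNP matching HWE proportions, one lacking heterozygotes.
print(hwe_chi_square(25, 50, 25), in_hwe(25, 50, 25))
print(hwe_chi_square(50, 0, 50), in_hwe(50, 0, 50))
```

    For SNPs with small expected counts an exact test is often preferred over the chi-square approximation; that choice is outside what the abstract states.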

  13. A distinct class of homeodomain proteins is encoded by two sequentially expressed Drosophila genes from the 93D/E cluster.

    PubMed Central

    Jagla, K; Stanceva, I; Dretzen, G; Bellard, F; Bellard, M

    1994-01-01

    Homeodomains appear to be one of the most frequently employed DNA-binding domains in a superfamily of transacting factors. It is likely that during evolution several sub-types of homeodomain have evolved from a common ancestral domain, resulting in distinct but closely related DNA-binding preferences. Here we describe the conservation of a distinct type of homeodomain encoded by the Drosophila lady-bird-late (lbl) gene, previously named nkch4 (1). Using degenerate PCR primers corresponding to the most divergent regions of the first and third helix of the Lbl homeodomain we have amplified, from genomic DNA of the fly, a lady-bird-like homeobox fragment. The Drosophila PCR products contained both the lbl (1) and a highly related homeobox sequence, which we named lady-bird-early (lbe). This new Drosophila gene resides directly upstream to lbl and together with tinman/NK4 (2, 3, 4, 5), bagpipe/NK3 (2, 4), S59/NK1 (4, 6) and 93Bal (7) compose the 93D/E homeobox gene cluster. lbe and lbl are transcribed from the same strand and in a temporal order corresponding to their 5'-3' chromosomal location. Transcripts of both genes are found in the epiderm of Drosophila embryos, in cells known to express a segment polarity gene wingless (8), and their spatial and temporal colinearity of expression strongly suggests that they cooperate during segmentation. The amino-acid composition of both Lady-bird homeodomains differs from that of Antp-type at several positions involved in DNA recognition. These substitutions appear to modify DNA-binding preferences since Lbl homeodomain is unable to recognize the most common homeodomain binding TAAT motif in gel retardation experiments. PMID:7909370

  14. Molecular Subtypes of Glioblastoma Are Relevant to Lower Grade Glioma

    PubMed Central

    Sloan, Andrew E.; Chen, Yanwen; Brat, Daniel J.; O’Neill, Brian Patrick; de Groot, John; Yust-Katz, Shlomit; Yung, Wai-Kwan Alfred; Cohen, Mark L.; Aldape, Kenneth D.; Rosenfeld, Steven; Verhaak, Roeland G. W.; Barnholtz-Sloan, Jill S.

    2014-01-01

    Background Gliomas are the most common primary malignant brain tumors in adults with great heterogeneity in histopathology and clinical course. The intent was to evaluate the relevance of known glioblastoma (GBM) expression and methylation based subtypes to grade II and III gliomas (i.e., lower grade gliomas). Methods Gene expression array, single nucleotide polymorphism (SNP) array and clinical data were obtained for 228 GBMs and 176 grade II/III gliomas (GII/III) from the publicly available Rembrandt dataset. Two additional datasets with IDH1 mutation status were utilized as validation datasets (one publicly available dataset and one newly generated dataset from MD Anderson). Unsupervised clustering was performed and compared to gene expression subtypes assigned using the Verhaak et al 840-gene classifier. The glioma-CpG Island Methylator Phenotype (G-CIMP) was assigned using prediction models by Fine et al. Results Unsupervised clustering by gene expression aligned with the Verhaak 840-gene subtype group assignments. GII/IIIs were preferentially assigned to the proneural subtype with IDH1 mutation and G-CIMP. GBMs were evenly distributed among the four subtypes. Proneural, IDH1 mutant, G-CIMP GII/IIIs had significantly better survival than other molecular subtypes. Only 6% of GBMs were proneural and had either IDH1 mutation or G-CIMP but these tumors had significantly better survival than other GBMs. Copy number changes in chromosomes 1p and 19q were associated with GII/IIIs, while these changes in CDKN2A, PTEN and EGFR were more commonly associated with GBMs. Conclusions GBM gene-expression and methylation based subtypes are relevant for GII/IIIs and associate with overall survival differences. A better understanding of the association between these subtypes and GII/IIIs could further knowledge regarding prognosis and mechanisms of glioma progression. PMID:24614622

  15. Role and Regulation of the Flp/Tad Pilus in the Virulence of Pectobacterium atrosepticum SCRI1043 and Pectobacterium wasabiae SCC3193

    PubMed Central

    Nykyri, Johanna; Mattinen, Laura; Niemi, Outi; Adhikari, Satish; Kõiv, Viia; Somervuo, Panu; Fang, Xin; Auvinen, Petri; Mäe, Andres; Palva, E. Tapio; Pirhonen, Minna

    2013-01-01

    In this study, we characterized a putative Flp/Tad pilus-encoding gene cluster, and we examined its regulation at the transcriptional level and its role in the virulence of potato pathogenic enterobacteria of the genus Pectobacterium. The Flp/Tad pilus-encoding gene clusters in Pectobacterium atrosepticum, Pectobacterium wasabiae and Pectobacterium aroidearum were compared to previously characterized flp/tad gene clusters, including that of the well-studied Flp/Tad pilus model organism Aggregatibacter actinomycetemcomitans, in which this pilus is a major virulence determinant. Comparative analyses revealed substantial protein sequence similarity and open reading frame synteny between the previously characterized flp/tad gene clusters and the cluster in Pectobacterium, suggesting that the predicted flp/tad gene cluster in Pectobacterium encodes a Flp/Tad pilus-like structure. We detected genes for a novel two-component system adjacent to the flp/tad gene cluster in Pectobacterium, and mutant analysis demonstrated that this system has a positive effect on the transcription of selected Flp/Tad pilus biogenesis genes, suggesting that this response regulator regulates the flp/tad gene cluster. Mutagenesis of either the predicted regulator gene or selected Flp/Tad pilus biogenesis genes had a significant impact on the maceration ability of the bacterial strains in potato tubers, indicating that the Flp/Tad pilus-encoding gene cluster represents a novel virulence determinant in Pectobacterium. Soft-rot enterobacteria in the genera Pectobacterium and Dickeya are of great agricultural importance, and an investigation of the virulence of these pathogens could facilitate improvements in agricultural practices, thus benefiting farmers, the potato industry and consumers. PMID:24040039

  16. Role and regulation of the Flp/Tad pilus in the virulence of Pectobacterium atrosepticum SCRI1043 and Pectobacterium wasabiae SCC3193.

    PubMed

    Nykyri, Johanna; Mattinen, Laura; Niemi, Outi; Adhikari, Satish; Kõiv, Viia; Somervuo, Panu; Fang, Xin; Auvinen, Petri; Mäe, Andres; Palva, E Tapio; Pirhonen, Minna

    2013-01-01

    In this study, we characterized a putative Flp/Tad pilus-encoding gene cluster, and we examined its regulation at the transcriptional level and its role in the virulence of potato pathogenic enterobacteria of the genus Pectobacterium. The Flp/Tad pilus-encoding gene clusters in Pectobacterium atrosepticum, Pectobacterium wasabiae and Pectobacterium aroidearum were compared to previously characterized flp/tad gene clusters, including that of the well-studied Flp/Tad pilus model organism Aggregatibacter actinomycetemcomitans, in which this pilus is a major virulence determinant. Comparative analyses revealed substantial protein sequence similarity and open reading frame synteny between the previously characterized flp/tad gene clusters and the cluster in Pectobacterium, suggesting that the predicted flp/tad gene cluster in Pectobacterium encodes a Flp/Tad pilus-like structure. We detected genes for a novel two-component system adjacent to the flp/tad gene cluster in Pectobacterium, and mutant analysis demonstrated that this system has a positive effect on the transcription of selected Flp/Tad pilus biogenesis genes, suggesting that this response regulator regulates the flp/tad gene cluster. Mutagenesis of either the predicted regulator gene or selected Flp/Tad pilus biogenesis genes had a significant impact on the maceration ability of the bacterial strains in potato tubers, indicating that the Flp/Tad pilus-encoding gene cluster represents a novel virulence determinant in Pectobacterium. Soft-rot enterobacteria in the genera Pectobacterium and Dickeya are of great agricultural importance, and an investigation of the virulence of these pathogens could facilitate improvements in agricultural practices, thus benefiting farmers, the potato industry and consumers.

  17. The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

    PubMed Central

    Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

    2013-01-01

    Background Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. Methodology/Principal Findings We combined 454 sequencing with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on Cycas taitungensis and 24 angiosperm mt genomes. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520

  18. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    PubMed

    Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

    2013-01-01

    Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. We combined 454 sequencing with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on Cycas taitungensis and 24 angiosperm mt genomes. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

  19. An effective fuzzy kernel clustering analysis approach for gene expression data.

    PubMed

    Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao

    2015-01-01

    Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering method to microarray gene expression data is the choice of parameters with cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies desired cluster number and obtains more steady results for gene expression data. First of all, to optimize characteristic differences and estimate optimal cluster number, Gaussian kernel function is introduced to improve spectrum analysis method (SAM). By combining subtractive clustering with max-min distance mean, maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of improved SAM (ISAM) and MDM are given respectively, whose superiority and stability are illustrated through performing experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and UCI database show that the proposed algorithms are feasible for cluster analysis, and the clustering accuracy is higher than the other related clustering algorithms.
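
    For readers unfamiliar with fuzzy clustering, the sketch below shows the membership update of standard fuzzy c-means, in which each point receives a degree of membership in every cluster rather than a single label. This is the classical algorithm, not the FKCA, ISAM or MDM procedures proposed in the paper; the function name, the fuzzifier value m = 2 and the toy points are assumptions chosen for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update.

    Returns rows u[i][j], the membership of point i in cluster j, using
    u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)),
    where d_ij is the Euclidean distance from point i to center j
    and m > 1 is the fuzzifier.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with one or more centers: share full
            # membership among those centers and give zero to the rest.
            hits = [d == 0.0 for d in dists]
            share = 1.0 / sum(hits)
            memberships.append([share if h else 0.0 for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        )
    return memberships


# Hypothetical one-dimensional example with two fixed cluster centers.
centers = [(0.0,), (2.0,)]
points = [(0.0,), (0.5,), (1.0,), (2.0,)]
for point, row in zip(points, fcm_memberships(points, centers)):
    print(point, [round(u, 3) for u in row])
```

    A full fuzzy c-means run alternates this update with recomputing each center as a membership-weighted mean; choosing the number of clusters and the initial centers is the problem the abstract says the proposed method addresses.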

  20. Microarray characterization of gene expression changes in blood during acute ethanol exposure

    PubMed Central

    2013-01-01

    Background As part of the civil aviation safety program to define the adverse effects of ethanol on flying performance, we performed a DNA microarray analysis of human whole blood samples from a five-time point study of subjects administered ethanol orally, followed by breathalyzer analysis, to monitor blood alcohol concentration (BAC) to discover significant gene expression changes in response to the ethanol exposure. Methods Subjects were administered either orange juice or orange juice with ethanol. Blood samples were taken based on BAC and total RNA was isolated from PaxGene™ blood tubes. The amplified cDNA was used in microarray and quantitative real-time polymerase chain reaction (RT-qPCR) analyses to evaluate differential gene expression. Microarray data was analyzed in a pipeline fashion to summarize and normalize and the results evaluated for relative expression across time points with multiple methods. Candidate genes showing distinctive expression patterns in response to ethanol were clustered by pattern and further analyzed for related function, pathway membership and common transcription factor binding within and across clusters. RT-qPCR was used with representative genes to confirm relative transcript levels across time to those detected in microarrays. Results Microarray analysis of samples representing 0%, 0.04%, 0.08%, return to 0.04%, and 0.02% wt/vol BAC showed that changes in gene expression could be detected across the time course. The expression changes were verified by qRT-PCR. The candidate genes of interest (GOI) identified from the microarray analysis and clustered by expression pattern across the five BAC points showed seven coordinately expressed groups. Analysis showed function-based networks, shared transcription factor binding sites and signaling pathways for members of the clusters. These include hematological functions, innate immunity and inflammation functions, metabolic functions expected of ethanol metabolism, and pancreatic and hepatic function. Five of the seven clusters showed links to the p38 MAPK pathway. Conclusions The results of this study provide a first look at changing gene expression patterns in human blood during an acute rise in blood ethanol concentration and its depletion because of metabolism and excretion, and demonstrate that it is possible to detect changes in gene expression using total RNA isolated from whole blood. The analysis approach for this study serves as a workflow to investigate the biology linked to expression changes across a time course and from these changes, to identify target genes that could serve as biomarkers linked to pilot performance. PMID:23883607

  1. Transcriptome reveals the overexpression of a kallikrein gene cluster (KLK1/3/7/8/12) in the Tibetans with high altitude-associated polycythemia.

    PubMed

    Li, Kang; Gesang, Luobu; Dan, Zeng; Gusang, Lamu

    2017-02-01

    High altitude-associated polycythemia (HAPC) is a very common disease. However, the disease is still unmanageable and the related molecular mechanisms remain largely unclear. In the present study, we aimed to explore the molecular mechanisms responsible for the development of HAPC using transcriptome analysis. Transcriptome analysis was conducted in 3 pairs of gastric mucosa tissues from patients with HAPC and healthy residents at a similar altitude. Endoscopy and histopathological analyses were used to examine the injury to gastric tissues. Molecular remodeling was performed for the interaction between different KLK members and cholesterol. HAPC was found to lead to morphological changes and pathological damage to the gastric mucosa of patients. A total of 10,304 differentially expressed genes (DEGs) were identified. Among these genes, 4,941 DEGs were upregulated, while 5,363 DEGs were downregulated in the patients with HAPC (fold change ≥2, P<0.01 and FDR <0.01). In particular, the kallikrein gene cluster (KLK1/3/7/8/12) was upregulated >17-fold. All the members had high-score binding cholesterol, particularly for the polymers of KLK7. The kallikrein gene cluster (KLK1/3/7/8/12) is on chromosome 19q13.3-13.4. The elevated levels of KLK1, KLK3, KLK7, KLK8 and KLK12 may be closely associated with the hypertension, inflammation, obesity and other gastric injuries associated with polycythemia. The interaction of KLKs and cholesterol may play an important role in the development of hypertension. The findings of the present study revealed that HAPC induces gastric injury by upregulating the kallikrein gene cluster (KLK1/3/7/8/12), which can bind cholesterol and result in kallikrein hypertension. These findings provide some basic information for understanding the molecular mechanisms responsible for HAPC and HAPC-related diseases.
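
    The thresholds quoted in the abstract (fold change ≥2, P<0.01 and FDR <0.01) can be read as a three-part filter on each gene. The sketch below is one plausible reading, not the authors' pipeline: it assumes the FDR values are Benjamini-Hochberg adjusted P values, which the abstract does not state, and it assumes fold change is a patient/control ratio so that values of 0.5 or less count as downregulated. The gene names and numbers are invented toy data.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(records, min_fold=2.0, max_p=0.01, max_fdr=0.01):
    """records: list of (gene, fold_change_ratio, p_value).
    Returns {gene: 'up' | 'down'} for genes passing all three thresholds."""
    qvals = bh_adjust([p for _, _, p in records])
    calls = {}
    for (gene, fold, p), q in zip(records, qvals):
        if p >= max_p or q >= max_fdr:
            continue
        if fold >= min_fold:
            calls[gene] = "up"
        elif fold <= 1.0 / min_fold:
            calls[gene] = "down"
    return calls


# Hypothetical toy records: (gene, patient/control ratio, P value).
toy = [("g1", 17.0, 1e-6), ("g2", 0.4, 1e-5), ("g3", 1.5, 1e-6), ("g4", 3.0, 0.2)]
print(call_degs(toy))
```

    With only 3 sample pairs, as in the study, P values and FDR estimates are sensitive to the statistical model used, so the counts produced by such a filter depend heavily on choices the abstract does not describe.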

  2. Widespread Enhancer Activity from Core Promoters.

    PubMed

    Medina-Rivera, Alejandra; Santiago-Algarra, David; Puthier, Denis; Spicuglia, Salvatore

    2018-06-01

    Gene expression in higher eukaryotes is precisely regulated in time and space through the interplay between promoters and gene-distal regulatory regions, known as enhancers. The original definition of enhancers implies the ability to activate gene expression remotely, while promoters entail the capability to locally induce gene expression. Despite the conventional distinction between them, promoters and enhancers share many genomic and epigenomic features. One intriguing finding in the gene regulation field comes from the observation that many core promoter regions display enhancer activity. Recent high-throughput reporter assays along with clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-related approaches have indicated that this phenomenon is common and might have a strong impact on our global understanding of genome organisation and gene expression regulation.

  3. Regulatory Feedback Loop of Two phz Gene Clusters through 5′-Untranslated Regions in Pseudomonas sp. M18

    PubMed Central

    Li, Yaqian; Du, Xilin; Lu, Zhi John; Wu, Daqiang; Zhao, Yilei; Ren, Bin; Huang, Jiaofang; Huang, Xianqing; Xu, Yuhong; Xu, Yuquan

    2011-01-01

    Background Phenazines are important compounds produced by pseudomonads and other bacteria. Two phz gene clusters called phzA1-G1 and phzA2-G2, respectively, were found in the genome of Pseudomonas sp. M18, an effective biocontrol agent, which is highly homologous to the opportunistic human pathogen P. aeruginosa PAO1, however little is known about the correlation between the expressions of two phz gene clusters. Methodology/Principal Findings Two chromosomal insertion inactivated mutants for the two gene clusters were constructed respectively and the correlation between the expressions of two phz gene clusters was investigated in strain M18. Phenazine-1-carboxylic acid (PCA) molecules produced from phzA2-G2 gene cluster are able to auto-regulate expression itself and activate the expression of phzA1-G1 gene cluster in a circulated amplification pattern. However, the post-transcriptional expression of phzA1-G1 transcript was blocked principally through 5′-untranslated region (UTR). In contrast, the phzA2-G2 gene cluster was transcribed to a lesser extent and translated efficiently and was negatively regulated by the GacA signal transduction pathway, mainly at a post-transcriptional level. Conclusions/Significance A single molecule, PCA, produced in different quantities by the two phz gene clusters acted as the functional mediator and the two phz gene clusters developed a specific regulatory mechanism which acts through 5′-UTR to transfer a single, but complex bacterial signaling event in Pseudomonas sp. strain M18. PMID:21559370

  4. Identification of an unusual type II thioesterase in the dithiolopyrrolone antibiotics biosynthetic pathway

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhai, Ying; Bai, Silei; Liu, Jingjing

    Dithiolopyrrolone group antibiotics characterized by an electronically unique dithiolopyrrolone heterobicyclic core are known for their antibacterial, antifungal, insecticidal and antitumor activities. Recently the biosynthetic gene clusters for two dithiolopyrrolone compounds, holomycin and thiomarinol, have been identified respectively in different bacterial species. Here, we report a novel dithiolopyrrolone biosynthetic gene cluster (aut) isolated from Streptomyces thioluteus DSM 40027 which produces two pyrrothine derivatives, aureothricin and thiolutin. By comparison with other characterized dithiolopyrrolone clusters, eight genes in the aut cluster were verified to be responsible for the assembly of dithiolopyrrolone core. The aut cluster was further confirmed by heterologous expression and in-frame gene deletion experiments. Intriguingly, we found that the heterogenetic thioesterase HlmK derived from the holomycin (hlm) gene cluster in Streptomyces clavuligerus significantly improved heterologous biosynthesis of dithiolopyrrolones in Streptomyces albus through coexpression with the aut cluster. In the previous studies, HlmK was considered invalid because it has a Ser to Gly point mutation within the canonical Ser-His-Asp catalytic triad of thioesterases. However, gene inactivation and complementation experiments in our study unequivocally demonstrated that HlmK is an active distinctive type II thioesterase that plays a beneficial role in dithiolopyrrolone biosynthesis. - Highlights: • Cloning of the aureothricin biosynthetic gene cluster from Streptomyces thioluteus DSM 40027. • Identification of the aureothricin gene cluster by heterologous expression and in-frame gene deletion. • The heterogenetic thioesterase HlmK significantly improved dithiolopyrrolones production of the aureothricin gene cluster. • Identification of HlmK as an unusual type II thioesterase.

  5. Differential regulation of ParaHox genes by retinoic acid in the invertebrate chordate amphioxus (Branchiostoma floridae).

    PubMed

    Osborne, Peter W; Benoit, Gérard; Laudet, Vincent; Schubert, Michael; Ferrier, David E K

    2009-03-01

    The ParaHox cluster is the evolutionary sister to the Hox cluster. Like the Hox cluster, the ParaHox cluster displays spatial and temporal regulation of the component genes along the anterior/posterior axis in a manner that correlates with the gene positions within the cluster (a feature called collinearity). The ParaHox cluster is however a simpler system to study because it is composed of only three genes. We provide a detailed analysis of the amphioxus ParaHox cluster and, for the first time in a single species, examine the regulation of the cluster in response to a single developmental signalling molecule, retinoic acid (RA). Embryos treated with either RA or RA antagonist display altered ParaHox gene expression: AmphiGsx expression shifts in the neural tube, and the endodermal boundary between AmphiXlox and AmphiCdx shifts its anterior/posterior position. We identified several putative retinoic acid response elements and in vitro assays suggest some may participate in RA regulation of the ParaHox genes. By comparison to vertebrate ParaHox gene regulation we explore the evolutionary implications. This work highlights how insights into the regulation and evolution of more complex vertebrate arrangements can be obtained through studies of a simpler, unduplicated amphioxus gene cluster.

  6. A Stationary Wavelet Entropy-Based Clustering Approach Accurately Predicts Gene Expression

    PubMed Central

    Nguyen, Nha; Vo, An; Choi, Inchan

    2015-01-01

    Abstract Studying epigenetic landscapes is important to understand the condition for gene regulation. Clustering is a useful approach to study epigenetic landscapes by grouping genes based on their epigenetic conditions. However, classical clustering approaches that often use a representative value of the signals in a fixed-sized window do not fully use the information written in the epigenetic landscapes. Clustering approaches to maximize the information of the epigenetic signals are necessary for better understanding gene regulatory environments. For effective clustering of multidimensional epigenetic signals, we developed a method called Dewer, which uses the entropy of stationary wavelet of epigenetic signals inside enriched regions for gene clustering. Interestingly, the gene expression levels were highly correlated with the entropy levels of epigenetic signals. Dewer separates genes better than a window-based approach in the assessment using gene expression and achieved a correlation coefficient above 0.9 without using any training procedure. Our results show that the changes of the epigenetic signals are useful to study gene regulation. PMID:25383910

  7. Degradation of Benzene by Pseudomonas veronii 1YdBTEX2 and 1YB2 Is Catalyzed by Enzymes Encoded in Distinct Catabolism Gene Clusters.

    PubMed

    de Lima-Morales, Daiana; Chaves-Moreno, Diego; Wos-Oxley, Melissa L; Jáuregui, Ruy; Vilchez-Vargas, Ramiro; Pieper, Dietmar H

    2016-01-01

    Pseudomonas veronii 1YdBTEX2, a benzene and toluene degrader, and Pseudomonas veronii 1YB2, a benzene degrader, have previously been shown to be key players in a benzene-contaminated site. These strains harbor unique catabolic pathways for the degradation of benzene comprising a gene cluster encoding an isopropylbenzene dioxygenase where genes encoding downstream enzymes were interrupted by stop codons. Extradiol dioxygenases were recruited from gene clusters comprising genes encoding a 2-hydroxymuconic semialdehyde dehydrogenase necessary for benzene degradation but typically absent from isopropylbenzene dioxygenase-encoding gene clusters. The benzene dihydrodiol dehydrogenase-encoding gene was not clustered with any other aromatic degradation genes, and the encoded protein was only distantly related to dehydrogenases of aromatic degradation pathways. The involvement of the different gene clusters in the degradation pathways was suggested by real-time quantitative reverse transcription PCR. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  8. LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

    PubMed

    Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

    2012-01-01

    Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, which are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database-LCGbase (a comprehensive database for lineage-based co-regulated genes)-hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species.
We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.

  9. Using the clustered circular layout as an informative method for visualizing protein-protein interaction networks.

    PubMed

    Fung, David C Y; Wilkins, Marc R; Hart, David; Hong, Seok-Hee

    2010-07-01

    The force-directed layout is commonly used in computer-generated visualizations of protein-protein interaction networks. While it is good for providing a visual outline of the protein complexes and their interactions, it has two limitations when used as a visual analysis method. The first is poor reproducibility. Repeated running of the algorithm does not necessarily generate the same layout, therefore, demanding cognitive readaptation on the investigator's part. The second limitation is that it does not explicitly display complementary biological information, e.g. Gene Ontology, other than the protein names or gene symbols. Here, we present an alternative layout called the clustered circular layout. Using the human DNA replication protein-protein interaction network as a case study, we compared the two network layouts for their merits and limitations in supporting visual analysis.

  10. Iterative local Gaussian clustering for expressed genes identification linked to malignancy of human colorectal carcinoma

    PubMed Central

    Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri

    2007-01-01

    Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC), was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, the ILGC finds three clusters, two large and one small gene clusters, similar to their results which used Gaussian mixture clustering. The correlation of each cluster of genes and clinical properties of malignancy of human colorectal cancer was analysed for the existence of tumor or normal, the existence of distant metastasis and the existence of lymph node metastasis. PMID:18305825

  11. Iterative local Gaussian clustering for expressed genes identification linked to malignancy of human colorectal carcinoma.

    PubMed

    Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri

    2007-12-30

    Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC), was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, the ILGC finds three clusters, two large and one small gene clusters, similar to their results which used Gaussian mixture clustering. The correlation of each cluster of genes and clinical properties of malignancy of human colorectal cancer was analysed for the existence of tumor or normal, the existence of distant metastasis and the existence of lymph node metastasis.

  12. A cluster merging method for time series microarray with production values.

    PubMed

    Chira, Camelia; Sedano, Javier; Camara, Monica; Prieto, Carlos; Villar, Jose R; Corchado, Emilio

    2014-09-01

    A challenging task in time-course microarray data analysis is to cluster genes meaningfully combining the information provided by multiple replicates covering the same key time points. This paper proposes a novel cluster merging method to accomplish this goal obtaining groups with highly correlated genes. The main idea behind the proposed method is to generate a clustering starting from groups created based on individual temporal series (representing different biological replicates measured in the same time points) and merging them by taking into account the frequency by which two genes are assembled together in each clustering. The gene groups at the level of individual time series are generated using several shape-based clustering methods. This study is focused on a real-world time series microarray task with the aim to find co-expressed genes related to the production and growth of a certain bacteria. The shape-based clustering methods used at the level of individual time series rely on identifying similar gene expression patterns over time which, in some models, are further matched to the pattern of production/growth. The proposed cluster merging method is able to produce meaningful gene groups which can be naturally ranked by the level of agreement on the clustering among individual time series. The list of clusters and genes is further sorted based on the information correlation coefficient and new problem-specific relevant measures. Computational experiments and results of the cluster merging method are analyzed from a biological perspective and further compared with the clustering generated based on the mean value of time series and the same shape-based algorithm.
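
    The merging idea described in this abstract (counting how often two genes are grouped together across the per-replicate clusterings) can be illustrated with a small co-association sketch. This is not the authors' implementation: the function names, the `threshold` parameter and the union-find merging step are assumptions introduced here for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def co_association(clusterings):
    """Fraction of clusterings in which each gene pair shares a cluster.

    clusterings: list of dicts mapping gene -> cluster label, one dict per
    replicate time series (hypothetical input format).
    """
    genes = sorted(set.intersection(*(set(c) for c in clusterings)))
    counts = defaultdict(int)
    for c in clusterings:
        for a, b in combinations(genes, 2):
            if c[a] == c[b]:
                counts[(a, b)] += 1
    freq = {pair: n / len(clusterings) for pair, n in counts.items()}
    return freq, genes


def merge_clusters(clusterings, threshold=1.0):
    """Group genes whose co-association frequency reaches the threshold."""
    freq, genes = co_association(clusterings)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), f in freq.items():
        if f >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for g in genes:
        groups[find(g)].append(g)
    return sorted(sorted(members) for members in groups.values())


# Three replicate clusterings of the same three genes (made-up labels).
replicates = [
    {"g1": 0, "g2": 0, "g3": 1},
    {"g1": 1, "g2": 1, "g3": 0},
    {"g1": 0, "g2": 0, "g3": 0},
]
strict = merge_clusters(replicates, threshold=1.0)
loose = merge_clusters(replicates, threshold=0.3)
```

    With `threshold=1.0` only genes that agree in every replicate are merged; lowering the threshold trades agreement for larger groups, which mirrors the ranking by level of agreement mentioned in the abstract.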

  13. Common Genetic Variants in ARNTL and NPAS2 and at Chromosome 12p13 are Associated with Objectively Measured Sleep Traits in the Elderly

    PubMed Central

    Evans, Daniel S.; Parimi, Neeta; Nievergelt, Caroline M.; Blackwell, Terri; Redline, Susan; Ancoli-Israel, Sonia; Orwoll, Eric S.; Cummings, Steven R.; Stone, Katie L.; Tranah, Gregory J.

    2013-01-01

    Study Objectives: To determine the association between common genetic variation in the clock gene pathway and objectively measured actigraphic sleep and activity rhythm traits. Design: Genetic association study in two population-based cohorts of elderly participants: the Study of Osteoporotic Fractures (SOF) and the Osteoporotic Fractures in Men (MrOS) study. Setting: Population-based. Participants: SOF participants (n = 1,407, 100% female, mean age 84 years) and MrOS participants (n = 2,527, 100% male, mean age 77 years) with actigraphy and genotype data. Interventions: N/A. Measurements and Results: Common genetic variation in 30 candidate genes was captured using 529 single nucleotide polymorphisms (SNPs). Sleep and activity rhythm traits were objectively measured using wrist actigraphy. In a region of high linkage disequilibrium on chromosome 12p13 containing the candidate gene GNB3, the rs1047776 A allele and the rs2238114 C allele were significantly associated with higher wake after sleep onset (meta-analysis: rs1047776 PADD = 2 × 10⁻⁵, rs2238114 PADD = 5 × 10⁻⁵) and lower LRRC23 gene expression (rs1047776: ρ = -0.22, P = 0.02; rs2238114: ρ = -0.50, P = 5 × 10⁻⁸). In MrOS participants, SNPs in ARNTL and NPAS2, genes coding for binding partners, were associated with later sleep and wake onset time (sleep onset time: ARNTL rs3816358 P2DF = 1 × 10⁻⁴, NPAS2 rs3768984 P2DF = 5 × 10⁻⁵; wake onset time: rs3816358 P2DF = 3 × 10⁻³, rs3768984 P2DF = 2 × 10⁻⁴) and the SNP interaction was significant (sleep onset time PINT = 0.003, wake onset time PINT = 0.001). A SNP association in the CLOCK gene replicated in the MrOS cohort, and rs3768984 was associated with sleep duration in a previously reported study. Cluster analysis identified four clusters of genetic associations. Conclusions: These findings support a role for common genetic variation in clock genes in the regulation of inter-related sleep traits in the elderly.
Citation: Evans DS; Parimi N; Nievergelt CM; Blackwell T; Redline S; Ancoli-Israel S; Orwoll ES; Cummings SR; Stone KL; Tranah GJ. Common genetic variants in ARNTL and NPAS2 and at chromosome 12p13 are associated with objectively measured sleep traits in the elderly. SLEEP 2013;36(3):431-446. PMID:23449886

  14. Functional and evolutionary correlates of gene constellations in the Drosophila melanogaster genome that deviate from the stereotypical gene architecture

    PubMed Central

    2010-01-01

    Background The biological dimensions of genes are manifold. These include genomic properties, (e.g., X/autosomal linkage, recombination) and functional properties (e.g., expression level, tissue specificity). Multiple properties, each generally of subtle influence individually, may affect the evolution of genes or merely be (auto-)correlates. Results of multidimensional analyses may reveal the relative importance of these properties on the evolution of genes, and therefore help evaluate whether these properties should be considered during analyses. While numerous properties are now considered during studies, most work still assumes the stereotypical solitary gene as commonly depicted in textbooks. Here, we investigate the Drosophila melanogaster genome to determine whether deviations from the stereotypical gene architecture correlate with other properties of genes. Results Deviations from the stereotypical gene architecture were classified as the following gene constellations: Overlapping genes were defined as those that overlap in the 5-prime, exonic, or intronic regions. Chromatin co-clustering genes were defined as genes that co-clustered within 20 kb of transcriptional territories. If this scheme is applied the stereotypical gene emerges as a rare occurrence (7.5%), slightly varied schemes yielded between ~1%-50%. Moreover, when following our scheme, paired-overlapping genes and chromatin co-clustering genes accounted for 50.1 and 42.4% of the genes analyzed, respectively. Gene constellation was a correlate of a number of functional and evolutionary properties of genes, but its statistical effect was ~1-2 orders of magnitude lower than the effects of recombination, chromosome linkage and protein function. 
Analysis of datasets on male reproductive proteins showed these were biased in their representation of gene constellations and evolutionary rate Ka/Ks estimates, but these biases did not overwhelm the biologically meaningful observation of high evolutionary rates of male reproductive genes. Conclusion Given the rarity of the solitary stereotypical gene, and the abundance of gene constellations that deviate from it, the presence of gene constellations, while once thought to be exceptional in large Eukaryote genomes, might have broader relevance to the understanding and study of the genome. However, according to our definition, while gene constellations can be significant correlates of functional properties of genes, they generally are weak correlates of the evolution of genes. Thus, the need for their consideration would depend on the context of studies. PMID:20497561

  15. Epigenetic Regulation of Newborns' Imprinted Genes Related to Gestational Growth: Patterning by Parental Race/Ethnicity and Maternal Socioeconomic Status

    EPA Science Inventory

    BACKGROUND: Children born to parents with lower income and education are at risk for obesity and later-life risk of common chronic diseases, and epigenetics has been hypothesised to link these associations. However, epigenetic targets are unknown. We focus on a cluster of well­ c...

  16. Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes.

    PubMed

    Liu, Ying; Navathe, Shamkant B; Pivoshenko, Alex; Dasigi, Venu G; Dingledine, Ray; Ciliax, Brian J

    2006-01-01

    One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score weighted keywords. The optimised algorithms should be useful for partitioning genes from microarray lists into functionally discrete clusters.
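
    As a rough illustration of the TFIDF weighting named in this abstract, the following minimal sketch scores keywords per gene using a textbook form of the formula (relative term frequency multiplied by the log of inverse document frequency). The exact variant used in the paper is not given in the abstract, so the formula, the function name `tfidf_weights` and the toy keyword lists are all assumptions.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Compute TF-IDF weights for each gene's keyword list.

    docs: dict mapping gene -> list of keywords (hypothetical input format).
    Returns a dict mapping gene -> {keyword: weight}.
    """
    n_docs = len(docs)
    # Document frequency: number of genes whose list contains the keyword.
    df = Counter()
    for words in docs.values():
        df.update(set(words))

    weights = {}
    for gene, words in docs.items():
        tf = Counter(words)
        total = len(words)
        weights[gene] = {
            w: (count / total) * math.log(n_docs / df[w])
            for w, count in tf.items()
        }
    return weights


# Toy keyword lists; gene names and keywords are made up.
example = {
    "geneA": ["kinase", "signal", "kinase"],
    "geneB": ["kinase", "membrane"],
    "geneC": ["transport", "membrane"],
}
weights = tfidf_weights(example)
```

    A keyword that appears for every gene gets weight zero under this form, so widely shared terms contribute nothing to the clustering, while terms concentrated in a few genes are emphasised.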

  17. Biosynthetic Investigations of Lactonamycin and Lactonamycin Z: Cloning of the Biosynthetic Gene Clusters and Discovery of an Unusual Starter Unit

    PubMed Central

    Zhang, Xiujun; Alemany, Lawrence B.; Fiedler, Hans-Peter; Goodfellow, Michael; Parry, Ronald J.

    2008-01-01

    The antibiotics lactonamycin and lactonamycin Z provide attractive leads for antibacterial drug development. Both antibiotics contain a novel aglycone core called lactonamycinone. To gain insight into lactonamycinone biosynthesis, cloning and precursor incorporation experiments were undertaken. The lactonamycin gene cluster was initially cloned from Streptomyces rishiriensis. Sequencing of ca. 61 kb of S. rishiriensis DNA revealed the presence of 57 open reading frames. These included genes coding for the biosynthesis of l-rhodinose, the sugar found in lactonamycin, and genes similar to those in the tetracenomycin biosynthetic gene cluster. Since lactonamycin production by S. rishiriensis could not be sustained, additional proof for the identity of the S. rishiriensis cluster was obtained by cloning the lactonamycin Z gene cluster from Streptomyces sanglieri. Partial sequencing of the S. sanglieri cluster revealed 15 genes that exhibited a very high degree of similarity to genes within the lactonamycin cluster, as well as an identical organization. Double-crossover disruption of one gene in the S. sanglieri cluster abolished lactonamycin Z production, and production was restored by complementation. These results confirm the identity of the genetic locus cloned from S. sanglieri and indicate that the highly similar locus in S. rishiriensis encodes lactonamycin biosynthetic genes. Precursor incorporation experiments with S. sanglieri revealed that lactonamycinone is biosynthesized in an unusual manner whereby glycine or a glycine derivative serves as a starter unit that is extended by nine acetate units. Analysis of the gene clusters and of the precursor incorporation data suggested a hypothetical scheme for lactonamycinone biosynthesis. PMID:18070976

  18. UniGene Tabulator: a full parser for the UniGene format.

    PubMed

    Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

    2006-10-15

    UniGene Tabulator 1.0 provides a solution for full parsing of UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database managing system usable in a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS) and transcript map interval (TXMAP) data, plus a summary table where each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting and analysis of UniGene data in a relational database form, usable on Macintosh (OS X 10.3.9 or later) and Windows (2000, with service pack 4, XP, with service pack 2 or later) operating systems-based computers. The current release, including both the FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/

  19. Establishment of the Inducible Tet-On System for the Activation of the Silent Trichosetin Gene Cluster in Fusarium fujikuroi

    PubMed Central

    Janevska, Slavica; Arndt, Birgit; Baumann, Leonie; Apken, Lisa Helene; Mauriz Marques, Lucas Maciel; Humpf, Hans-Ulrich; Tudzynski, Bettina

    2017-01-01

    The PKS-NRPS-derived tetramic acid equisetin and its N-desmethyl derivative trichosetin exhibit remarkable biological activities against a variety of organisms, including plants and bacteria, e.g., Staphylococcus aureus. The equisetin biosynthetic gene cluster was first described in Fusarium heterosporum, a species distantly related to the notorious rice pathogen Fusarium fujikuroi. Here we present the activation and characterization of a homologous, but silent, gene cluster in F. fujikuroi. Bioinformatic analysis revealed that this cluster does not contain the equisetin N-methyltransferase gene eqxD and consequently, trichosetin was isolated as final product. The adaption of the inducible, tetracycline-dependent Tet-on promoter system from Aspergillus niger achieved a controlled overproduction of this toxic metabolite and a functional characterization of each cluster gene in F. fujikuroi. Overexpression of one of the two cluster-specific transcription factor (TF) genes, TF22, led to an activation of the three biosynthetic cluster genes, including the PKS-NRPS key gene. In contrast, overexpression of TF23, encoding a second Zn(II)2Cys6 TF, did not activate adjacent cluster genes. Instead, TF23 was induced by the final product trichosetin and was required for expression of the transporter-encoding gene MFS-T. TF23 and MFS-T likely act in consort and contribute to detoxification of trichosetin and therefore, self-protection of the producing fungus. PMID:28379186

  20. Establishment of the Inducible Tet-On System for the Activation of the Silent Trichosetin Gene Cluster in Fusarium fujikuroi.

    PubMed

    Janevska, Slavica; Arndt, Birgit; Baumann, Leonie; Apken, Lisa Helene; Mauriz Marques, Lucas Maciel; Humpf, Hans-Ulrich; Tudzynski, Bettina

    2017-04-05

    The PKS-NRPS-derived tetramic acid equisetin and its N-desmethyl derivative trichosetin exhibit remarkable biological activities against a variety of organisms, including plants and bacteria, e.g., Staphylococcus aureus. The equisetin biosynthetic gene cluster was first described in Fusarium heterosporum, a species distantly related to the notorious rice pathogen Fusarium fujikuroi. Here we present the activation and characterization of a homologous, but silent, gene cluster in F. fujikuroi. Bioinformatic analysis revealed that this cluster does not contain the equisetin N-methyltransferase gene eqxD and consequently, trichosetin was isolated as final product. The adaption of the inducible, tetracycline-dependent Tet-on promoter system from Aspergillus niger achieved a controlled overproduction of this toxic metabolite and a functional characterization of each cluster gene in F. fujikuroi. Overexpression of one of the two cluster-specific transcription factor (TF) genes, TF22, led to an activation of the three biosynthetic cluster genes, including the PKS-NRPS key gene. In contrast, overexpression of TF23, encoding a second Zn(II)₂Cys₆ TF, did not activate adjacent cluster genes. Instead, TF23 was induced by the final product trichosetin and was required for expression of the transporter-encoding gene MFS-T. TF23 and MFS-T likely act in consort and contribute to detoxification of trichosetin and therefore, self-protection of the producing fungus.

  1. Characterisation of the paralytic shellfish toxin biosynthesis gene clusters in Anabaena circinalis AWQC131C and Aphanizomenon sp. NH-5.

    PubMed

    Mihali, Troco K; Kellmann, Ralf; Neilan, Brett A

    2009-03-30

    Saxitoxin and its analogues, collectively known as the paralytic shellfish toxins (PSTs), are neurotoxic alkaloids and are the cause of the syndrome named paralytic shellfish poisoning. PSTs are produced by a unique biosynthetic pathway, which involves reactions that are rare in microbial metabolic pathways. Nevertheless, distantly related organisms such as dinoflagellates and cyanobacteria appear to produce these toxins using the same pathway. Hypothesised explanations for such an unusual phylogenetic distribution of this shared uncommon metabolic pathway include a polyphyletic origin, an involvement of symbiotic bacteria, and horizontal gene transfer. We describe the identification, annotation and bioinformatic characterisation of the putative paralytic shellfish toxin biosynthesis clusters in an Australian isolate of Anabaena circinalis and an American isolate of Aphanizomenon sp., both members of the Nostocales. These putative PST gene clusters span approximately 28 kb and contain genes coding for the biosynthesis and export of the toxin. A putative insertion/excision site in the Australian Anabaena circinalis AWQC131C was identified, and the organization and evolution of the gene clusters are discussed. A biosynthetic pathway leading to the formation of saxitoxin and its analogues in these organisms is proposed. The PST biosynthesis gene cluster presents a mosaic structure, whereby genes have apparently transposed in segments of varying size, resulting in different gene arrangements in all three sxt clusters sequenced so far. The gene cluster organizational structure and sequence similarity seem to reflect the phylogeny of the producer organisms, indicating that the gene clusters have an ancient origin, or that their lateral transfer was also an ancient event. The knowledge we gain from the characterisation of the PST biosynthesis gene clusters, including the identity and sequence of the genes involved in the biosynthesis, may also afford the identification of these gene clusters in dinoflagellates, the cause of human mortalities and significant financial loss to the tourism and shellfish industries.

  2. Characterisation of the paralytic shellfish toxin biosynthesis gene clusters in Anabaena circinalis AWQC131C and Aphanizomenon sp. NH-5

    PubMed Central

    Mihali, Troco K; Kellmann, Ralf; Neilan, Brett A

    2009-01-01

    Background Saxitoxin and its analogues, collectively known as the paralytic shellfish toxins (PSTs), are neurotoxic alkaloids and are the cause of the syndrome named paralytic shellfish poisoning. PSTs are produced by a unique biosynthetic pathway, which involves reactions that are rare in microbial metabolic pathways. Nevertheless, distantly related organisms such as dinoflagellates and cyanobacteria appear to produce these toxins using the same pathway. Hypothesised explanations for such an unusual phylogenetic distribution of this shared uncommon metabolic pathway include a polyphyletic origin, an involvement of symbiotic bacteria, and horizontal gene transfer. Results We describe the identification, annotation and bioinformatic characterisation of the putative paralytic shellfish toxin biosynthesis clusters in an Australian isolate of Anabaena circinalis and an American isolate of Aphanizomenon sp., both members of the Nostocales. These putative PST gene clusters span approximately 28 kb and contain genes coding for the biosynthesis and export of the toxin. A putative insertion/excision site in the Australian Anabaena circinalis AWQC131C was identified, and the organization and evolution of the gene clusters are discussed. A biosynthetic pathway leading to the formation of saxitoxin and its analogues in these organisms is proposed. Conclusion The PST biosynthesis gene cluster presents a mosaic structure, whereby genes have apparently transposed in segments of varying size, resulting in different gene arrangements in all three sxt clusters sequenced so far. The gene cluster organizational structure and sequence similarity seem to reflect the phylogeny of the producer organisms, indicating that the gene clusters have an ancient origin, or that their lateral transfer was also an ancient event. The knowledge we gain from the characterisation of the PST biosynthesis gene clusters, including the identity and sequence of the genes involved in the biosynthesis, may also afford the identification of these gene clusters in dinoflagellates, the cause of human mortalities and significant financial loss to the tourism and shellfish industries. PMID:19331657

  3. VIZARD: analysis of Affymetrix Arabidopsis GeneChip data

    NASA Technical Reports Server (NTRS)

    Moseyko, Nick; Feldman, Lewis J.

    2002-01-01

    SUMMARY: The Affymetrix GeneChip Arabidopsis genome array has proved to be a very powerful tool for the analysis of gene expression in Arabidopsis thaliana, the most commonly studied plant model organism. VIZARD is a Java program created at the University of California, Berkeley, to facilitate analysis of Arabidopsis GeneChip data. It includes several integrated tools for filtering, sorting, clustering and visualization of gene expression data as well as tools for the discovery of regulatory motifs in upstream sequences. VIZARD also includes annotation and upstream sequence databases for the majority of genes represented on the Affymetrix Arabidopsis GeneChip array. AVAILABILITY: VIZARD is available free of charge for educational, research, and not-for-profit purposes, and can be downloaded at http://www.anm.f2s.com/research/vizard/ CONTACT: moseyko@uclink4.berkeley.edu.

  4. Platypus globin genes and flanking loci suggest a new insertional model for beta-globin evolution in birds and mammals

    PubMed Central

    Patel, Vidushi S; Cooper, Steven JB; Deakin, Janine E; Fulton, Bob; Graves, Tina; Warren, Wesley C; Wilson, Richard K; Graves, Jennifer AM

    2008-01-01

    Background Vertebrate alpha (α)- and beta (β)-globin gene families exemplify the way in which genomes evolve to produce functional complexity. From tandem duplication of a single globin locus, the α- and β-globin clusters expanded, and then were separated onto different chromosomes. The previous finding of a fossil β-globin gene (ω) in the marsupial α-cluster, however, suggested that duplication of the α-β cluster onto two chromosomes, followed by lineage-specific gene loss and duplication, produced paralogous α- and β-globin clusters in birds and mammals. Here we analyse genomic data from an egg-laying monotreme mammal, the platypus (Ornithorhynchus anatinus), to explore haemoglobin evolution at the stem of the mammalian radiation. Results The platypus α-globin cluster (chromosome 21) contains embryonic and adult α-globin genes, a β-like ω-globin gene, and the GBY globin gene with homology to cytoglobin, arranged as 5'-ζ-ζ'-αD-α3-α2-α1-ω-GBY-3'. The platypus β-globin cluster (chromosome 2) contains single embryonic and adult globin genes arranged as 5'-ε-β-3'. Surprisingly, all of these globin genes were expressed in some adult tissues. Comparison of flanking sequences revealed that all jawed vertebrate α-globin clusters are flanked by MPG-C16orf35 and LUC7L, whereas all bird and mammal β-globin clusters are embedded in olfactory genes. Thus, the mammalian α- and β-globin clusters are orthologous to the bird α- and β-globin clusters respectively. Conclusion We propose that α- and β-globin clusters evolved from an ancient MPG-C16orf35-α-β-GBY-LUC7L arrangement 410 million years ago. A copy of the original β (represented by ω in marsupials and monotremes) was inserted into an array of olfactory genes before the amniote radiation (>315 million years ago), then duplicated and diverged to form orthologous clusters of β-globin genes with different expression profiles in different lineages. PMID:18657265

  5. Integrated analysis of chromosome copy number variation and gene expression in cervical carcinoma

    PubMed Central

    Yan, Deng; Yi, Song; Chiu, Wang Chi; Qin, Liu Gui; Kin, Wong Hoi; Kwok Hung, Chung Tony; Linxiao, Han; Wai, Choy Kwong; Yi, Sui; Tao, Yang; Tao, Tang

    2017-01-01

    Objective This study was conducted to explore chromosomal copy number variations (CNV) and transcript expression and to examine pathways in cervical pathogenesis using genome-wide high resolution microarrays. Methods Genome-wide chromosomal CNVs were investigated in 6 cervical cancer cell lines using the Human Genome CGH Microarray Kit (4x44K). Gene expression profiles in cervical cancer cell lines, primary cervical carcinoma and normal cervical epithelium tissues were also studied using the Whole Human Genome Microarray Kit (4x44K). Results Fifty common chromosomal CNVs were identified in the cervical cancer cell lines. Correlation analysis revealed that gene up-regulation or down-regulation is significantly correlated with genomic amplification (P=0.009) or deletion (P=0.006) events. Expression profiles were identified through cluster analysis. Gene annotation analysis pinpointed that cell cycle pathways were significantly (P=1.15E-08) affected in cervical cancer. Common CNVs were associated with cervical cancer. Conclusion Chromosomal CNVs may contribute to their transcript expression in cervical cancer. PMID:29312578
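
    The correlation analysis mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which correlation measure or data layout was used, so a Pearson correlation between per-gene copy-number log2 ratios and expression log2 fold changes is assumed here, and every number below is hypothetical rather than taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# HYPOTHETICAL per-gene values (not from the study): positive copy-number
# ratios stand for amplification, negative ones for deletion.
copy_number_log2 = [0.9, 0.6, 0.1, -0.2, -0.7, -1.0]
expression_log2fc = [1.4, 0.8, 0.3, -0.1, -0.9, -1.6]

r = pearson_r(copy_number_log2, expression_log2fc)
print(f"Pearson r = {r:.3f}")
```

    A value of r close to +1 would be consistent with amplified genes tending to be up-regulated and deleted genes down-regulated; a significance test on real data would still be needed to support P-values like those reported.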

  6. Defining the role of common variation in the genomic and biological architecture of adult human height.

    PubMed

    Wood, Andrew R; Esko, Tonu; Yang, Jian; Vedantam, Sailaja; Pers, Tune H; Gustafsson, Stefan; Chu, Audrey Y; Estrada, Karol; Luan, Jian'an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L; Croteau-Chonka, Damien C; Day, Felix R; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C; Scherag, André; Vinkhuyzen, Anna A E; Westra, Harm-Jan; Winkler, Thomas W; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B; Feenstra, Bjarke; Feitosa, Mary F; Fischer, Krista; Fraser, Ross M; Goel, Anuj; Gong, Jian; Justice, Anne E; Kanoni, Stavroula; Kleber, Marcus E; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C; Mangino, Massimo; Mateo Leach, Irene; Medina-Gomez, Carolina; Nalls, Michael A; Nyholt, Dale R; Palmer, Cameron D; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W; van Setten, Jessica; Van Vliet-Ostaptchouk, Jana V; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Arnlöv, Johan; Arscott, Gillian M; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L; Böttcher, Yvonne; Boyd, Heather A; Bruinenberg, Marcel; Buckley, Brendan M; Buyske, Steven; Caspersen, Ida H; Chines, Peter S; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E Warwick; De Jong, Pim A; Deelen, Joris; Delgado, Graciela; Denny, Josh C; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex S F; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S; Grallert, Harald; Grammer, Tanja B; Gräßler, Jürgen; 
Grönberg, Henrik; de Groot, Lisette C P G M; Groves, Christopher J; Haessler, Jeffrey; Hall, Per; Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K; Hillege, Hans L; Hlatky, Mark A; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J; Illig, Thomas; Isaacs, Aaron; James, Alan L; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik K E; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L; McKenzie, Colin A; McLachlan, Stela; McLaren, Paul J; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L; Morken, Mario A; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M; Nöthen, Markus M; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W; Renstrom, Frida; Robertson, Neil R; Rose, Lynda M; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R; Schunkert, Heribert; Scott, Robert A; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V; Stirrups, Kathleen; Stott, David J; Stringham, Heather M; Sundström, Johan; Swertz, Morris A; Syvänen, Ann-Christine; Tayo, Bamidele O; Thorleifsson, Gudmar; Tyrer, Jonathan P; van Dijk, Suzanne; van Schoor, Natasja M; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor V A; Vermeulen, Sita H; Verweij, Niek; Vonk, Judith M; Waite, Lindsay L; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K; Wong, Andrew; Wright, Alan F; Zhang, Qunyuan; 
Arveiler, Dominique; Bakker, Stephan J L; Beilby, John; Bergman, Richard N; Bergmann, Sven; Biffar, Reiner; Blangero, John; Boomsma, Dorret I; Bornstein, Stefan R; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J; Campbell, Harry; Caulfield, Mark J; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S; Crawford, Dana C; Cupples, L Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G; Forrester, Terrence; Gansevoort, Ron T; Gejman, Pablo V; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W; Hall, Alistair S; Harris, Tamara B; Hattersley, Andrew T; Heath, Andrew C; Hengstenberg, Christian; Hicks, Andrew A; Hindorff, Lucia A; Hingorani, Aroon D; Hofman, Albert; Hovingh, G Kees; Humphries, Steve E; Hunt, Steven C; Hypponen, Elina; Jacobs, Kevin B; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M; Kaprio, Jaakko; Kastelein, John J P; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M; Kiemeney, Lambertus A; Kooner, Jaspal S; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela A F; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L; Montgomery, Grant W; Morris, Andrew D; Morris, Andrew P; Murray, Jeffrey C; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J; Ong, Ken K; Ouwehand, Willem H; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P; Price, Jackie F; Qi, Lu; Raitakari, Olli T; Rankinen, Tuomo; Rao, D C; Rice, Treva K; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J; Saramies, Jouko; Sarzynski, Mark A; Schwarz, Peter E H; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R; Sinisalo, Juha; Steinthorsdottir, Valgerdur; 
Stolk, Ronald P; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W; Assimes, Themistocles L; Bochud, Murielle; Boehm, Bernhard O; Boerwinkle, Eric; Bottinger, Erwin P; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C; Chanock, Stephen J; Cooper, Richard S; de Bakker, Paul I W; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W; Froguel, Philippe; Groop, Leif C; Haiman, Christopher A; Hamsten, Anders; Hayes, M Geoffrey; Hui, Jennie; Hunter, David J; Hveem, Kristian; Jukema, J Wouter; Kaplan, Robert C; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B; Njølstad, Inger; Oostra, Ben A; Palmer, Colin N A; Pedersen, Nancy L; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M; Rivadeneira, Fernando; Rotter, Jerome I; Saaristo, Timo E; Saleheen, Danish; Schlessinger, David; Slagboom, P Eline; Snieder, Harold; Spector, Tim D; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J; Watkins, Hugh; Wichmann, H-Erich; Wilson, James F; Zanen, Pieter; Deloukas, Panos; Heid, Iris M; Lindgren, Cecilia M; Mohlke, Karen L; Speliotes, Elizabeth K; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S; North, Kari E; Strachan, David P; Beckmann, Jacques S; Berndt, Sonja I; Boehnke, Michael; Borecki, Ingrid B; McCarthy, Mark I; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G; van Duijn, Cornelia M; Franke, Lude; Willer, Cristen J; Price, Alkes L; Lettre, Guillaume; Loos, Ruth J F; Weedon, Michael N; Ingelsson, Erik; O'Connell, Jeffrey R; Abecasis, Goncalo R; Chasman, Daniel I; Goddard, Michael E; Visscher, Peter M; Hirschhorn, Joel N; Frayling, Timothy M

    2014-11-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
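
    The abstract mixes two scales: shares of heritability (one-fifth, 60%) and shares of phenotypic variance (∼21%, ∼24%, ∼29%). A small sketch of how the two relate is given below. It assumes a heritability of 0.8 for adult height, a commonly quoted figure that is not stated in the abstract; the function name is illustrative only.

```python
# ASSUMPTION: heritability of adult height. 0.8 is a commonly quoted value
# and does not come from the abstract above.
ASSUMED_HERITABILITY = 0.8


def share_of_phenotypic_variance(share_of_heritability,
                                 heritability=ASSUMED_HERITABILITY):
    """Convert a share of heritability into a share of phenotypic variance."""
    return share_of_heritability * heritability


# Figures taken from the abstract.
gws_variants = share_of_phenotypic_variance(0.20)     # 697 variants: one-fifth
common_variants = share_of_phenotypic_variance(0.60)  # all common variants
variants_per_locus = 697 / 423                        # 697 variants in 423 loci

print(f"697 variants: about {gws_variants:.0%} of phenotypic variance")
print(f"all common variants: about {common_variants:.0%} of phenotypic variance")
print(f"average variants per locus: {variants_per_locus:.2f}")
```

    Under that assumption, the 697 genome-wide significant variants would account for roughly 16% of phenotypic variance, below the ∼21% reported for the ∼2,000 most strongly associated SNPs, which matches the ordering described in the abstract.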

  7. Defining the role of common variation in the genomic and biological architecture of adult human height

    PubMed Central

    Chu, Audrey Y; Estrada, Karol; Luan, Jian’an; Kutalik, Zoltán; Amin, Najaf; Buchkovich, Martin L; Croteau-Chonka, Damien C; Day, Felix R; Duan, Yanan; Fall, Tove; Fehrmann, Rudolf; Ferreira, Teresa; Jackson, Anne U; Karjalainen, Juha; Lo, Ken Sin; Locke, Adam E; Mägi, Reedik; Mihailov, Evelin; Porcu, Eleonora; Randall, Joshua C; Scherag, André; Vinkhuyzen, Anna AE; Westra, Harm-Jan; Winkler, Thomas W; Workalemahu, Tsegaselassie; Zhao, Jing Hua; Absher, Devin; Albrecht, Eva; Anderson, Denise; Baron, Jeffrey; Beekman, Marian; Demirkan, Ayse; Ehret, Georg B; Feenstra, Bjarke; Feitosa, Mary F; Fischer, Krista; Fraser, Ross M; Goel, Anuj; Gong, Jian; Justice, Anne E; Kanoni, Stavroula; Kleber, Marcus E; Kristiansson, Kati; Lim, Unhee; Lotay, Vaneet; Lui, Julian C; Mangino, Massimo; Leach, Irene Mateo; Medina-Gomez, Carolina; Nalls, Michael A; Nyholt, Dale R; Palmer, Cameron D; Pasko, Dorota; Pechlivanis, Sonali; Prokopenko, Inga; Ried, Janina S; Ripke, Stephan; Shungin, Dmitry; Stancáková, Alena; Strawbridge, Rona J; Sung, Yun Ju; Tanaka, Toshiko; Teumer, Alexander; Trompet, Stella; van der Laan, Sander W; van Setten, Jessica; Van Vliet-Ostaptchouk, Jana V; Wang, Zhaoming; Yengo, Loïc; Zhang, Weihua; Afzal, Uzma; Ärnlöv, Johan; Arscott, Gillian M; Bandinelli, Stefania; Barrett, Amy; Bellis, Claire; Bennett, Amanda J; Berne, Christian; Blüher, Matthias; Bolton, Jennifer L; Böttcher, Yvonne; Boyd, Heather A; Bruinenberg, Marcel; Buckley, Brendan M; Buyske, Steven; Caspersen, Ida H; Chines, Peter S; Clarke, Robert; Claudi-Boehm, Simone; Cooper, Matthew; Daw, E Warwick; De Jong, Pim A; Deelen, Joris; Delgado, Graciela; Denny, Josh C; Dhonukshe-Rutten, Rosalie; Dimitriou, Maria; Doney, Alex SF; Dörr, Marcus; Eklund, Niina; Eury, Elodie; Folkersen, Lasse; Garcia, Melissa E; Geller, Frank; Giedraitis, Vilmantas; Go, Alan S; Grallert, Harald; Grammer, Tanja B; Gräßler, Jürgen; Grönberg, Henrik; de Groot, Lisette C.P.G.M.; Groves, Christopher J; Haessler, Jeffrey; Hall, Per; 
Haller, Toomas; Hallmans, Goran; Hannemann, Anke; Hartman, Catharina A; Hassinen, Maija; Hayward, Caroline; Heard-Costa, Nancy L; Helmer, Quinta; Hemani, Gibran; Henders, Anjali K; Hillege, Hans L; Hlatky, Mark A; Hoffmann, Wolfgang; Hoffmann, Per; Holmen, Oddgeir; Houwing-Duistermaat, Jeanine J; Illig, Thomas; Isaacs, Aaron; James, Alan L; Jeff, Janina; Johansen, Berit; Johansson, Åsa; Jolley, Jennifer; Juliusdottir, Thorhildur; Junttila, Juhani; Kho, Abel N; Kinnunen, Leena; Klopp, Norman; Kocher, Thomas; Kratzer, Wolfgang; Lichtner, Peter; Lind, Lars; Lindström, Jaana; Lobbens, Stéphane; Lorentzon, Mattias; Lu, Yingchang; Lyssenko, Valeriya; Magnusson, Patrik KE; Mahajan, Anubha; Maillard, Marc; McArdle, Wendy L; McKenzie, Colin A; McLachlan, Stela; McLaren, Paul J; Menni, Cristina; Merger, Sigrun; Milani, Lili; Moayyeri, Alireza; Monda, Keri L; Morken, Mario A; Müller, Gabriele; Müller-Nurasyid, Martina; Musk, Arthur W; Narisu, Narisu; Nauck, Matthias; Nolte, Ilja M; Nöthen, Markus M; Oozageer, Laticia; Pilz, Stefan; Rayner, Nigel W; Renstrom, Frida; Robertson, Neil R; Rose, Lynda M; Roussel, Ronan; Sanna, Serena; Scharnagl, Hubert; Scholtens, Salome; Schumacher, Fredrick R; Schunkert, Heribert; Scott, Robert A; Sehmi, Joban; Seufferlein, Thomas; Shi, Jianxin; Silventoinen, Karri; Smit, Johannes H; Smith, Albert Vernon; Smolonska, Joanna; Stanton, Alice V; Stirrups, Kathleen; Stott, David J; Stringham, Heather M; Sundström, Johan; Swertz, Morris A; Syvänen, Ann-Christine; Tayo, Bamidele O; Thorleifsson, Gudmar; Tyrer, Jonathan P; van Dijk, Suzanne; van Schoor, Natasja M; van der Velde, Nathalie; van Heemst, Diana; van Oort, Floor VA; Vermeulen, Sita H; Verweij, Niek; Vonk, Judith M; Waite, Lindsay L; Waldenberger, Melanie; Wennauer, Roman; Wilkens, Lynne R; Willenborg, Christina; Wilsgaard, Tom; Wojczynski, Mary K; Wong, Andrew; Wright, Alan F; Zhang, Qunyuan; Arveiler, Dominique; Bakker, Stephan JL; Beilby, John; Bergman, Richard N; Bergmann, Sven; Biffar, 
Reiner; Blangero, John; Boomsma, Dorret I; Bornstein, Stefan R; Bovet, Pascal; Brambilla, Paolo; Brown, Morris J; Campbell, Harry; Caulfield, Mark J; Chakravarti, Aravinda; Collins, Rory; Collins, Francis S; Crawford, Dana C; Cupples, L Adrienne; Danesh, John; de Faire, Ulf; den Ruijter, Hester M; Erbel, Raimund; Erdmann, Jeanette; Eriksson, Johan G; Farrall, Martin; Ferrannini, Ele; Ferrières, Jean; Ford, Ian; Forouhi, Nita G; Forrester, Terrence; Gansevoort, Ron T; Gejman, Pablo V; Gieger, Christian; Golay, Alain; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Haas, David W; Hall, Alistair S; Harris, Tamara B; Hattersley, Andrew T; Heath, Andrew C; Hengstenberg, Christian; Hicks, Andrew A; Hindorff, Lucia A; Hingorani, Aroon D; Hofman, Albert; Hovingh, G Kees; Humphries, Steve E; Hunt, Steven C; Hypponen, Elina; Jacobs, Kevin B; Jarvelin, Marjo-Riitta; Jousilahti, Pekka; Jula, Antti M; Kaprio, Jaakko; Kastelein, John JP; Kayser, Manfred; Kee, Frank; Keinanen-Kiukaanniemi, Sirkka M; Kiemeney, Lambertus A; Kooner, Jaspal S; Kooperberg, Charles; Koskinen, Seppo; Kovacs, Peter; Kraja, Aldi T; Kumari, Meena; Kuusisto, Johanna; Lakka, Timo A; Langenberg, Claudia; Le Marchand, Loic; Lehtimäki, Terho; Lupoli, Sara; Madden, Pamela AF; Männistö, Satu; Manunta, Paolo; Marette, André; Matise, Tara C; McKnight, Barbara; Meitinger, Thomas; Moll, Frans L; Montgomery, Grant W; Morris, Andrew D; Morris, Andrew P; Murray, Jeffrey C; Nelis, Mari; Ohlsson, Claes; Oldehinkel, Albertine J; Ong, Ken K; Ouwehand, Willem H; Pasterkamp, Gerard; Peters, Annette; Pramstaller, Peter P; Price, Jackie F; Qi, Lu; Raitakari, Olli T; Rankinen, Tuomo; Rao, DC; Rice, Treva K; Ritchie, Marylyn; Rudan, Igor; Salomaa, Veikko; Samani, Nilesh J; Saramies, Jouko; Sarzynski, Mark A; Schwarz, Peter EH; Sebert, Sylvain; Sever, Peter; Shuldiner, Alan R; Sinisalo, Juha; Steinthorsdottir, Valgerdur; Stolk, Ronald P; Tardif, Jean-Claude; Tönjes, Anke; Tremblay, Angelo; Tremoli, Elena; Virtamo, Jarmo; 
Vohl, Marie-Claude; Amouyel, Philippe; Asselbergs, Folkert W; Assimes, Themistocles L; Bochud, Murielle; Boehm, Bernhard O; Boerwinkle, Eric; Bottinger, Erwin P; Bouchard, Claude; Cauchi, Stéphane; Chambers, John C; Chanock, Stephen J; Cooper, Richard S; de Bakker, Paul IW; Dedoussis, George; Ferrucci, Luigi; Franks, Paul W; Froguel, Philippe; Groop, Leif C; Haiman, Christopher A; Hamsten, Anders; Hayes, M Geoffrey; Hui, Jennie; Hunter, David J.; Hveem, Kristian; Jukema, J Wouter; Kaplan, Robert C; Kivimaki, Mika; Kuh, Diana; Laakso, Markku; Liu, Yongmei; Martin, Nicholas G; März, Winfried; Melbye, Mads; Moebus, Susanne; Munroe, Patricia B; Njølstad, Inger; Oostra, Ben A; Palmer, Colin NA; Pedersen, Nancy L; Perola, Markus; Pérusse, Louis; Peters, Ulrike; Powell, Joseph E; Power, Chris; Quertermous, Thomas; Rauramaa, Rainer; Reinmaa, Eva; Ridker, Paul M; Rivadeneira, Fernando; Rotter, Jerome I; Saaristo, Timo E; Saleheen, Danish; Schlessinger, David; Slagboom, P Eline; Snieder, Harold; Spector, Tim D; Strauch, Konstantin; Stumvoll, Michael; Tuomilehto, Jaakko; Uusitupa, Matti; van der Harst, Pim; Völzke, Henry; Walker, Mark; Wareham, Nicholas J; Watkins, Hugh; Wichmann, H-Erich; Wilson, James F; Zanen, Pieter; Deloukas, Panos; Heid, Iris M; Lindgren, Cecilia M; Mohlke, Karen L; Speliotes, Elizabeth K; Thorsteinsdottir, Unnur; Barroso, Inês; Fox, Caroline S; North, Kari E; Strachan, David P; Beckmann, Jacques S.; Berndt, Sonja I; Boehnke, Michael; Borecki, Ingrid B; McCarthy, Mark I; Metspalu, Andres; Stefansson, Kari; Uitterlinden, André G; van Duijn, Cornelia M; Franke, Lude; Willer, Cristen J; Price, Alkes L.; Lettre, Guillaume; Loos, Ruth JF; Weedon, Michael N; Ingelsson, Erik; O’Connell, Jeffrey R; Abecasis, Goncalo R; Chasman, Daniel I; Goddard, Michael E

    2014-01-01

    Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explain one-fifth of heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ~2,000, ~3,700 and ~9,500 SNPs explained ~21%, ~24% and ~29% of phenotypic variance. Furthermore, all common variants together captured the majority (60%) of heritability. The 697 variants clustered in 423 loci enriched for genes, pathways, and tissue-types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/beta-catenin, and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants. PMID:25282103

  8. Integrated analysis of chromosome copy number variation and gene expression in cervical carcinoma.

    PubMed

    Yan, Deng; Yi, Song; Chiu, Wang Chi; Qin, Liu Gui; Kin, Wong Hoi; Kwok Hung, Chung Tony; Linxiao, Han; Wai, Choy Kwong; Yi, Sui; Tao, Yang; Tao, Tang

    2017-12-12

    This study was conducted to explore chromosomal copy number variations (CNV) and transcript expression and to examine pathways in cervical pathogenesis using genome-wide high resolution microarrays. Genome-wide chromosomal CNVs were investigated in 6 cervical cancer cell lines using the Human Genome CGH Microarray Kit (4x44K). Gene expression profiles in cervical cancer cell lines, primary cervical carcinoma and normal cervical epithelium tissues were also studied using the Whole Human Genome Microarray Kit (4x44K). Fifty common chromosomal CNVs were identified in the cervical cancer cell lines. Correlation analysis revealed that gene up-regulation or down-regulation is significantly correlated with genomic amplification (P=0.009) or deletion (P=0.006) events. Expression profiles were identified through cluster analysis. Gene annotation analysis pinpointed that cell cycle pathways were significantly (P=1.15E-08) affected in cervical cancer. Common CNVs were associated with cervical cancer. Chromosomal CNVs may contribute to their transcript expression in cervical cancer.

  9. Comparative genomics reveals phylogenetic distribution patterns of secondary metabolites in Amycolatopsis species.

    PubMed

    Adamek, Martina; Alanjary, Mohammad; Sales-Ortells, Helena; Goodfellow, Michael; Bull, Alan T; Winkler, Anika; Wibberg, Daniel; Kalinowski, Jörn; Ziemert, Nadine

    2018-06-01

    Genome mining tools have enabled us to predict biosynthetic gene clusters that might encode compounds with valuable functions for industrial and medical applications. With the continuously increasing number of genomes sequenced, we are confronted with an overwhelming number of predicted clusters. In order to guide the effective prioritization of biosynthetic gene clusters towards finding the most promising compounds, knowledge about diversity, phylogenetic relationships and distribution patterns of biosynthetic gene clusters is necessary. Here, we provide a comprehensive analysis of the model actinobacterial genus Amycolatopsis and its potential for the production of secondary metabolites. A phylogenetic characterization, together with a pan-genome analysis, showed that within this highly diverse genus, four major lineages could be distinguished which differed in their potential to produce secondary metabolites. Furthermore, we were able to distinguish gene cluster families whose distribution correlated with phylogeny, indicating that vertical gene transfer plays a major role in the evolution of secondary metabolite gene clusters. Still, the vast majority of the diverse biosynthetic gene clusters were derived from clusters unique to the genus, and also unique in comparison to a database of known compounds. Our study on the locations of biosynthetic gene clusters in the genomes of Amycolatopsis strains showed that clusters acquired by horizontal gene transfer tend to be incorporated into non-conserved regions of the genome, thereby allowing us to distinguish core and hypervariable regions in Amycolatopsis genomes. Using a comparative genomics approach, it was possible to determine the potential of the genus Amycolatopsis to produce a huge diversity of secondary metabolites. Furthermore, the analysis demonstrates that horizontal and vertical gene transfer play an important role in the acquisition and maintenance of valuable secondary metabolites. Our results cast light on the interconnections between secondary metabolite gene clusters and provide a way to prioritize biosynthetic pathways in the search and discovery of novel compounds.

  10. Genome-wide DNA methylation analysis reveals estrogen-mediated epigenetic repression of metallothionein-1 gene cluster in breast cancer.

    PubMed

    Jadhav, Rohit R; Ye, Zhenqing; Huang, Rui-Lan; Liu, Joseph; Hsu, Pei-Yin; Huang, Yi-Wen; Rangel, Leticia B; Lai, Hung-Cheng; Roa, Juan Carlos; Kirma, Nameer B; Huang, Tim Hui-Ming; Jin, Victor X

    2015-01-01

    Recent genome-wide analysis has shown that DNA methylation spans long stretches of chromosome regions consisting of clusters of contiguous CpG islands or gene families. Hypermethylation of various gene clusters has been reported in many types of cancer. In this study, we conducted methyl-binding domain capture (MBDCap) sequencing (MBD-seq) analysis on a breast cancer cohort consisting of 77 patients and 10 normal controls, as well as a panel of 38 breast cancer cell lines. Bioinformatics analysis determined seven gene clusters with a significant difference in overall survival (OS) and further revealed a distinct feature that the conservation of a large gene cluster (approximately 70 kb), metallothionein-1 (MT1), among 45 species is much lower than the average of all RefSeq genes. Furthermore, we found that DNA methylation is an important epigenetic regulator contributing to gene repression of the MT1 gene cluster in both ERα positive (ERα+) and ERα negative (ERα-) breast tumors. In silico analysis revealed much lower gene expression of this cluster in The Cancer Genome Atlas (TCGA) cohort for ERα+ tumors. To further investigate the role of estrogen, we conducted 17β-estradiol (E2) and demethylating agent 5-aza-2'-deoxycytidine (DAC) treatment in various breast cancer cell types. Cell proliferation and invasion assays suggested MT1F and MT1M may play an anti-oncogenic role in breast cancer. Our data suggest that DNA methylation in large contiguous gene clusters can be a potential prognostic marker of breast cancer. Further investigation of these clusters revealed that estrogen mediates epigenetic repression of the MT1 cluster in ERα+ breast cancer cell lines. In all, our studies identify thousands of breast tumor hypermethylated regions for the first time, in particular, discovering seven large contiguous hypermethylated gene clusters.

  11. Biogeography of Burkholderia pseudomallei in the Torres Strait Islands of Northern Australia

    PubMed Central

    Baker, Anthony; Mayo, Mark; Owens, Leigh; Burgess, Graham; Norton, Robert; McBride, William John Hannan; Currie, Bart J.

    2013-01-01

    It has been hypothesized that biogeographical boundaries are a feature of Burkholderia pseudomallei ecology, and they impact the epidemiology of melioidosis on a global scale. This study examined the relatedness of B. pseudomallei sourced from islands in the Torres Strait of Northern Australia to determine if the geography of isolated island communities is a determinant of the organisms' dispersal. Environmental sampling on Badu Island in the Near Western Island cluster recovered a single clone. An additional 32 clinical isolates from the region were sourced. Isolates were characterized using multilocus sequence typing and a multiplex PCR targeting the flagellum gene cluster. Gene cluster analysis determined that 69% of the isolates from the region encoded the ancestral Burkholderia thailandensis-like flagellum and chemotaxis gene cluster, a proportion significantly lower than that reported from mainland Australia and consistent with observations of isolates from southern Papua New Guinea. A goodness-of-fit test indicated that there was geographic localization of sequence types throughout the archipelago, with the exception of Thursday Island, the economic and cultural hub of the region. Sequence types common to mainland Australia and Papua New Guinea were identified. These findings demonstrate for the first time an environmental reservoir for B. pseudomallei in the Torres Strait, and multilocus sequence typing suggests that the organism is not randomly distributed throughout this region and that seawater may provide a barrier to dispersal of the organism. Moreover, these findings support an anthropogenic dispersal hypothesis for the spread of B. pseudomallei throughout this region. PMID:23698533
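
    A goodness-of-fit test of the kind mentioned in this abstract can be sketched as follows. The abstract does not say which test was used, so a Pearson chi-square test against an even spread across islands is assumed; the island counts are hypothetical and are not the study's data.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL counts of isolates of one sequence type on four islands.
observed = [18, 2, 2, 2]

# Null hypothesis: no geographic localization, i.e. an even spread.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

stat = chi_square_statistic(observed, expected)

# Chi-square critical value for 3 degrees of freedom at alpha = 0.05.
CRITICAL_DF3_ALPHA_05 = 7.815
localized = stat > CRITICAL_DF3_ALPHA_05

print(f"chi-square = {stat:.1f}, reject even spread: {localized}")
```

    With these made-up counts the statistic far exceeds the critical value, which is the pattern one would expect if a sequence type were concentrated on a single island.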

  12. A systems biology pipeline identifies new immune and disease related molecular signatures and networks in human cells during microgravity exposure

    NASA Astrophysics Data System (ADS)

    Mukhopadhyay, Sayak; Saha, Rohini; Palanisamy, Anbarasi; Ghosh, Madhurima; Biswas, Anupriya; Roy, Saheli; Pal, Arijit; Sarkar, Kathakali; Bagh, Sangram

    2016-05-01

    Microgravity is a prominent health hazard for astronauts, yet we understand little about its effect at the molecular systems level. In this study, we have integrated a set of systems-biology tools and databases and have analysed more than 8000 molecular pathways on published global gene expression datasets of human cells in microgravity. Hundreds of new pathways have been identified with statistical confidence for each dataset and, despite the difference in cell types and experiments, around 100 of the new pathways appear to be common across the datasets. They are related to reduced inflammation, autoimmunity, diabetes and asthma. We have identified downregulation of the NfκB pathway via Notch1 signalling as a new pathway for reduced immunity in microgravity. Induction of a few cancer types, including liver cancer and leukaemia, and increased drug response to cancer in microgravity are also found. An increase in olfactory signal transduction is also identified. Genes are clustered based on their expression pattern, and mathematically stable clusters are identified. The network mapping of genes within a cluster indicates the plausible functional connections in microgravity. This pipeline gives a new systems-level picture of human cells under microgravity, generates testable hypotheses and may help in estimating risk and developing medicine for space missions.

  13. A New Two-Step Approach for Hands-On Teaching of Gene Technology: Effects on Students' Activities During Experimentation in an Outreach Gene Technology Lab

    NASA Astrophysics Data System (ADS)

    Scharfenberg, Franz-Josef; Bogner, Franz X.

    2011-08-01

    Emphasis on improving higher level biology education continues. A new two-step approach to the experimental phases within an outreach gene technology lab, derived from cognitive load theory, is presented. We compared our approach using a quasi-experimental design with the conventional one-step mode. The difference consisted of additional focused discussions combined with students writing down their ideas (step one) prior to starting any experimental procedure (step two). We monitored students' activities during the experimental phases by continuously videotaping 20 work groups within each approach (N = 131). Subsequent classification of students' activities yielded 10 categories (with well-fitting intra- and inter-observer scores with respect to reliability). Based on the students' individual time budgets, we evaluated students' roles during experimentation from their prevalent activities (by independently using two cluster analysis methods). Independently of the approach, two common clusters emerged, which we labeled as 'all-rounders' and as 'passive students', and two clusters specific to each approach: 'observers' as well as 'high-experimenters' were identified only within the one-step approach whereas under the two-step conditions 'managers' and 'scribes' were identified. Potential changes in group-leadership style during experimentation are discussed, and conclusions for optimizing science teaching are drawn.

  14. A systems biology pipeline identifies new immune and disease related molecular signatures and networks in human cells during microgravity exposure.

    PubMed

    Mukhopadhyay, Sayak; Saha, Rohini; Palanisamy, Anbarasi; Ghosh, Madhurima; Biswas, Anupriya; Roy, Saheli; Pal, Arijit; Sarkar, Kathakali; Bagh, Sangram

    2016-05-17

    Microgravity is a prominent health hazard for astronauts, yet we understand little about its effect at the molecular systems level. In this study, we have integrated a set of systems-biology tools and databases and have analysed more than 8000 molecular pathways on published global gene expression datasets of human cells in microgravity. Hundreds of new pathways have been identified with statistical confidence for each dataset and, despite the difference in cell types and experiments, around 100 of the new pathways appear to be common across the datasets. They are related to reduced inflammation, autoimmunity, diabetes and asthma. We have identified downregulation of the NfκB pathway via Notch1 signalling as a new pathway for reduced immunity in microgravity. Induction of a few cancer types, including liver cancer and leukaemia, and increased drug response to cancer in microgravity are also found. An increase in olfactory signal transduction is also identified. Genes are clustered based on their expression pattern, and mathematically stable clusters are identified. The network mapping of genes within a cluster indicates the plausible functional connections in microgravity. This pipeline gives a new systems-level picture of human cells under microgravity, generates testable hypotheses and may help in estimating risk and developing medicine for space missions.

  15. Horizontal gene transfer of acetyltransferases, invertases and chorismate mutases from different bacteria to diverse recipients.

    PubMed

    Noon, Jason B; Baum, Thomas J

    2016-04-12

    Hoplolaimina plant-parasitic nematodes (PPN) are a lineage of animals with many documented cases of horizontal gene transfer (HGT). In a recent study, we reported on three likely HGT candidate genes in the soybean cyst nematode Heterodera glycines, all of which encode secreted candidate effectors with putative functions in the host plant. Hg-GLAND1 is a putative GCN5-related N-acetyltransferase (GNAT), Hg-GLAND13 is a putative invertase (INV), and Hg-GLAND16 is a putative chorismate mutase (CM), and blastp searches of the non-redundant database resulted in highest similarity to bacterial sequences. Here, we searched nematode and non-nematode sequence databases to identify all the nematodes possible that contain these three genes, and to formulate hypotheses about when they most likely appeared in the phylum Nematoda. We then performed phylogenetic analyses combined with model selection tests of alternative models of sequence evolution to determine whether these genes were horizontally acquired from bacteria. Mining of nematode sequence databases determined that GNATs appeared in Hoplolaimina PPN late in evolution, while both INVs and CMs appeared before the radiation of the Hoplolaimina suborder. Also, Hoplolaimina GNATs, INVs and CMs formed well-supported clusters with different rhizosphere bacteria in the phylogenetic trees, and the model selection tests greatly supported models of HGT over descent via common ancestry. Surprisingly, the phylogenetic trees also revealed additional, well-supported clusters of bacterial GNATs, INVs and CMs with diverse eukaryotes and archaea. There were at least eleven and eight well-supported clusters of GNATs and INVs, respectively, from different bacteria with diverse eukaryotes and archaea. Though less frequent, CMs from different bacteria formed supported clusters with multiple different eukaryotes. 
    Moreover, almost all individual clusters containing bacteria and eukaryotes or archaea contained species that inhabit very similar niches. GNATs were horizontally acquired late in Hoplolaimina PPN evolution from bacteria most similar to the saprophytic and plant-pathogenic actinomycetes. INVs and CMs were horizontally acquired from bacteria most similar to rhizobacteria and Burkholderia soil bacteria, respectively, before the radiation of Hoplolaimina. Also, these three gene groups appear to have been frequent subjects of HGT from different bacteria to numerous, diverse lineages of eukaryotes and archaea, which suggests that these genes may confer important evolutionary advantages to many taxa. In the case of Hoplolaimina PPN, this advantage likely was an improved ability to parasitize plants.

  16. WordCluster: detecting clusters of DNA words and genomic elements

    PubMed Central

    2011-01-01

    Background Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method in a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php and includes additional features such as the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981
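
    The distance-based idea in the abstract above can be illustrated with a minimal sketch. It is not the WordCluster implementation: it only chains consecutive copies of a word whose start-to-start distance stays within a fixed bound, and it omits the statistical significance step entirely. The function names and the parameters max_gap and min_copies are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def word_positions(seq: str, word: str) -> List[int]:
    """Return 0-based start positions of every copy of word (overlaps allowed)."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def chain_clusters(
    positions: List[int], max_gap: int, min_copies: int = 3
) -> List[Tuple[int, int, int]]:
    """Chain consecutive copies whose start-to-start distance is <= max_gap.

    max_gap and min_copies are illustrative assumptions, not values taken
    from the WordCluster paper. Returns (first_start, last_start, n_copies)
    for every chain that holds at least min_copies copies.
    """
    clusters: List[Tuple[int, int, int]] = []
    if not positions:
        return clusters
    chain = [positions[0]]
    for pos in positions[1:]:
        if pos - chain[-1] <= max_gap:
            chain.append(pos)  # still inside the current chain
        else:
            if len(chain) >= min_copies:
                clusters.append((chain[0], chain[-1], len(chain)))
            chain = [pos]  # start a new chain at this copy
    if len(chain) >= min_copies:
        clusters.append((chain[0], chain[-1], len(chain)))
    return clusters
```

    A call such as chain_clusters(word_positions(sequence, "CAG"), max_gap=5) returns one tuple per chain; a real analysis would then need a significance estimate for each chain, which this sketch deliberately leaves out.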

  17. Insight into Energy Conservation via Alternative Carbon Monoxide Metabolism in Carboxydothermus pertinax Revealed by Comparative Genome Analysis.

    PubMed

    Fukuyama, Yuto; Omae, Kimiho; Yoneda, Yasuko; Yoshida, Takashi; Sako, Yoshihiko

    2018-05-04

    Carboxydothermus species are some of the most studied thermophilic carboxydotrophs. Their varied carboxydotrophic growth properties suggest distinct strategies for energy conservation via CO metabolism. In this study, we used comparative genome analysis of the genus Carboxydothermus to show variations in the CO dehydrogenase/energy-converting hydrogenase gene cluster, which is responsible for CO metabolism with H2 production (hydrogenogenic CO metabolism). Indeed, ability or inability to produce H2 with CO oxidation is explained by the presence or absence of this gene cluster in C. hydrogenoformans, C. islandicus, and C. ferrireducens. Interestingly, despite its hydrogenogenic CO metabolism, C. pertinax lacks the Ni-CO dehydrogenase catalytic subunit (CooS-I) and its transcriptional regulator encoding genes in this gene cluster, probably due to inversion. Transcriptional analysis in C. pertinax showed that the Ni-CO dehydrogenase gene (cooS-II) and distantly encoded energy-converting hydrogenase related genes were remarkably upregulated under 100% CO. In addition, when thiosulfate was available as a terminal electron acceptor under 100% CO, C. pertinax maximum cell density and maximum specific growth rate were 3.1-fold and 1.5-fold higher, respectively, than when thiosulfate was absent. The amount of H2 produced was only 63% of the consumed CO, less than expected according to hydrogenogenic CO oxidation: CO + H2O → CO2 + H2. Accordingly, C. pertinax would couple CO oxidation by Ni-CO dehydrogenase-II with simultaneous reduction of not only H2O but also thiosulfate when grown under 100% CO. IMPORTANCE Anaerobic hydrogenogenic carboxydotrophs are thought to fill a vital niche by scavenging potentially toxic CO and producing H2 as an available energy source for thermophilic microbes. This hydrogenogenic carboxydotrophy relies on a Ni-CO dehydrogenase/energy-converting hydrogenase gene cluster. This feature is thought to be common to these organisms. However, the hydrogenogenic carboxydotroph Carboxydothermus pertinax lacks the gene for the Ni-CO dehydrogenase catalytic subunit encoded in the gene cluster. Here, we performed a comparative genome analysis of the genus Carboxydothermus, transcriptional analysis, and cultivation study under 100% CO to prove their hydrogenogenic CO metabolism. Results revealed that C. pertinax could couple Ni-CO dehydrogenase-II alternatively to the distal energy-converting hydrogenase. Furthermore, C. pertinax represents an example of the functioning of Ni-CO dehydrogenase which does not always correspond with its genomic context owing to the versatility of CO metabolism and the low redox potential of CO. Copyright © 2018 American Society for Microbiology.

  18. Large clusters of co-expressed genes in the Drosophila genome.

    PubMed

    Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I

    2002-12-12

    Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.
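
    The comparison of observed clusters against chance in the abstract above can be illustrated with a label-permutation sketch. This is an assumption about one way to run such a test, not the authors' procedure: genes are represented as an ordered list of flags along a chromosome, a cluster is taken to be a run of strictly adjacent flagged genes, and the names count_runs and permutation_p_value are hypothetical.

```python
import random
from typing import Sequence


def count_runs(flags: Sequence[bool], min_len: int = 3) -> int:
    """Count maximal runs of at least min_len adjacent flagged genes."""
    runs = 0
    current = 0
    for flagged in flags:
        if flagged:
            current += 1
        else:
            if current >= min_len:
                runs += 1
            current = 0
    if current >= min_len:  # a run may extend to the chromosome end
        runs += 1
    return runs


def permutation_p_value(
    flags: Sequence[bool], min_len: int = 3, n_perm: int = 1000, seed: int = 0
) -> float:
    """Estimate how often shuffled gene labels give at least as many runs.

    Shuffling keeps the number of flagged genes fixed and randomizes only
    their positions. The +1 terms prevent a reported p-value of zero.
    """
    rng = random.Random(seed)
    observed = count_runs(flags, min_len)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_runs(shuffled, min_len) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

    Here flags[i] is True when the i-th gene in chromosomal order belongs to the co-expressed set (for example, testes-specific genes); a small p-value indicates more runs of three or more genes than random placement would produce.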

  19. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cameron, R A; Rowen, L; Nesbitt, R

    2005-10-11

    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  20. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew

    2005-05-10

    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  1. Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

    DOE PAGES

    Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; ...

    2017-04-01

    Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we will need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.

  2. Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

    PubMed Central

    Zhang, Peifen; Kim, Taehyong; Banf, Michael; Chavali, Arvind K.; Nilo-Poyanco, Ricardo; Bernard, Thomas

    2017-01-01

    Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. PMID:28228535

  3. Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan

    Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we will need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.

  4. Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants.

    PubMed

    Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; Kim, Taehyong; Banf, Michael; Chae, Lee; Dreher, Kate; Chavali, Arvind K; Nilo-Poyanco, Ricardo; Bernard, Thomas; Kahn, Daniel; Rhee, Seung Y

    2017-04-01

    Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. © 2017 American Society of Plant Biologists. All Rights Reserved.

  5. Function Clustering Self-Organization Maps (FCSOMs) for mining differentially expressed genes in Drosophila and its correlation with the growth medium.

    PubMed

    Liu, L L; Liu, M J; Ma, M

    2015-09-28

    The central task of this study was to mine the gene-to-medium relationship. Adequate knowledge of this relationship could potentially improve the accuracy of differentially expressed gene mining. One of the approaches to differentially expressed gene mining uses conventional clustering algorithms to identify the gene-to-medium relationship. Compared to conventional clustering algorithms, self-organization maps (SOMs) identify the nonlinear aspects of the gene-to-medium relationships by mapping the input space into another higher dimensional feature space. However, SOMs are not suitable for huge datasets consisting of millions of samples. Therefore, a new computational model, the Function Clustering Self-Organization Maps (FCSOMs), was developed. FCSOMs take advantage of the theory of granular computing as well as advanced statistical learning methodologies, and are built specifically for each information granule (a function cluster of genes), which are intelligently partitioned by the clustering algorithm provided by the DAVID_6.7 software platform. However, only the gene functions, and not their expression values, are considered in the fuzzy clustering algorithm of DAVID. Compared to the clustering algorithm of DAVID, these experimental results show a marked improvement in the accuracy of classification with the application of FCSOMs. FCSOMs can handle huge datasets and their complex classification problems, as each FCSOM (modeled for each function cluster) can be easily parallelized.

  6. Functional genomics of commercial baker's yeasts that have different abilities for sugar utilization and high-sucrose tolerance under different sugar conditions.

    PubMed

    Tanaka-Tsuno, Fumiko; Mizukami-Murata, Satomi; Murata, Yoshinori; Nakamura, Toshihide; Ando, Akira; Takagi, Hiroshi; Shima, Jun

    2007-10-01

    In the modern baking industry, high-sucrose-tolerant (HS) and maltose-utilizing (LS) yeast were developed using breeding techniques and are now used commercially. Sugar utilization and high-sucrose tolerance differ significantly between HS and LS yeasts. We analysed the gene expression profiles of HS and LS yeasts under different sucrose conditions in order to determine their basic physiology. Two-way hierarchical clustering was performed to obtain the overall patterns of gene expression. The clustering clearly showed that the gene expression patterns of LS yeast differed from those of HS yeast. Quality threshold clustering was used to identify the gene clusters containing upregulated genes (cluster 1) and downregulated genes (cluster 2) under high-sucrose conditions. Clusters 1 and 2 contained numerous genes involved in carbon and nitrogen metabolism, respectively. The expression level of the genes involved in the metabolism of glycerol and trehalose, which are known to be osmoprotectants, in LS yeast was higher than that in HS yeast under sucrose concentrations of 5-40%. No clear correlation was found between the expression level of the genes involved in the biosynthesis of the osmoprotectants and the intracellular contents of the osmoprotectants. The present gene expression data were compared with data previously reported in a comprehensive analysis of a gene deletion strain collection. Welch's t-test for this comparison showed that the relative growth rates of the deletion strains whose deletion occurred in genes belonging to cluster 1 were significantly higher than the average growth rates of all deletion strains. Copyright 2007 John Wiley & Sons, Ltd.
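
    The Welch's t-test mentioned in the abstract above compares two group means without assuming equal variances. The following is a minimal sketch of the statistic and its Welch-Satterthwaite degrees of freedom using only the standard library; the function name welch_t and any sample values passed to it are hypothetical, not data from the study.

```python
import math
import statistics
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Each group needs at least two values so that a sample variance exists.
    """
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Per-group squared standard error of the mean (sample variance / n).
    se2_a = statistics.variance(a) / n_a
    se2_b = statistics.variance(b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df
```

    In the setting of the abstract, a would hold the relative growth rates of deletion strains for cluster 1 genes and b those of the comparison group; converting t and df into a p-value requires a Student's t distribution, for example from a statistics package, which is outside this sketch.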

  7. Identification of the First Riboflavin Catabolic Gene Cluster Isolated from Microbacterium maritypicum G10

    PubMed Central

    Xu, Hui; Chakrabarty, Yindrila; Philmus, Benjamin; Mehta, Angad P.; Bhandari, Dhananjay; Hohmann, Hans-Peter; Begley, Tadhg P.

    2016-01-01

    Riboflavin is a common cofactor, and its biosynthetic pathway is well characterized. However, its catabolic pathway, despite intriguing hints in a few distinct organisms, has never been established. This article describes the isolation of a Microbacterium maritypicum riboflavin catabolic strain, and the cloning of the riboflavin catabolic genes. RcaA, RcaB, RcaD, and RcaE were overexpressed and biochemically characterized as riboflavin kinase, riboflavin reductase, ribokinase, and riboflavin hydrolase, respectively. Based on these activities, a pathway for riboflavin catabolism is proposed. PMID:27590337

  8. Differential Retention of Gene Functions in a Secondary Metabolite Cluster.

    PubMed

    Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W

    2017-08-01

    In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.

  9. A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria.

    PubMed

    Gaby, John Christian; Buckley, Daniel H

    2014-01-01

    We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.
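
    The sequence dissimilarity figures quoted in the abstract above (for example <3% in 16S rRNA versus up to 23% in nifH) are pairwise measures on aligned sequences. A minimal sketch of one such measure, the uncorrected proportion of differing sites, is given here; the gap handling (columns with '-' in either sequence are skipped) and the function name are assumptions for illustration and may differ from how the database authors computed their values.

```python
def pairwise_dissimilarity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Fraction of compared alignment columns at which two sequences differ.

    Both inputs must come from the same alignment and so have equal length.
    Columns where either sequence carries a gap are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # skip gapped columns
        compared += 1
        if base_a != base_b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared
```

    Multiplying the result by 100 gives a percentage that can be compared across genes, so a pair of strains can be scored once on their aligned 16S rRNA sequences and once on their aligned nifH sequences.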

  10. A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria

    PubMed Central

    Gaby, John Christian; Buckley, Daniel H.

    2014-01-01

    We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396

  11. Genomic V exons from whole genome shotgun data in reptiles.

    PubMed

    Olivieri, D N; von Haeften, B; Sánchez-Espinel, C; Faro, J; Gambón-Deza, F

    2014-08-01

    Reptiles and mammals diverged over 300 million years ago, creating two parallel evolutionary lineages amongst terrestrial vertebrates. In reptiles, two main evolutionary lines emerged: one gave rise to Squamata, while the other gave rise to Testudines, Crocodylia, and Aves. In this study, we determined the genomic variable (V) exons from whole genome shotgun sequencing (WGS) data in reptiles corresponding to the three main immunoglobulin (IG) loci and the four main T cell receptor (TR) loci. We show that Squamata lack the TRG and TRD genes, and snakes lack the IGKV genes. In representative species of Testudines and Crocodylia, the seven major IG and TR loci are maintained. As in mammals, genes of the IG loci can be grouped into well-defined IMGT clans through a multi-species phylogenetic analysis. We show that the reptilian IGHV and IGLV genes are distributed amongst the established mammalian clans, while their IGKV genes are found within a single clan, nearly exclusive from the mammalian sequences. The reptilian and mammalian TRAV genes cluster into six common evolutionary clades (since IMGT clans have not been defined for TR). In contrast, the reptilian TRBV genes cluster into three clades, which have few mammalian members. In this locus, the V exon sequences from mammals appear to have undergone different evolutionary diversification processes that occurred outside these shared reptilian clans. These sequences can be obtained in a freely available public repository (http://vgenerepertoire.org).

  12. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Data Analysis and Visualization; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  13. Comparative genomic analysis of Acinetobacter strains isolated from murine colonic crypts.

    PubMed

    Saffarian, Azadeh; Touchon, Marie; Mulet, Céline; Tournebize, Régis; Passet, Virginie; Brisse, Sylvain; Rocha, Eduardo P C; Sansonetti, Philippe J; Pédron, Thierry

    2017-07-11

    A restricted set of aerobic bacteria dominated by the Acinetobacter genus was identified in murine intestinal colonic crypts. The vicinity of such bacteria with intestinal stem cells could indicate that they protect the crypt against cytotoxic and genotoxic signals. Genome analyses of these bacteria were performed to better appreciate their biodegradative capacities. Two taxonomically different clusters of Acinetobacter were isolated from murine proximal colonic crypts: one was identified as A. modestus and the other as A. radioresistens. Their identification was performed through biochemical parameters and housekeeping gene sequencing. After selection of one strain of each cluster (A. modestus CM11G and A. radioresistens CM38.2), comparative genomic analysis was performed on whole-genome sequencing data. The antibiotic resistance pattern of these two strains is different, in line with the many genes involved in resistance to heavy metals identified in both genomes. Moreover, whereas the operon benABCDE involved in benzoate metabolism is encoded by the two genomes, the operon antABC encoding the anthranilate dioxygenase and the phenol hydroxylase gene cluster are absent in the A. modestus genomic sequence, indicating that the two strains have different capacities to metabolize xenobiotics. A common feature of the two strains is the presence of a type IV pili system, and the presence of genes encoding proteins pertaining to secretion systems such as Type I and Type II secretion systems. Our comparative genomic analysis revealed that different Acinetobacter isolated from the same biological niche, even if they share a large majority of genes, possess unique features that could play a specific role in the protection of the intestinal crypt.

  14. Host Cell Contact-Induced Transcription of the Type IV Fimbria Gene Cluster of Actinobacillus pleuropneumoniae

    PubMed Central

    Boekema, Bouke K. H. L.; Van Putten, Jos P. M.; Stockhofe-Zurwieden, Norbert; Smith, Hilde E.

    2004-01-01

    Type IV pili (Tfp) of gram-negative species share many characteristics, including a common architecture and conserved biogenesis pathway. Much less is known about the regulation of Tfp expression in response to changing environmental conditions. We investigated the diversity of Tfp regulatory systems by searching for the molecular basis of the reported variable expression of the Tfp gene cluster of the pathogen Actinobacillus pleuropneumoniae. Despite the presence of an intact Tfp gene cluster consisting of four genes, apfABCD, no Tfp were formed under standard growth conditions. Sequence analysis of the predicted major subunit protein ApfA showed an atypical alanine residue at position −1 from the prepilin peptidase cleavage site in 42 strains. This alanine deviates from the consensus glycine at this position in Tfp from other species. Yet, cloning of the apfABCD genes under a constitutive promoter in A. pleuropneumoniae resulted in pilin and Tfp assembly. Tfp promoter-luxAB reporter gene fusions demonstrated that the Tfp promoter was intact but tightly regulated. Promoter activity varied with bacterial growth phase and was detected only when bacteria were grown in chemically defined medium. Infection experiments with cultured epithelial cells demonstrated that Tfp promoter activity was upregulated upon adherence of the pathogen to primary cultures of lung epithelial cells. Nonadherent bacteria in the culture supernatant exhibited virtually no promoter activity. A similar upregulation of Tfp promoter activity was observed in vivo during experimental infection of pigs. The host cell contact-induced and in vivo-upregulated Tfp promoter activity in A. pleuropneumoniae adds a new dimension to the diversity of Tfp regulation. PMID:14742510

  15. Comparison of the Predictive Accuracy of DNA Array-Based Multigene Classifiers across cDNA Arrays and Affymetrix GeneChips

    PubMed Central

    Stec, James; Wang, Jing; Coombes, Kevin; Ayers, Mark; Hoersch, Sebastian; Gold, David L.; Ross, Jeffrey S; Hess, Kenneth R.; Tirrell, Stephen; Linette, Gerald; Hortobagyi, Gabriel N.; Symmans, W. Fraser; Pusztai, Lajos

    2005-01-01

    We examined how well differentially expressed genes and multigene outcome classifiers retain their class-discriminating values when tested on data generated by different transcriptional profiling platforms. RNA from 33 stage I-III breast cancers was hybridized to both Affymetrix GeneChip and Millennium Pharmaceuticals cDNA arrays. Only 30% of all corresponding gene expression measurements on the two platforms had Pearson correlation coefficient r ≥ 0.7 when UniGene was used to match probes. There was substantial variation in correlation between different Affymetrix probe sets matched to the same cDNA probe. When cDNA and Affymetrix probes were matched by basic local alignment tool (BLAST) sequence identity, the correlation increased substantially. We identified 182 genes in the Affymetrix and 45 in the cDNA data (including 17 common genes) that accurately separated 91% of cases in supervised hierarchical clustering in each data set. Cross-platform testing of these informative genes resulted in lower clustering accuracy of 45 and 79%, respectively. Several sets of accurate five-gene classifiers were developed on each platform using linear discriminant analysis. The best 100 classifiers showed average misclassification error rate of 2% on the original data that rose to 19.5% when tested on data from the other platform. Random five-gene classifiers showed misclassification error rate of 33%. We conclude that multigene predictors optimized for one platform lose accuracy when applied to data from another platform due to missing genes and sequence differences in probes that result in differing measurements for the same gene. PMID:16049308
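
The cross-platform concordance figure in this abstract (the share of matched measurements with Pearson r ≥ 0.7) can be sketched as follows. The dictionary layout (gene identifier mapped to per-sample values), the function names, and the toy numbers are assumptions for illustration, not the study's data or pipeline.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)


def fraction_concordant(platform_a, platform_b, threshold=0.7):
    """Fraction of genes measured on both platforms whose r meets the threshold."""
    shared = sorted(set(platform_a) & set(platform_b))
    if not shared:
        raise ValueError("no genes in common between the two platforms")
    hits = sum(
        1 for gene in shared
        if pearson_r(platform_a[gene], platform_b[gene]) >= threshold
    )
    return hits / len(shared)


# Hypothetical toy data: each gene maps to expression values across the same samples.
array_a = {"G1": [1, 2, 3, 4], "G2": [1, 2, 3, 4], "G3": [5, 5, 6, 7]}
array_b = {"G1": [2, 4, 6, 8], "G2": [4, 3, 2, 1]}
print(fraction_concordant(array_a, array_b))
```

How probes are matched between platforms (UniGene identifiers versus BLAST sequence identity in the study) determines which keys the two dictionaries share, which is where the abstract reports the largest effect on correlation.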

  16. Analysis of Copy Number Variation in the Abp Gene Regions of Two House Mouse Subspecies Suggests Divergence during the Gene Family Expansions

    PubMed Central

    Pezer, Željka; Chung, Amanda G.; Karn, Robert C.

    2017-01-01

    The Androgen-binding protein (Abp) gene region of the mouse genome contains 64 genes, some encoding pheromones that influence assortative mating between mice from different subspecies. Using CNVnator and quantitative PCR, we explored copy number variation in this gene family in natural populations of Mus musculus domesticus (Mmd) and Mus musculus musculus (Mmm), two subspecies of house mice that form a narrow hybrid zone in Central Europe. We found that copy number variation in the center of the Abp gene region is very common in wild Mmd, primarily representing the presence/absence of the final duplications described for the mouse genome. Clustering of Mmd individuals based on this variation did not reflect their geographical origin, suggesting no population divergence in the Abp gene cluster. However, copy number variation patterns differ substantially between Mmd and other mouse taxa. Large blocks of Abp genes are absent in Mmm, Mus musculus castaneus and an outgroup, Mus spretus, although with differences in variation and breakpoint locations. Our analysis calls into question the reliance on a reference genome for interpreting the detailed organization of genes in taxa more distant from the Mmd reference genome. The polymorphic nature of the gene family expansion in all four taxa suggests that the number of Abp genes, especially in the central gene region, is not critical to the survival and reproduction of the mouse. However, Abp haplotypes of variable length may serve as a source of raw genetic material for new signals influencing reproductive communication and thus speciation of mice. PMID:28575204

  17. The Association of Multiple Interacting Genes with Specific Phenotypes in Rice Using Gene Coexpression Networks

    PubMed Central

    Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex

    2010-01-01

    Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
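
As a rough illustration of how coexpressed modules can be read off a coexpression network, the sketch below links gene pairs whose absolute correlation passes a threshold and reports connected components as modules. This is a simplified stand-in, not the two network construction methodologies used in the study (which the abstract does not name); the threshold of 0.8, the use of absolute correlation, and the gene names are all assumptions.

```python
def coexpression_modules(pair_correlations, threshold=0.8):
    """Group genes into modules: connected components of a thresholded network.

    pair_correlations maps (gene_1, gene_2) tuples to a correlation value.
    Assumption: an edge exists when |r| >= threshold.
    """
    adjacency = {}
    for (gene_1, gene_2), r in pair_correlations.items():
        adjacency.setdefault(gene_1, set())
        adjacency.setdefault(gene_2, set())
        if abs(r) >= threshold:
            adjacency[gene_1].add(gene_2)
            adjacency[gene_2].add(gene_1)

    modules = []
    seen = set()
    for gene in sorted(adjacency):
        if gene in seen:
            continue
        # Depth-first traversal collects every gene reachable from this one.
        component = set()
        stack = [gene]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        if len(component) > 1:  # unconnected genes do not form a module
            modules.append(sorted(component))
    return modules


# Hypothetical pairwise correlations among five genes.
toy_pairs = {
    ("a", "b"): 0.90,
    ("b", "c"): 0.85,
    ("a", "c"): 0.20,
    ("c", "d"): 0.10,
    ("d", "e"): -0.95,
}
print(coexpression_modules(toy_pairs))
```

Functional enrichment analysis, as described in the abstract, would then be applied within each module to split it into cofunctional gene clusters.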

  18. The association of multiple interacting genes with specific phenotypes in rice using gene coexpression networks.

    PubMed

    Ficklin, Stephen P; Luo, Feng; Feltus, F Alex

    2010-09-01

    Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.

  19. Hepatic gene expression in rainbow trout (Oncorhynchus mykiss) exposed to different hydrocarbon mixtures.

    PubMed

    Hook, Sharon E; Lampi, Mark A; Febbo, Eric J; Ward, Jeff A; Parkerton, Thomas F

    2010-09-01

    Traditional biomarkers for hydrocarbon exposure are not induced by all petroleum substances. The objective of this study was to determine if exposure to a crude oil and different refined oils would generate a common hydrocarbon-specific response in gene expression profiles that could be used as generic biomarkers of hydrocarbon exposure. Juvenile rainbow trout (Oncorhynchus mykiss) were exposed to the water accommodated fraction (WAF) of either kerosene, gas oil, heavy fuel oil, or crude oil for 96 h. Tissue was collected for RNA extraction and microarray analysis. Exposure to each WAF resulted in a different list of differentially regulated genes, with few genes in common across treatments. Exposure to crude oil WAF changed the expression of genes including cytochrome P4501A (CYP1A) and glutathione-S-transferase (GST) with known roles in detoxification pathways. These gene expression profiles were compared to others from previous experiments that used a diverse suite of toxicants. Clustering algorithms successfully identified gene expression profiles resulting from hydrocarbon exposure. These preliminary analyses highlight the difficulties of using single genes as diagnostic of petroleum hydrocarbon exposures. Further work is needed to determine if multivariate transcriptomic-based biomarkers may be a more effective tool than single gene studies for exposure monitoring of different oils. Copyright 2010 SETAC.

  20. The sirodesmin biosynthetic gene cluster of the plant pathogenic fungus Leptosphaeria maculans.

    PubMed

    Gardiner, Donald M; Cozijnsen, Anton J; Wilson, Leanne M; Pedras, M Soledade C; Howlett, Barbara J

    2004-09-01

    Sirodesmin PL is a phytotoxin produced by the fungus Leptosphaeria maculans, which causes blackleg disease of canola (Brassica napus). This phytotoxin belongs to the epipolythiodioxopiperazine (ETP) class of toxins produced by fungi including mammalian and plant pathogens. We report the cloning of a cluster of genes with predicted roles in the biosynthesis of sirodesmin PL and show via gene disruption that one of these genes (encoding a two-module non-ribosomal peptide synthetase) is essential for sirodesmin PL biosynthesis. Of the nine genes in the cluster tested, all are co-regulated with the production of sirodesmin PL in culture. A similar cluster is present in the genome of the opportunistic human pathogen Aspergillus fumigatus and is most likely responsible for the production of gliotoxin, which is also an ETP. Homologues of the genes in the cluster were also identified in expressed sequence tags of the ETP producing fungus Chaetomium globosum. Two other fungi with publicly available genome sequences, Magnaporthe grisea and Fusarium graminearum, had similar gene clusters. A comparative analysis of all four clusters is presented. This is the first report of the genes responsible for the biosynthesis of an ETP. Copyright 2004 Blackwell Publishing Ltd

  1. Sex, Drugs, and Rock ‘N’ Roll: Hypothesizing Common Mesolimbic Activation as a Function of Reward Gene Polymorphisms

    PubMed Central

    Blum, Kenneth; Werner, Tonia; Carnes, Stefanie; Carnes, Patrick; Bowirrat, Abdalla; Giordano, John; Oscar-Berman, Marlene; Gold, Mark

    2014-01-01

    The nucleus accumbens, a site within the ventral striatum, plays a prominent role in mediating the reinforcing effects of drugs of abuse, food, sex, and other addictions. Indeed, it is generally believed that this structure mandates motivated behaviors such as eating, drinking, and sexual activity, which are elicited by natural rewards and other strong incentive stimuli. This article focuses on sex addiction, but we hypothesize that there is a common underlying mechanism of action for the powerful effects that all addictions have on human motivation. That is, biological drives may have common molecular genetic antecedents, which if impaired, lead to aberrant behaviors. Based on abundant scientific support, we further hypothesize that dopaminergic genes, and possibly other candidate neurotransmitter-related gene polymorphisms, affect both hedonic and anhedonic behavioral outcomes. Genotyping studies already have linked gene polymorphic associations with alcohol and drug addictions and obesity, and we anticipate that future genotyping studies of sex addicts will provide evidence for polymorphic associations with specific clustering of sexual typologies based on clinical instrument assessments. We recommend that scientists and clinicians embark on research coupling the use of neuroimaging tools with dopaminergic agonistic agents to target specific gene polymorphisms systematically for normalizing hyper- or hypo-sexual behaviors. PMID:22641964

  2. The human TREM gene cluster at 6p21.1 encodes both activating and inhibitory single IgV domain receptors and includes NKp44.

    PubMed

    Allcock, Richard J N; Barrow, Alexander D; Forbes, Simon; Beck, Stephan; Trowsdale, John

    2003-02-01

    We have characterized a cluster of single immunoglobulin variable (IgV) domain receptors centromeric of the major histocompatibility complex (MHC) on human chromosome 6. In addition to triggering receptor expressed on myeloid cells (TREM)-1 and TREM2, the cluster contains NKp44, a triggering receptor whose expression is limited to NK cells. We identified three new related genes and two gene fragments within a cluster of approximately 200 kb. Two of the three new genes lack charged residues in their transmembrane domain tails. Further, one of the genes contains two potential immunotyrosine inhibitory motifs in its cytoplasmic tail, suggesting that it delivers inhibitory signals. The human and mouse TREM clusters appear to have diverged such that there are unique sequences in each species. Finally, each gene in the TREM cluster was expressed in a different range of cell types.

  3. Silencing by imprinted noncoding RNAs: is transcription the answer?

    PubMed Central

    Pauler, Florian M.; Koerner, Martha V.; Barlow, Denise P.

    2010-01-01

    Non-coding RNAs (ncRNAs) with gene regulatory functions are starting to be seen as a common feature of mammalian gene regulation with the discovery that most of the transcriptome is ncRNA. The prototype has long been the Xist ncRNA, which induces X-chromosome inactivation in female cells. However, a new paradigm is emerging – the silencing of imprinted gene clusters by long ncRNAs. Here, we review models by which imprinted ncRNAs could function. We argue that an Xist-like model is only one of many possible solutions and that imprinted ncRNAs could provide the better model for understanding the function of the new class of ncRNAs associated with non-imprinted mammalian genes. PMID:17445943

  4. Genomic Epidemiology of Clostridium botulinum Isolates from Temporally Related Cases of Infant Botulism in New South Wales, Australia

    PubMed Central

    Gray, Timothy J.; Wang, Qinning; Ng, Jimmy; Hicks, Leanne; Nguyen, Trang; Yuen, Marion; Hill-Cawthorne, Grant A.; Sintchenko, Vitali

    2015-01-01

    Infant botulism is a potentially life-threatening paralytic disease that can be associated with prolonged morbidity if not rapidly diagnosed and treated. Four infants were diagnosed and treated for infant botulism in NSW, Australia, between May 2011 and August 2013. Despite the temporal relationship between the cases, there was no close geographical clustering or other epidemiological links. Clostridium botulinum isolates, three of which produced botulism neurotoxin serotype A (BoNT/A) and one BoNT serotype B (BoNT/B), were characterized using whole-genome sequencing (WGS). In silico multilocus sequence typing (MLST) found that two of the BoNT/A-producing isolates shared an identical novel sequence type, ST84. The other two isolates were single-locus variants of this sequence type (ST85 and ST86). All BoNT/A-producing isolates contained the same chromosomally integrated BoNT/A2 neurotoxin gene cluster. The BoNT/B-producing isolate carried a single plasmid-borne bont/B gene cluster, encoding BoNT subtype B6. Single nucleotide polymorphism (SNP)-based typing results corresponded well with MLST; however, the extra resolution provided by the whole-genome SNP comparisons showed that the isolates differed from each other by >3,500 SNPs. WGS analyses indicated that the four infant botulism cases were caused by genomically distinct strains of C. botulinum that were unlikely to have originated from a common environmental source. The isolates did, however, cluster together, compared with international isolates, suggesting that C. botulinum from environmental reservoirs throughout NSW have descended from a common ancestor. Analyses showed that the high resolution of WGS provided important phylogenetic information that would not be captured by standard seven-loci MLST. PMID:26109442
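
The two typing resolutions contrasted in this abstract can be sketched in a few lines: an MLST comparison counts loci whose allele numbers differ (a single-locus variant differs at exactly one), while a SNP distance counts differing positions between aligned genomes. The locus names, allele numbers, and sequences below are hypothetical placeholders, not values from the study or from any published MLST scheme.

```python
def allelic_differences(profile_a, profile_b):
    """Number of MLST loci at which two allelic profiles carry different alleles."""
    if profile_a.keys() != profile_b.keys():
        raise ValueError("profiles must cover the same loci")
    return sum(1 for locus in profile_a if profile_a[locus] != profile_b[locus])


def is_single_locus_variant(profile_a, profile_b):
    """True when the two profiles differ at exactly one locus."""
    return allelic_differences(profile_a, profile_b) == 1


def snp_distance(aligned_a, aligned_b, ignore="N-"):
    """Count positions that differ between two aligned sequences.

    Assumption: positions with an ambiguous base or a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a not in ignore and b not in ignore and a != b
    )


# Hypothetical seven-locus profiles (placeholder locus names and allele numbers).
reference_profile = {f"locus{i}": 1 for i in range(1, 8)}
variant_profile = dict(reference_profile, locus3=2)
print(is_single_locus_variant(reference_profile, variant_profile))
print(snp_distance("ACGTNACGT", "ACCTAACGA"))
```

Two isolates can share a sequence type, or differ at a single locus, and still be separated by thousands of SNPs genome-wide, which is the extra resolution the abstract attributes to WGS.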

  5. Genomic Epidemiology of Clostridium botulinum Isolates from Temporally Related Cases of Infant Botulism in New South Wales, Australia.

    PubMed

    McCallum, Nadine; Gray, Timothy J; Wang, Qinning; Ng, Jimmy; Hicks, Leanne; Nguyen, Trang; Yuen, Marion; Hill-Cawthorne, Grant A; Sintchenko, Vitali

    2015-09-01

    Infant botulism is a potentially life-threatening paralytic disease that can be associated with prolonged morbidity if not rapidly diagnosed and treated. Four infants were diagnosed and treated for infant botulism in NSW, Australia, between May 2011 and August 2013. Despite the temporal relationship between the cases, there was no close geographical clustering or other epidemiological links. Clostridium botulinum isolates, three of which produced botulism neurotoxin serotype A (BoNT/A) and one BoNT serotype B (BoNT/B), were characterized using whole-genome sequencing (WGS). In silico multilocus sequence typing (MLST) found that two of the BoNT/A-producing isolates shared an identical novel sequence type, ST84. The other two isolates were single-locus variants of this sequence type (ST85 and ST86). All BoNT/A-producing isolates contained the same chromosomally integrated BoNT/A2 neurotoxin gene cluster. The BoNT/B-producing isolate carried a single plasmid-borne bont/B gene cluster, encoding BoNT subtype B6. Single nucleotide polymorphism (SNP)-based typing results corresponded well with MLST; however, the extra resolution provided by the whole-genome SNP comparisons showed that the isolates differed from each other by >3,500 SNPs. WGS analyses indicated that the four infant botulism cases were caused by genomically distinct strains of C. botulinum that were unlikely to have originated from a common environmental source. The isolates did, however, cluster together, compared with international isolates, suggesting that C. botulinum from environmental reservoirs throughout NSW have descended from a common ancestor. Analyses showed that the high resolution of WGS provided important phylogenetic information that would not be captured by standard seven-loci MLST. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  6. Clustering change patterns using Fourier transformation with time-course gene expression data.

    PubMed

    Kim, Jaehee

    2011-01-01

    To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a period of time because biologically related gene groups can share the same change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. This work is aimed at discovering gene groups with similar change patterns which share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. We applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns.
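
The idea of describing a change pattern by Fourier coefficients of the derivative can be illustrated with a deliberately simplified sketch. Here the derivative is approximated by first differences and the features are the leading discrete Fourier coefficients. This is an assumption-laden stand-in: the study's statistical model, its smoothing parameters, and its model-based clustering step are not reproduced. The function names and toy profiles are hypothetical.

```python
import cmath
import math


def derivative_fourier_features(profile, n_coeffs=3):
    """Leading DFT coefficients of the first differences of a time course.

    Returns a flat list [Re c0, Im c0, Re c1, Im c1, ...].
    """
    if len(profile) < 2:
        raise ValueError("need at least two time points")
    diffs = [later - earlier for earlier, later in zip(profile, profile[1:])]
    n = len(diffs)
    features = []
    for k in range(n_coeffs):
        coeff = sum(
            d * cmath.exp(-2j * cmath.pi * k * t / n) for t, d in enumerate(diffs)
        ) / n
        features.extend([coeff.real, coeff.imag])
    return features


def feature_distance(features_a, features_b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))


# Hypothetical expression profiles over five time points.
gene_low = [1, 2, 3, 2, 1]
gene_high = [11, 12, 13, 12, 11]  # same change pattern, higher baseline
gene_opposite = [3, 2, 1, 2, 3]   # mirrored change pattern

f_low = derivative_fourier_features(gene_low)
f_high = derivative_fourier_features(gene_high)
f_opposite = derivative_fourier_features(gene_opposite)
print(feature_distance(f_low, f_high), feature_distance(f_low, f_opposite))
```

Because the features are built from differences, two genes with the same shape but different baseline expression map to the same feature vector, so any clustering applied to these features groups genes by change pattern rather than by absolute level.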

  7. Clarinet (CLA-1), a novel active zone protein required for synaptic vesicle clustering and release

    PubMed Central

    Nelson, Jessica; Richmond, Janet E; Colón-Ramos, Daniel A; Shen, Kang

    2017-01-01

    Active zone proteins cluster synaptic vesicles at presynaptic terminals and coordinate their release. In forward genetic screens, we isolated a novel Caenorhabditis elegans active zone gene, clarinet (cla-1). cla-1 mutants exhibit defects in synaptic vesicle clustering, active zone structure and synapse number. As a result, they have reduced spontaneous vesicle release and increased synaptic depression. cla-1 mutants show defects in vesicle distribution near the presynaptic dense projection, with fewer undocked vesicles contacting the dense projection and more docked vesicles at the plasma membrane. cla-1 encodes three isoforms containing common C-terminal PDZ and C2 domains with homology to vertebrate active zone proteins Piccolo and RIM. The C-termini of all isoforms localize to the active zone. Specific loss of the ~9000 amino acid long isoform results in vesicle clustering defects and increased synaptic depression. Our data indicate that specific isoforms of clarinet serve distinct functions, regulating synapse development, vesicle clustering and release. PMID:29160205

  8. Variation in the fumonisin biosynthetic gene cluster in fumonisin-producing and nonproducing black aspergilli.

    PubMed

    Susca, Antonia; Proctor, Robert H; Butchko, Robert A E; Haidukowski, Miriam; Stea, Gaetano; Logrieco, Antonio; Moretti, Antonio

    2014-12-01

    The ability to produce fumonisin mycotoxins varies among members of the black aspergilli. Previously, analyses of selected genes in the fumonisin biosynthetic gene (fum) cluster in black aspergilli from California grapes indicated that fumonisin-nonproducing isolates of Aspergillus welwitschiae lack six fum genes, but nonproducing isolates of Aspergillus niger do not. In the current study, analyses of black aspergilli from grapes from the Mediterranean Basin indicate that the genomic context of the fum cluster is the same in isolates of A. niger and A. welwitschiae regardless of fumonisin-production ability and that full-length clusters occur in producing isolates of both species and nonproducing isolates of A. niger. In contrast, the cluster has undergone an eight-gene deletion in fumonisin-nonproducing isolates of A. welwitschiae. Phylogenetic analyses suggest each species consists of a mixed population of fumonisin-producing and nonproducing individuals, and that existence of both production phenotypes may provide a selective advantage to these species. Differences in gene content of fum cluster homologues and phylogenetic relationships of fum genes suggest that the mutation(s) responsible for the nonproduction phenotype differs, and therefore arose independently, in the two species. Partial fum cluster homologues were also identified in genome sequences of four other black Aspergillus species. Gene content of these partial clusters and phylogenetic relationships of fum sequences indicate that non-random partial deletion of the cluster has occurred multiple times among the species. This in turn suggests that an intact cluster and fumonisin production were once more widespread among black aspergilli. Copyright © 2014 Elsevier Inc. All rights reserved.

  9. Discovery of Gene Cluster for Mycosporine-Like Amino Acid Biosynthesis from Actinomycetales Microorganisms and Production of a Novel Mycosporine-Like Amino Acid by Heterologous Expression

    PubMed Central

    Miyamoto, Kiyoko T.; Komatsu, Mamoru

    2014-01-01

    Mycosporines and mycosporine-like amino acids (MAAs), including shinorine (mycosporine-glycine-serine) and porphyra-334 (mycosporine-glycine-threonine), are UV-absorbing compounds produced by cyanobacteria, fungi, and marine micro- and macroalgae. These MAAs have the ability to protect these organisms from damage by environmental UV radiation. Although no reports have described the production of MAAs and the corresponding genes involved in MAA biosynthesis from Gram-positive bacteria to date, genome mining of the Gram-positive bacterial database revealed that two microorganisms belonging to the order Actinomycetales, Actinosynnema mirum DSM 43827 and Pseudonocardia sp. strain P1, possess a gene cluster homologous to the biosynthetic gene clusters identified from cyanobacteria. When the two strains were grown in liquid culture, Pseudonocardia sp. accumulated a very small amount of MAA-like compound in a medium-dependent manner, whereas A. mirum did not produce MAAs under any culture conditions, indicating that the biosynthetic gene cluster of A. mirum was in a cryptic state in this microorganism. In order to characterize these biosynthetic gene clusters, each biosynthetic gene cluster was heterologously expressed in an engineered host, Streptomyces avermitilis SUKA22. Since the resultant transformants carrying the entire biosynthetic gene cluster controlled by an alternative promoter produced mainly shinorine, this is the first confirmation of a biosynthetic gene cluster for MAA from Gram-positive bacteria. Furthermore, S. avermitilis SUKA22 transformants carrying the biosynthetic gene cluster for MAA of A. mirum accumulated not only shinorine and porphyra-334 but also a novel MAA. Structure elucidation revealed that the novel MAA is mycosporine-glycine-alanine, which substitutes l-alanine for the l-serine of shinorine. PMID:24907338

  10. Discovery of gene cluster for mycosporine-like amino acid biosynthesis from Actinomycetales microorganisms and production of a novel mycosporine-like amino acid by heterologous expression.

    PubMed

    Miyamoto, Kiyoko T; Komatsu, Mamoru; Ikeda, Haruo

    2014-08-01

    Mycosporines and mycosporine-like amino acids (MAAs), including shinorine (mycosporine-glycine-serine) and porphyra-334 (mycosporine-glycine-threonine), are UV-absorbing compounds produced by cyanobacteria, fungi, and marine micro- and macroalgae. These MAAs have the ability to protect these organisms from damage by environmental UV radiation. Although no reports have described the production of MAAs and the corresponding genes involved in MAA biosynthesis from Gram-positive bacteria to date, genome mining of the Gram-positive bacterial database revealed that two microorganisms belonging to the order Actinomycetales, Actinosynnema mirum DSM 43827 and Pseudonocardia sp. strain P1, possess a gene cluster homologous to the biosynthetic gene clusters identified from cyanobacteria. When the two strains were grown in liquid culture, Pseudonocardia sp. accumulated a very small amount of MAA-like compound in a medium-dependent manner, whereas A. mirum did not produce MAAs under any culture conditions, indicating that the biosynthetic gene cluster of A. mirum was in a cryptic state in this microorganism. In order to characterize these biosynthetic gene clusters, each biosynthetic gene cluster was heterologously expressed in an engineered host, Streptomyces avermitilis SUKA22. Since the resultant transformants carrying the entire biosynthetic gene cluster controlled by an alternative promoter produced mainly shinorine, this is the first confirmation of a biosynthetic gene cluster for MAA from Gram-positive bacteria. Furthermore, S. avermitilis SUKA22 transformants carrying the biosynthetic gene cluster for MAA of A. mirum accumulated not only shinorine and porphyra-334 but also a novel MAA. Structure elucidation revealed that the novel MAA is mycosporine-glycine-alanine, which substitutes l-alanine for the l-serine of shinorine. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  11. Identification of Loci and Functional Characterization of Trichothecene Biosynthesis Genes in Filamentous Fungi of the Genus Trichoderma

    PubMed Central

    Cardoza, R. E.; Malmierca, M. G.; Hermosa, M. R.; Alexander, N. J.; McCormick, S. P.; Proctor, R. H.; Tijerino, A. M.; Rumbero, A.; Monte, E.; Gutiérrez, S.

    2011-01-01

    Trichothecenes are mycotoxins produced by Trichoderma, Fusarium, and at least four other genera in the fungal order Hypocreales. Fusarium has a trichothecene biosynthetic gene (TRI) cluster that encodes transport and regulatory proteins as well as most enzymes required for the formation of the mycotoxins. However, little is known about trichothecene biosynthesis in the other genera. Here, we identify and characterize TRI gene orthologues (tri) in Trichoderma arundinaceum and Trichoderma brevicompactum. Our results indicate that both Trichoderma species have a tri cluster that consists of orthologues of seven genes present in the Fusarium TRI cluster. Organization of genes in the cluster is the same in the two Trichoderma species but differs from the organization in Fusarium. Sequence and functional analysis revealed that the gene (tri5) responsible for the first committed step in trichothecene biosynthesis is located outside the cluster in both Trichoderma species rather than inside the cluster as it is in Fusarium. Heterologous expression analysis revealed that two T. arundinaceum cluster genes (tri4 and tri11) differ in function from their Fusarium orthologues. The Tatri4-encoded enzyme catalyzes only three of the four oxygenation reactions catalyzed by the orthologous enzyme in Fusarium. The Tatri11-encoded enzyme catalyzes a completely different reaction (trichothecene C-4 hydroxylation) than the Fusarium orthologue (trichothecene C-15 hydroxylation). The results of this study indicate that although some characteristics of the tri/TRI cluster have been conserved during evolution of Trichoderma and Fusarium, the cluster has undergone marked changes, including gene loss and/or gain, gene rearrangement, and divergence of gene function. PMID:21642405

  12. The Finding of a Group IIE Phospholipase A2 Gene in a Specified Segment of Protobothrops flavoviridis Genome and Its Possible Evolutionary Relationship to Group IIA Phospholipase A2 Genes

    PubMed Central

    Yamaguchi, Kazuaki; Chijiwa, Takahito; Ikeda, Naoki; Shibata, Hiroki; Fukumaki, Yasuyuki; Oda-Ueda, Naoko; Hattori, Shosaku; Ohno, Motonori

    2014-01-01

    The genes encoding group IIE phospholipase A2 (IIE PLA2) and their 5' and 3' flanking regions were identified and sequenced in Crotalinae snakes, including Protobothrops flavoviridis, P. tokarensis, P. elegans, and Ovophis okinavensis. The genes consist of four exons and three introns and encode signal peptides of 22 or 24 amino acid residues and mature proteins of 134 amino acid residues. These IIE PLA2s show high similarity to those from mammals and Colubridae snakes. The high expression level of IIE PLA2s in Crotalinae venom glands suggests that they function as venom proteins. BLAST analysis indicated that the gene encoding ovarian tumor domain-containing protein 3 (OTUD3) is located downstream (3') of the IIE PLA2 gene. Moreover, a group IIA PLA2 gene was found upstream (5') of the IIE PLA2 gene linked to OTUD3 in the P. flavoviridis genome. The arrangement of the IIA PLA2 gene, the IIE PLA2 gene, and OTUD3, in this order, is thus common to genomes ranging from humans to snakes. The present finding that genes encoding various secretory PLA2s form a cluster in genomes ranging from humans to birds is closely related to the previous finding that six venom PLA2 isozyme genes are densely clustered in the so-called NIS-1 fragment of the P. flavoviridis genome. It is also suggested that venom IIA PLA2 genes may be evolutionarily derived from the IIE PLA2 gene. PMID:25529307

  13. Clustered Genes Involved in Cyclopiazonic Acid Production are Next to the Aflatoxin Biosynthesis Gene Cluster in Aspergillus flavus

    USDA-ARS's Scientific Manuscript database

    Cyclopiazonic acid (CPA), an indole-tetramic acid toxin, is produced by many species of Aspergillus and Penicillium. In addition to CPA Aspergillus flavus produces polyketide-derived carcinogenic aflatoxins (AFs). AF biosynthesis genes form a gene cluster in a subtelomeric region. Isolates of A. fla...

  14. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

    PubMed

    Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  15. Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

    PubMed Central

    Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417

  16. Molecular serotyping and antimicrobial resistance profiles of Actinobacillus pleuropneumoniae isolated from pigs in South Korea.

    PubMed

    Kim, Boram; Hur, Jin; Lee, Ji Yeong; Choi, Yoonyoung; Lee, John Hwa

    2016-09-01

    Actinobacillus pleuropneumoniae (APP) causes porcine pleuropneumonia (PP). Serotypes and antimicrobial resistance patterns in APP isolates from pigs in Korea were examined. Sixty-five APP isolates were genetically serotyped using standard and multiplex PCR (polymerase chain reaction). Antimicrobial susceptibilities were tested using the standardized disk-agar method. PCR was used to detect β-lactam, gentamicin and tetracycline-resistance genes. The random amplified polymorphic DNA (RAPD) patterns were determined by PCR. Korean pigs predominantly carried APP serotypes 1 and 5. Among 65 isolates, one isolate was sensitive to all 12 antimicrobials tested in this study. Sixty-two isolates were resistant to tetracycline and 53 isolates carried one or five genes including tet(B), tet(A), tet(H), tet(M)/tet(O), tet(C), tet(G) and/or tet(L)-1 markers. Among 64 strains, 9% and 26.6% were resistant to 10 and three or more antimicrobials, respectively. Thirteen different antimicrobial resistance patterns were observed and RAPD analysis revealed a separation of the isolates into two clusters: cluster II (6 strains resistant to 10 antimicrobials) and cluster I (the other 59 strains). Results show that APP serotypes 1 and 5 are the most common in Korea, and multi-drug resistant strains are prevalent. RAPD analysis demonstrated that six isolates resistant to 10 antimicrobials belonged to the same cluster.

  17. Unsupervised discovery of microbial population structure within metagenomes using nucleotide base composition

    PubMed Central

    Saeed, Isaam; Tang, Sen-Lin; Halgamuge, Saman K.

    2012-01-01

    An approach to infer the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers, which uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups at the first tier of clustering; and tetranucleotide frequency to refine these groups at the secondary clustering tier. The proposed method has a demonstrated improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method is then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis. PMID:22180538
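
    As a rough illustration of the composition features this abstract mentions (tetranucleotide frequency and GC content), the following minimal Python sketch computes both for a single sequence. It is not the authors' binning method: the error-gradient transformation and the two-tier model-based clustering are not reproduced, and the names `KMERS`, `tetranucleotide_frequencies`, and `gc_content` are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import product

# All 256 tetranucleotides in fixed lexicographic order (illustrative constant).
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the normalized frequency of each 4-mer, in KMERS order.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def gc_content(seq):
    """Return the fraction of unambiguous bases that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt
```

    Vectors of this kind could then be passed to any clustering routine; the choice of a fixed k-mer order is only there so that vectors from different sequences are comparable position by position.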

  18. Interstitial telomeric sequences in human chromosomes cluster with common fragile sites, mutagen sensitive sites, viral integration sites, cancer breakpoints, proto-oncogenes and breakpoints involved in primate evolution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adekunle, S.S.A.; Wyandt, H.; Mark, H.F.L.

    1994-09-01

    Recently we mapped the telomeric repeat sequences to 111 interstitial sites in the human genome and to sites of gaps and breaks induced by aphidicolin and sister chromatid exchange sites detected by BrdU. Many of these sites correspond to conserved fragile sites in man, gorilla and chimpanzee, to sites of conserved sister chromatid exchange in the mammalian X chromosome, to mutagenic sensitive sites, mapped locations of proto-oncogenes, breakpoints implicated in primate evolution and to breakpoints indicated as the sole anomaly in neoplasia. This observation prompted us to investigate whether the interstitial telomeric sites cluster with these sites. An extensive literature search was carried out to find all the available published sites mentioned above. For comparison, we also carried out a statistical analysis of the clustering of the sites of the telomeric repeats with the gene locations where only nucleotide mutations have been observed as the only chromosomal abnormality. Our results indicate that the telomeric repeats cluster most with fragile sites, mutagenic sensitive sites and breakpoints implicated in primate evolution and least with cancer breakpoints, mapped locations of proto-oncogenes and other genes with nucleotide mutations.

  19. Sequence analyses reveal that a TPR-DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR-DP domains and prokaryotic GerD proteins.

    PubMed

    Hernández Torres, Jorge; Papandreou, Nikolaos; Chomilier, Jacques

    2009-05-01

    The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR-DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR-DP domains.

  20. Comparison of expression of secondary metabolite biosynthesis cluster genes in Aspergillus flavus, A. parasiticus, and A. oryzae.

    PubMed

    Ehrlich, Kenneth C; Mack, Brian M

    2014-06-23

    Fifty-six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity.

  1. Comparison of Expression of Secondary Metabolite Biosynthesis Cluster Genes in Aspergillus flavus, A. parasiticus, and A. oryzae

    PubMed Central

    Ehrlich, Kenneth C.; Mack, Brian M.

    2014-01-01

    Fifty-six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity. PMID:24960201

  2. Recurrent Rearrangements of Human Amylase Genes Create Multiple Independent CNV Series.

    PubMed

    Shwan, Nzar A A; Louzada, Sandra; Yang, Fengtang; Armour, John A L

    2017-05-01

    The human amylase gene cluster includes the human salivary (AMY1) and pancreatic amylase genes (AMY2A and AMY2B), and is a highly variable and dynamic region of the genome. Copy number variation (CNV) of AMY1 has been implicated in human dietary adaptation, and in population association with obesity, but neither of these findings has been independently replicated. Despite these functional implications, the structural genomic basis of CNV has only been defined in detail very recently. In this work, we use high-resolution analysis of copy number, and analysis of segregation in trios, to define new, independent allelic series of amylase CNVs in sub-Saharan Africans, including a series of higher-order expansions of a unit consisting of one copy each of AMY1, AMY2A, and AMY2B. We use fiber-FISH (fluorescence in situ hybridization) to define unexpected complexity in the accompanying rearrangements. These findings demonstrate recurrent involvement of the amylase gene region in genomic instability, involving at least five independent rearrangements of the pancreatic amylase genes (AMY2A and AMY2B). Structural features shared by fundamentally distinct lineages strongly suggest that the common ancestral state for the human amylase cluster contained more than one, and probably three, copies of AMY1. © 2017 WILEY PERIODICALS, INC.

  3. Associations among child abuse, mental health, and epigenetic modifications in the proopiomelanocortin gene (POMC): A study with children in Tanzania.

    PubMed

    Hecker, Tobias; Radtke, Karl M; Hermenau, Katharin; Papassotiropoulos, Andreas; Elbert, Thomas

    2016-11-01

    Child abuse is associated with a number of emotional and behavioral problems. Nevertheless, it has been argued that these adverse consequences may not hold for societies in which many of the specific acts of abuse are culturally normed. Epigenetic modifications in the genes of the hypothalamus-pituitary-adrenal axis may provide a potential mechanism translating abuse into altered gene expression, which subsequently results in behavioral changes. Our investigation took place in Tanzania, a society in which many forms of abuse are commonly employed as disciplinary methods. We included 35 children with high exposure and compared them to 25 children with low exposure. Extreme group comparisons revealed that children with high exposure reported more mental health problems. Child abuse was associated with differential methylation in the proopiomelanocortin gene (POMC), measured both in saliva and in blood. Hierarchical clustering based on the methylation of the POMC gene found two distinct clusters. These corresponded with children's self-reported abuse, with two-thirds of the children allocated into their respective group. Our results emphasize the consequences of child abuse based on both molecular and behavioral grounds, providing further evidence that acts of abuse affect children, even when culturally acceptable. Furthermore, on a molecular level, our findings strengthen the credibility of children's self-reports.

  4. Identification and comparative analysis of the epidermal differentiation complex in snakes

    PubMed Central

    Brigit Holthaus, Karin; Mlitz, Veronika; Strasser, Bettina; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

    2017-01-01

    The epidermis of snakes efficiently protects against dehydration and mechanical stress. However, only few proteins of the epidermal barrier to the environment have so far been identified in snakes. Here, we determined the organization of the Epidermal Differentiation Complex (EDC), a cluster of genes encoding protein constituents of cornified epidermal structures, in snakes and compared it to the EDCs of other squamates and non-squamate reptiles. The EDC of snakes displays shared synteny with that of the green anole lizard, including the presence of a cluster of corneous beta-protein (CBP)/beta-keratin genes. We found that a unique CBP comprising 4 putative beta-sheets and multiple cysteine-rich EDC proteins are conserved in all snakes and other squamates investigated. Comparative genomics of squamates suggests that the evolution of snakes was associated with a gene duplication generating two isoforms of the S100 fused-type protein, scaffoldin, the origin of distinct snake-specific EDC genes, and the loss of other genes that were present in the EDC of the last common ancestor of snakes and lizards. Taken together, our results provide new insights into the evolution of the skin in squamates and a basis for the characterization of the molecular composition of the epidermis in snakes. PMID:28345630

  5. Type 2 diabetes mellitus: distribution of genetic markers in Kazakh population

    PubMed Central

    Sikhayeva, Nurgul; Talzhanov, Yerkebulan; Iskakova, Aisha; Dzharmukhanov, Jarkyn; Nugmanova, Raushan; Zholdybaeva, Elena; Ramanculov, Erlan

    2018-01-01

    Background Ethnic differences exist in the frequencies of genetic variations that contribute to the risk of common disease. This study aimed to analyse the distribution of several genes, previously associated with susceptibility to type 2 diabetes and obesity-related phenotypes, in a Kazakh population. Methods A total of 966 individuals belonging to the Kazakh ethnicity were recruited from an outpatient clinic. We genotyped 41 common single nucleotide polymorphisms (SNPs) previously associated with type 2 diabetes in other ethnic groups and 31 of these were in Hardy–Weinberg equilibrium. The obtained allele frequencies were further compared to publicly available data from other ethnic populations. Allele frequencies for other (compared) populations were pooled from the haplotype map (HapMap) database. Principal component analysis (PCA), cluster analysis, and multidimensional scaling (MDS) were used for the analysis of genetic relationship between the populations. Results Comparative analysis of allele frequencies of the studied SNPs showed significant differentiation among the studied populations. The Kazakh population was grouped with Asian populations according to the cluster analysis and with the Caucasian populations according to PCA. According to MDS, results of the current study show that the Kazakh population holds an intermediate position between Caucasian and Asian populations. Conclusion A high percentage of population differentiation was observed between Kazakh and world populations. The Kazakh population was clustered with Caucasian populations, and this result may indicate a significant Caucasian component in the Kazakh gene pool. PMID:29551892
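
    The abstract reports that 31 of the 41 genotyped SNPs were in Hardy–Weinberg equilibrium but does not say which test was used. As a hedged sketch of one common way to make such a check, the Python example below applies a chi-square goodness-of-fit test with one degree of freedom to genotype counts for a biallelic SNP. The function names `hwe_chi_square` and `in_hwe` and the 0.05 significance level are assumptions made for this illustration, not details taken from the study.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_DF1_05 = 3.841


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations for a biallelic locus."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic loci).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def in_hwe(n_aa, n_ab, n_bb):
    """True if the statistic is below the assumed 0.05 critical value."""
    return hwe_chi_square(n_aa, n_ab, n_bb) < CHI2_CRITICAL_DF1_05
```

    For example, counts of 25, 50 and 25 match the expected proportions exactly and give a statistic of 0, whereas 50, 0 and 50 (no heterozygotes) give a statistic of 100 and would be flagged as deviating. For small samples or rare alleles an exact test is often preferred over this approximation.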

  6. Gene Cluster Encoding Cholate Catabolism in Rhodococcus spp.

    PubMed Central

    Wilbrink, Maarten H.; Casabon, Israël; Stewart, Gordon R.; Liu, Jie; van der Geize, Robert; Eltis, Lindsay D.

    2012-01-01

    Bile acids are highly abundant steroids with important functions in vertebrate digestion. Their catabolism by bacteria is an important component of the carbon cycle, contributes to gut ecology, and has potential commercial applications. We found that Rhodococcus jostii RHA1 grows well on cholate, as well as on its conjugates, taurocholate and glycocholate. The transcriptome of RHA1 growing on cholate revealed 39 genes upregulated on cholate, occurring in a single gene cluster. Reverse transcriptase quantitative PCR confirmed that selected genes in the cluster were upregulated 10-fold on cholate versus on cholesterol. One of these genes, kshA3, encoding a putative 3-ketosteroid-9α-hydroxylase, was deleted and found essential for growth on cholate. Two coenzyme A (CoA) synthetases encoded in the cluster, CasG and CasI, were heterologously expressed. CasG was shown to transform cholate to cholyl-CoA, thus initiating side chain degradation. CasI was shown to form CoA derivatives of steroids with isopropanoyl side chains, likely occurring as degradation intermediates. Orthologous gene clusters were identified in all available Rhodococcus genomes, as well as that of Thermomonospora curvata. Moreover, Rhodococcus equi 103S, Rhodococcus ruber Chol-4 and Rhodococcus erythropolis SQ1 each grew on cholate. In contrast, several mycolic acid bacteria lacking the gene cluster were unable to grow on cholate. Our results demonstrate that the above-mentioned gene cluster encodes cholate catabolism and is distinct from a more widely occurring gene cluster encoding cholesterol catabolism. PMID:23024343

  7. Frequent genomic imbalances suggest commonly altered tumour genes in human hepatocarcinogenesis

    PubMed Central

    Niketeghad, F; Decker, H J; Caselmann, W H; Lund, P; Geissler, F; Dienes, H P; Schirmacher, P

    2001-01-01

    Hepatocellular carcinoma (HCC) is one of the most frequently occurring malignant tumours worldwide, but molecular changes of tumour DNA, with the exception of viral integrations and p53 mutations, are poorly understood. In order to search for common macro-imbalances of genomic tumour DNA, 21 HCCs and 3 HCC-cell lines were characterized by comparative genomic hybridization (CGH), subsequent database analyses and in selected cases by fluorescence in situ hybridization (FISH). Chromosomal subregions of 1q, 8q, 17q and 20q showed frequent gains of genomic material, while losses were most prevalent in subregions of 4q, 6q, 13q and 16q. Deleted regions encompass tumour suppressor genes, like RB-1 and the cadherin gene cluster, some of them previously identified as potential target genes in HCC development. Several potential growth- or transformation-promoting genes located in chromosomal subregions showed frequent gains of genomic material. The present study provides a basis for further genomic and expression analyses in HCCs and in addition suggests that chromosome 4q carries an as yet unidentified tumour suppressor gene relevant for HCC development. © 2001 Cancer Research Campaign http://www.bjcancer.com PMID:11531255

  8. Evolution of homeobox genes.

    PubMed

    Holland, Peter W H

    2013-01-01

    Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest. Copyright © 2012 Wiley Periodicals, Inc.

  9. Hox gene clusters in the Indonesian coelacanth, Latimeria menadoensis

    PubMed Central

    Koh, Esther G. L.; Lam, Kevin; Christoffels, Alan; Erdmann, Mark V.; Brenner, Sydney; Venkatesh, Byrappa

    2003-01-01

    The Hox genes encode transcription factors that play a key role in specifying body plans of metazoans. They are organized into clusters that contain up to 13 paralogue group members. The complex morphology of vertebrates has been attributed to the duplication of Hox clusters during vertebrate evolution. In contrast to the single Hox cluster in the amphioxus (Branchiostoma floridae), an invertebrate-chordate, mammals have four clusters containing 39 Hox genes. Ray-finned fishes (Actinopterygii) such as zebrafish and fugu possess more than four Hox clusters. The coelacanth occupies a basal phylogenetic position among lobe-finned fishes (Sarcopterygii), which gave rise to the tetrapod lineage. The lobe fins of sarcopterygians are considered to be the evolutionary precursors of tetrapod limbs. Thus, the characterization of Hox genes in the coelacanth should provide insights into the origin of tetrapod limbs. We have cloned the complete second exon of 33 Hox genes from the Indonesian coelacanth, Latimeria menadoensis, by extensive PCR survey and genome walking. Phylogenetic analysis shows that 32 of these genes have orthologs in the four mammalian HOX clusters, including three genes (HoxA6, D1, and D8) that are absent in ray-finned fishes. The remaining coelacanth gene is an ortholog of hoxc1 found in zebrafish but absent in mammals. Our results suggest that coelacanths have four Hox clusters bearing a gene complement more similar to mammals than to ray-finned fishes, but with an additional gene, HoxC1, which has been lost during the evolution of mammals from lobe-finned fishes. PMID:12547909

  10. Hox gene clusters in the Indonesian coelacanth, Latimeria menadoensis.

    PubMed

    Koh, Esther G L; Lam, Kevin; Christoffels, Alan; Erdmann, Mark V; Brenner, Sydney; Venkatesh, Byrappa

    2003-02-04

    The Hox genes encode transcription factors that play a key role in specifying body plans of metazoans. They are organized into clusters that contain up to 13 paralogue group members. The complex morphology of vertebrates has been attributed to the duplication of Hox clusters during vertebrate evolution. In contrast to the single Hox cluster in the amphioxus (Branchiostoma floridae), an invertebrate-chordate, mammals have four clusters containing 39 Hox genes. Ray-finned fishes (Actinopterygii) such as zebrafish and fugu possess more than four Hox clusters. The coelacanth occupies a basal phylogenetic position among lobe-finned fishes (Sarcopterygii), which gave rise to the tetrapod lineage. The lobe fins of sarcopterygians are considered to be the evolutionary precursors of tetrapod limbs. Thus, the characterization of Hox genes in the coelacanth should provide insights into the origin of tetrapod limbs. We have cloned the complete second exon of 33 Hox genes from the Indonesian coelacanth, Latimeria menadoensis, by extensive PCR survey and genome walking. Phylogenetic analysis shows that 32 of these genes have orthologs in the four mammalian HOX clusters, including three genes (HoxA6, D1, and D8) that are absent in ray-finned fishes. The remaining coelacanth gene is an ortholog of hoxc1 found in zebrafish but absent in mammals. Our results suggest that coelacanths have four Hox clusters bearing a gene complement more similar to mammals than to ray-finned fishes, but with an additional gene, HoxC1, which has been lost during the evolution of mammals from lobe-finned fishes.

  11. Characterization of a Major Cluster of nif, fix, and Associated Genes in a Sugarcane Endophyte, Acetobacter diazotrophicus

    PubMed Central

    Lee, Sunhee; Reth, Alexander; Meletzus, Dietmar; Sevilla, Myrna; Kennedy, Christina

    2000-01-01

    A major 30.5-kb cluster of nif and associated genes of Acetobacter diazotrophicus (syn. Gluconacetobacter diazotrophicus), a nitrogen-fixing endophyte of sugarcane, was sequenced and analyzed. This cluster represents the largest assembly of contiguous nif-fix and associated genes so far characterized in any diazotrophic bacterial species. Northern blots and promoter sequence analysis indicated that the genes are organized into eight transcriptional units. The overall arrangement of genes is most like that of the nif-fix cluster in Azospirillum brasilense, while the individual gene products are more similar to those in species of Rhizobiaceae or in Rhodobacter capsulatus. PMID:11092875

  12. Distribution of Suicin Gene Clusters in Streptococcus suis Serotype 2 Belonging to Sequence Types 25 and 28.

    PubMed

    Athey, Taryn B T; Vaillancourt, Katy; Frenette, Michel; Fittipaldi, Nahuel; Gottschalk, Marcelo; Grenier, Daniel

    2016-01-01

    Recently, we reported the purification and characterization of three distinct lantibiotics (named suicin 90-1330, suicin 3908, and suicin 65) produced by Streptococcus suis. In this study, we investigated the distribution of the three suicin lantibiotic gene clusters among serotype 2 S. suis strains belonging to sequence type (ST) 25 and ST28, the two dominant STs identified in North America. The genomes of 102 strains were interrogated for the presence of suicin gene clusters encoding suicins 90-1330, 3908, and 65. The gene cluster encoding suicin 65 was the most prevalent and mainly found among ST25 strains. In contrast, none of the genes related to suicin 90-1330 production were identified in 51 ST25 strains nor in 35/51 ST28 strains. However, the complete suicin 90-1330 gene cluster was found in ten ST28 strains, although some genes in the cluster were truncated in three of these isolates. The vast majority (101/102) of S. suis strains did not possess any of the genes encoding suicin 3908. In conclusion, this study indicates heterogeneous distribution of suicin genes in S. suis.

  13. Virulence Factors and Antibiotic Susceptibility of Staphylococcus aureus Isolates in Ready-to-Eat Foods: Detection of S. aureus Contamination and a High Prevalence of Virulence Genes

    PubMed Central

    Puah, Suat Moi; Chua, Kek Heng; Tan, Jin Ai Mary Anne

    2016-01-01

    Staphylococcus aureus is one of the leading causes of food poisoning. Its pathogenicity results from the possession of virulence genes that produce different toxins which result in self-limiting to severe illness often requiring hospitalization. In this study of 200 sushi and sashimi samples, S. aureus contamination was confirmed in 26% of the food samples. The S. aureus isolates were further characterized for virulence genes and antibiotic susceptibility. A high incidence of virulence genes was identified in 96.2% of the isolates and 20 different virulence gene profiles were confirmed. DNA amplification showed that 30.8% (16/52) of the S. aureus carried at least one SE gene which causes staphylococcal food poisoning. The most common enterotoxin gene was seg (11.5%) and the egc cluster was detected in 5.8% of the isolates. A combination of hla and hld was the most prevalent coexistence virulence genes and accounted for 59.6% of all isolates. Antibiotic resistance studies showed tetracycline resistance to be the most common at 28.8% while multi-drug resistance was found to be low at 3.8%. In conclusion, the high rate of S. aureus in the sampled sushi and sashimi indicates the need for food safety guidelines. PMID:26861367
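
    The percentages in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that every percentage after the first is taken over the 52 isolates implied by 26% of 200 samples; the abstract states 16/52 explicitly, but the other counts (50, 6, 3, 31, 15, 2) are inferred here and are not reported by the authors.

```python
# Consistency check for the reported percentages (counts other than 16/52
# are inferred for illustration, not taken from the abstract).
samples = 200
isolates = round(samples * 0.26)  # 26% of the samples were contaminated


def pct(count, total=isolates):
    """Percentage of isolates, rounded to one decimal as in the abstract."""
    return round(100.0 * count / total, 1)


inferred_counts = {
    "any virulence gene": 50,       # reported as 96.2%
    "at least one SE gene": 16,     # reported as 30.8% (16/52, stated)
    "seg": 6,                       # reported as 11.5%
    "egc cluster": 3,               # reported as 5.8%
    "hla + hld": 31,                # reported as 59.6%
    "tetracycline resistance": 15,  # reported as 28.8%
    "multi-drug resistance": 2,     # reported as 3.8%
}

for name, count in inferred_counts.items():
    print(f"{name}: {count}/{isolates} = {pct(count)}%")
```

    Under that assumption each reported percentage corresponds to a whole number of isolates, which suggests the figures are internally consistent.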

  14. Virulence Factors and Antibiotic Susceptibility of Staphylococcus aureus Isolates in Ready-to-Eat Foods: Detection of S. aureus Contamination and a High Prevalence of Virulence Genes.

    PubMed

    Puah, Suat Moi; Chua, Kek Heng; Tan, Jin Ai Mary Anne

    2016-02-05

    Staphylococcus aureus is one of the leading causes of food poisoning. Its pathogenicity results from the possession of virulence genes that produce different toxins which result in self-limiting to severe illness often requiring hospitalization. In this study of 200 sushi and sashimi samples, S. aureus contamination was confirmed in 26% of the food samples. The S. aureus isolates were further characterized for virulence genes and antibiotic susceptibility. A high incidence of virulence genes was identified in 96.2% of the isolates and 20 different virulence gene profiles were confirmed. DNA amplification showed that 30.8% (16/52) of the S. aureus carried at least one SE gene which causes staphylococcal food poisoning. The most common enterotoxin gene was seg (11.5%) and the egc cluster was detected in 5.8% of the isolates. A combination of hla and hld was the most prevalent coexistence virulence genes and accounted for 59.6% of all isolates. Antibiotic resistance studies showed tetracycline resistance to be the most common at 28.8% while multi-drug resistance was found to be low at 3.8%. In conclusion, the high rate of S. aureus in the sampled sushi and sashimi indicates the need for food safety guidelines.

  15. Genome-wide association identifies genetic variants associated with lentiform nucleus volume in N = 1345 young and elderly subjects.

    PubMed

    Hibar, Derrek P; Stein, Jason L; Ryles, April B; Kohannim, Omid; Jahanshad, Neda; Medland, Sarah E; Hansell, Narelle K; McMahon, Katie L; de Zubicaray, Greig I; Montgomery, Grant W; Martin, Nicholas G; Wright, Margaret J; Saykin, Andrew J; Jack, Clifford R; Weiner, Michael W; Toga, Arthur W; Thompson, Paul M

    2013-06-01

    Deficits in lentiform nucleus volume and morphometry are implicated in a number of genetically influenced disorders, including Parkinson's disease, schizophrenia, and ADHD. Here we performed genome-wide searches to discover common genetic variants associated with differences in lentiform nucleus volume in human populations. We assessed structural MRI scans of the brain in two large genotyped samples: the Alzheimer's Disease Neuroimaging Initiative (ADNI; N = 706) and the Queensland Twin Imaging Study (QTIM; N = 639). Statistics of association from each cohort were combined meta-analytically using a fixed-effects model to boost power and to reduce the prevalence of false positive findings. We identified a number of associations in and around the flavin-containing monooxygenase (FMO) gene cluster. The most highly associated SNP, rs1795240, was located in the FMO3 gene; after meta-analysis, it showed genome-wide significant evidence of association with lentiform nucleus volume (P_MA = 4.79 × 10^-8). This commonly-carried genetic variant accounted for 2.68% and 0.84% of the trait variability in the ADNI and QTIM samples, respectively, even though the QTIM sample was on average 50 years younger. Pathway enrichment analysis revealed significant contributions of this gene to the cytochrome P450 pathway, which is involved in metabolizing numerous therapeutic drugs for pain, seizures, mania, depression, anxiety, and psychosis. The genetic variants we identified provide replicated, genome-wide significant evidence for the FMO gene cluster's involvement in lentiform nucleus volume differences in human populations.
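
    The abstract does not spell out how the two cohorts were combined beyond naming a fixed-effects model. A common form of that model is inverse-variance weighting, sketched below under that assumption. The function name and the per-cohort effect sizes are hypothetical and are not values from the study.

```python
import math


def fixed_effects_meta(estimates):
    """Pool (beta, standard_error) pairs with inverse-variance weights.

    Returns the pooled effect, its standard error, the z statistic and a
    two-sided p-value from the standard normal distribution.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # P(|Z| > |z|)
    return beta, se, z, p


# Hypothetical per-cohort estimates (beta, standard error); illustration only.
cohorts = [(0.30, 0.08), (0.22, 0.07)]
beta, se, z, p = fixed_effects_meta(cohorts)
print(f"pooled beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.2e}")
```

    The pooled standard error is smaller than that of either cohort alone, which is the sense in which combining cohorts boosts power.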

  16. Environmental history impacts gene expression during diapause development in the alfalfa leafcutting bee, Megachile rotundata.

    PubMed

    Yocum, George D; Childers, Anna K; Rinehart, Joseph P; Rajamohan, Arun; Pitts-Singer, Theresa L; Greenlee, Kendra J; Bowsher, Julia H

    2018-05-10

    Our understanding of the mechanisms controlling insect diapause has increased dramatically with the introduction of global gene expression techniques, such as RNA-seq. However, little attention has been given to how ecologically relevant field conditions may affect gene expression during diapause development because previous studies have focused on laboratory reared and maintained insects. To determine whether gene expression differs between laboratory and field conditions, prepupae of the alfalfa leafcutting bee, Megachile rotundata, entering diapause early or late in the growing season were collected. These two groups were further subdivided in early autumn into laboratory and field maintained groups, resulting in four experimental treatments of diapausing prepupae: early and late field, and early and late laboratory. RNA-seq and differential expression analyses were performed on bees from the four treatment groups in November, January, March and May. The number of treatment-specific differentially expressed genes (97 to 1249) outnumbered the number of differentially regulated genes common to all four treatments (14 to 229), indicating that exposure to laboratory or field conditions had a major impact on gene expression during diapause development. Principal component analysis and hierarchical cluster analysis yielded similar grouping of treatments, confirming that the treatments form distinct clusters. Our results support the conclusion that gene expression during the course of diapause development is not a simple ordered sequence, but rather a highly plastic response determined primarily by the environmental history of the individual insect. © 2018. Published by The Company of Biologists Ltd.

  17. Comparing the performance of biomedical clustering methods.

    PubMed

    Wiwie, Christian; Baumbach, Jan; Röttger, Richard

    2015-11-01

    Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.
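
    The abstract does not list the 13 cluster validity indices it used. As an illustration of what such an index measures, the sketch below computes the mean silhouette width, one widely used index; whether it is among the 13 is an assumption, and the toy points and labels are hypothetical.

```python
import math


def silhouette(points, labels):
    """Mean silhouette width of a labelled point set.

    For each point, a is the mean distance to its own cluster and b is the
    smallest mean distance to any other cluster; the score is
    (b - a) / max(a, b). Values near 1 indicate compact, well-separated clusters.
    """
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # convention for singletons or a single cluster
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight groups far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good labelling: {good:.3f}, bad labelling: {bad:.3f}")
```

    An index of this kind lets two clusterings of the same data be ranked without ground-truth labels.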

  18. A Putative Gene Cluster from a Lyngbya wollei Bloom that Encodes Paralytic Shellfish Toxin Biosynthesis

    PubMed Central

    Mihali, Troco K.; Carmichael, Wayne W.; Neilan, Brett A.

    2011-01-01

    Saxitoxin and its analogs cause the paralytic shellfish-poisoning syndrome, adversely affecting human health and coastal shellfish industries worldwide. Here we report the isolation, sequencing, annotation, and predicted pathway of the saxitoxin biosynthetic gene cluster in the cyanobacterium Lyngbya wollei. The gene cluster spans 36 kb and encodes enzymes for the biosynthesis and export of the toxins. The Lyngbya wollei saxitoxin gene cluster differs from previously identified saxitoxin clusters as it contains genes that are unique to this cluster, whereby the carbamoyltransferase is truncated and replaced by an acyltransferase, explaining the unique toxin profile presented by Lyngbya wollei. These findings will enable the creation of toxin probes, for water monitoring purposes, as well as proof-of-concept for the combinatorial biosynthesis of these naturally occurring alkaloids for the production of novel, biologically active compounds. PMID:21347365

  19. Revisiting the variation of clustering coefficient of biological networks suggests new modular structure.

    PubMed

    Hao, Dapeng; Ren, Cong; Li, Chuanxing

    2012-05-01

    A central idea in biology is the hierarchical organization of cellular processes. A commonly used method to identify the hierarchical modular organization of a network relies on detecting a global signature known as variation of clustering coefficient (so-called modularity scaling). Although several studies have suggested other possible origins of this signature, it is still widely used nowadays to identify hierarchical modularity, especially in the analysis of biological networks. Therefore, a further, systematic investigation of this signature for different types of biological networks is necessary. We analyzed a variety of biological networks and found that the commonly used signature of hierarchical modularity is actually the reflection of spoke-like topology, suggesting a different view of network architecture. We proved that the existence of super-hubs is the reason that the clustering coefficient of a node follows a particular scaling law with degree k in metabolic networks. To study the modularity of biological networks, we systematically investigated the relationship between repulsion of hubs and variation of clustering coefficient. We provided direct evidence for repulsion between hubs being the underlying origin of the variation of clustering coefficient, and found that for biological networks having no anti-correlation between hubs, such as gene co-expression networks, the clustering coefficient doesn't show dependence on degree. Here we have shown that the variation of clustering coefficient is neither sufficient nor exclusive for a network to be hierarchical. Our results suggest the existence of spoke-like modules as opposed to the "deterministic model" of hierarchical modularity, and suggest the need to reconsider the organizational principle of biological hierarchy.
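
    The signature discussed in this abstract is the dependence of a node's clustering coefficient on its degree k. The abstract gives no formula, so the sketch below assumes the standard local definition, C = 2e / (k(k - 1)), where e is the number of links among the node's k neighbours. The helper names and the toy spoke-like graph are hypothetical.

```python
from itertools import combinations


def from_edges(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # undefined for k < 2; 0 is a common convention
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical spoke-like graph: hub 0 with four spokes, plus one edge 1-2.
graph = from_edges([(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)])
for node in sorted(graph):
    print(node, len(graph[node]), round(local_clustering(graph, node), 3))
```

    In this toy graph the high-degree hub has a low coefficient (1/6) while the degree-2 nodes attached to it have a coefficient of 1, so a spoke-like arrangement alone can make the coefficient fall with degree.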

  20. The evolutionary life cycle of the polysaccharide biosynthetic gene cluster based on the Sphingomonadaceae.

    PubMed

    Wu, Mengmeng; Huang, Haidong; Li, Guoqiang; Ren, Yi; Shi, Zhong; Li, Xiaoyan; Dai, Xiaohui; Gao, Ge; Ren, Mengnan; Ma, Ting

    2017-04-21

    Although clustering of genes from the same metabolic pathway is a widespread phenomenon, the evolution of the polysaccharide biosynthetic gene cluster remains poorly understood. To determine the evolution of this pathway, we identified a scattered production pathway of the polysaccharide sanxan by Sphingomonas sanxanigenens NX02, and compared the distribution of genes between sphingan-producing and other Sphingomonadaceae strains. This allowed us to determine how the scattered sanxan pathway developed, and how the polysaccharide gene cluster evolved. Our findings suggested that the evolution of microbial polysaccharide biosynthesis gene clusters is a lengthy cyclic process comprising cluster 1 → scatter → cluster 2. The sanxan biosynthetic pathway proved the existence of a dispersive process. We also report the complete genome sequence of NX02, in which we identified many unstable genetic elements and powerful secretion systems. Furthermore, nine enzymes for the formation of activated precursors, four glycosyltransferases, four acyltransferases, and four polymerization and export proteins were identified. These genes were scattered in the NX02 genome, and the positive regulator SpnA of sphingans synthesis could not regulate sanxan production. Finally, we concluded that the evolution of the sanxan pathway was independent. NX02 evolved naturally as a polysaccharide producing strain over a long-time evolution involving gene acquisitions and adaptive mutations.

  1. Hox cluster polarity in early transcriptional availability: a high order regulatory level of clustered Hox genes in the mouse.

    PubMed

    Roelen, Bernard A J; de Graaff, Wim; Forlani, Sylvie; Deschamps, Jacqueline

    2002-11-01

    The molecular mechanism underlying the 3' to 5' polarity of induction of mouse Hox genes is still elusive. While relief from a cluster-encompassing repression was shown to lead to all Hoxd genes being expressed like the 3'most of them, Hoxd1 (Kondo and Duboule, 1999), the molecular basis of initial activation of this 3'most gene is not yet understood. We show that, already before primitive streak formation, prior to initial expression of the first Hox gene, a dramatic transcriptional stimulation of the 3'most genes, Hoxb1 and Hoxb2, is observed upon a short pulse of exogenous retinoic acid (RA), whereas this is not the case for more 5', cluster-internal, RA-responsive Hoxb genes. In contrast, the RA-responding Hoxb1lacZ transgene that faithfully mimics the endogenous gene (Marshall et al., 1994) did not exhibit the sensitivity of Hoxb1 to precocious activation. We conclude that polarity in initial activation of Hoxb genes reflects a greater availability of 3'Hox genes for transcription, suggesting a pre-existing (susceptibility to) opening of the chromatin structure at the 3' extremity of the cluster. We discuss the data in the context of prevailing models involving differential chromatin opening in the directionality of clustered Hox gene transcription, and regarding the importance of the cluster context for correct timing of initial Hox gene expression. Interestingly, Cdx1 manifested the same early transcriptional availability as Hoxb1. Copyright 2002 Elsevier Science Ireland Ltd.

  2. Broad spectrum antibiotic compounds and use thereof

    DOEpatents

    Koglin, Alexander; Strieker, Matthias

    2016-07-05

    The discovery of a non-ribosomal peptide synthetase (NRPS) gene cluster in the genome of Clostridium thermocellum that produces a secondary metabolite that is assembled outside of the host membrane is described. Also described is the identification of homologous NRPS gene clusters from several additional microorganisms. The secondary metabolites produced by the NRPS gene clusters exhibit broad spectrum antibiotic activity. Thus, antibiotic compounds produced by the NRPS gene clusters, and analogs thereof, their use for inhibiting bacterial growth, and methods of making the antibiotic compounds are described.

  3. De novo transcriptome profiling of cold-stressed siliques during pod filling stages in Indian mustard (Brassica juncea L.)

    PubMed Central

    Sinha, Somya; Raxwal, Vivek K.; Joshi, Bharat; Jagannath, Arun; Katiyar-Agarwal, Surekha; Goel, Shailendra; Kumar, Amar; Agarwal, Manu

    2015-01-01

    Low temperature is a major abiotic stress that impedes plant growth and development. Brassica juncea is an economically important oil seed crop and is sensitive to freezing stress during pod filling subsequently leading to abortion of seeds. To understand the cold stress mediated global perturbations in gene expression, whole transcriptome of B. juncea siliques that were exposed to sub-optimal temperature was sequenced. Manually self-pollinated siliques at different stages of development were subjected to either short (6 h) or long (12 h) durations of chilling stress followed by construction of RNA-seq libraries and deep sequencing using Illumina's NGS platform. De-novo assembly of B. juncea transcriptome resulted in 133,641 transcripts, whose combined length was 117 Mb and N50 value was 1428 bp. We identified 13,342 differentially regulated transcripts by pair-wise comparison of 18 transcriptome libraries. Hierarchical clustering along with Spearman correlation analysis identified that the differentially expressed genes segregated in two major clusters representing early (5–15 DAP) and late stages (20–30 DAP) of silique development. Further analysis led to the discovery of sub-clusters having similar patterns of gene expression. Two of the sub-clusters (one each from the early and late stages) comprised of genes that were inducible by both the durations of cold stress. Comparison of transcripts from these clusters led to identification of 283 transcripts that were commonly induced by cold stress, and were referred to as “core cold-inducible” transcripts. Additionally, we found that 689 and 100 transcripts were specifically up-regulated by cold stress in early and late stages, respectively. We further explored the expression patterns of gene families encoding for transcription factors (TFs), transcription regulators (TRs) and kinases, and found that cold stress induced protein kinases only during early silique development. We validated the digital gene expression profiles of selected transcripts by qPCR and found a high degree of concordance between the two analyses. To our knowledge this is the first report of transcriptome sequencing of cold-stressed B. juncea siliques. The data generated in this study would be a valuable resource for not only understanding the cold stress signaling pathway but also for introducing cold hardiness in B. juncea. PMID:26579175
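
    The abstract reports an N50 of 1428 bp without defining the statistic. Under the usual definition, N50 is the length L such that transcripts of length L or longer account for at least half of the total assembled length. The sketch below follows that definition; the function name and the toy lengths are hypothetical and unrelated to the B. juncea assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches half of the total; the length reached at that point
    is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2.0
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("n50() requires at least one length")


# Hypothetical toy assembly: total 1500, so the halfway mark is 750.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

    Here the two longest sequences (500 + 400) already exceed 750, so the N50 of the toy set is 400.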

  4. Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features.

    PubMed

    Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug

    2011-11-01

    Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published gene lists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts, and it identifies distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.

  5. Comparative genomics of ParaHox clusters of teleost fishes: gene cluster breakup and the retention of gene sets following whole genome duplications

    PubMed Central

    Siegel, Nicol; Hoegg, Simone; Salzburger, Walter; Braasch, Ingo; Meyer, Axel

    2007-01-01

    Background The evolutionary lineage leading to the teleost fish underwent a whole genome duplication termed FSGD or 3R in addition to two prior genome duplications that took place earlier during vertebrate evolution (termed 1R and 2R). Resulting from the FSGD, additional copies of genes are present in fish, compared to tetrapods whose lineage did not experience the 3R genome duplication. Interestingly, we find that ParaHox genes do not differ in number in extant teleost fishes despite their additional genome duplication from the genomic situation in mammals, but they are distributed over twice as many paralogous regions in fish genomes. Results We determined the DNA sequence of the entire ParaHox C1 paralogon in the East African cichlid fish Astatotilapia burtoni, and compared it to orthologous regions in other vertebrate genomes as well as to the paralogous vertebrate ParaHox D paralogons. Evolutionary relationships among genes from these four chromosomal regions were studied with several phylogenetic algorithms. We provide evidence that the genes of the ParaHox C paralogous cluster are duplicated in teleosts, just as it had been shown previously for the D paralogon genes. Overall, however, synteny and cluster integrity seem to be less conserved in ParaHox gene clusters than in Hox gene clusters. Comparative analyses of non-coding sequences uncovered conserved, possibly co-regulatory elements, which are likely to contain promoter motifs of the genes belonging to the ParaHox paralogons. Conclusion There seems to be strong stabilizing selection for gene order as well as gene orientation in the ParaHox C paralogon, since with a few exceptions, only the lengths of the introns and intergenic regions differ between the distantly related species examined. The high degree of evolutionary conservation of this gene cluster's architecture in particular – but possibly clusters of genes more generally – might be linked to the presence of promoter, enhancer or inhibitor motifs that serve to regulate more than just one gene. Therefore, deletions, inversions or relocations of individual genes could destroy the regulation of the clustered genes in this region. The existence of such a regulation network might explain the evolutionary conservation of gene order and orientation over the course of hundreds of millions of years of vertebrate evolution. Another possible explanation for the highly conserved gene order might be the existence of a regulator not located immediately next to its corresponding gene but further away since a relocation or inversion would possibly interrupt this interaction. Different ParaHox clusters were found to have experienced differential gene loss in teleosts. Yet the complete set of these homeobox genes was maintained, albeit distributed over almost twice the number of chromosomes. Selection due to dosage effects and/or stoichiometric disturbance might act more strongly to maintain a modal number of homeobox genes (and possibly transcription factors more generally) per genome, yet permit the accumulation of other (non regulatory) genes associated with these homeobox gene clusters. PMID:17822543

  6. Clustering Algorithms: Their Application to Gene Expression Data

    PubMed Central

    Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel

    2016-01-01

    Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and high degree of accuracy in its analysis procedure. PMID:27932867

  7. Global Distribution and Evolutionary History of Enterovirus D68, with Emphasis on the 2014 Outbreak in Ontario, Canada.

    PubMed

    Eshaghi, Alireza; Duvvuri, Venkata R; Isabel, Sandra; Banh, Philip; Li, Aimin; Peci, Adriana; Patel, Samir N; Gubbay, Jonathan B

    2017-01-01

    Despite its first appearance in 1962, human enterovirus D68 (EV-D68) has been recognized as an emerging respiratory pathogen in the last decade when it caused outbreaks and clusters in several countries including Japan, the Philippines, and the Netherlands. The most recent and largest outbreak of EV-D68 associated with severe respiratory illness took place in North America between August 2014 and January 2015. Between September 1 and October 31 2014, EV-D68 infection was laboratory confirmed among 153/907 (16.9%) persons tested for the virus in Ontario, Canada, using real time RT-PCR and subsequent genotyping by sequencing of partial VP1 gene. In order to understand the evolutionary history of the 2014 North American EV-D68 outbreak, we conducted phylogenetic and phylodynamic analyses using available partial VP1 genes (n = 469) and NCBI available whole genome sequences (WGS) (n = 38). The global EV-D68 phylogenetic tree (n = 469) reconfirms the divergence of three distinct clades A, B, and C from the prototype EV-D68 Fermon strain as previously documented. Two sub-clades (B1 and B2) were identified, with most 2014 EV-D68 outbreak strains belonging to sub-cluster B2b2 (one of the two emerging clusters within sub-clade B2), with two signature substitutions T650A and M700V in BC and DE loops of VP1 gene, respectively. The close homology between WGS of strains from Ontario (n = 2) and USA (n = 21) in the recent EV-D68 outbreak suggests genetic relatedness and also a common source for the outbreak. The time of most recent common ancestor of EV-D68 and the 2014 EV-D68 outbreak strain suggest that the viruses possibly emerged during 1960-1961 and 2012-2013, respectively. We observed lower mean evolutionary rates of global EV-D68 using WGS data than estimated with partial VP1 gene sequences. Based on WGS data, the estimated mean rate of evolution of the EV-D68 B2b cluster was 9.75 × 10^-3 substitutions/site/year (95% BCI 4.11 × 10^-3 to 16 × 10^-3).

  8. Global Distribution and Evolutionary History of Enterovirus D68, with Emphasis on the 2014 Outbreak in Ontario, Canada

    PubMed Central

    Eshaghi, Alireza; Duvvuri, Venkata R.; Isabel, Sandra; Banh, Philip; Li, Aimin; Peci, Adriana; Patel, Samir N.; Gubbay, Jonathan B.

    2017-01-01

    Despite its first appearance in 1962, human enterovirus D68 (EV-D68) has been recognized as an emerging respiratory pathogen in the last decade when it caused outbreaks and clusters in several countries including Japan, the Philippines, and the Netherlands. The most recent and largest outbreak of EV-D68 associated with severe respiratory illness took place in North America between August 2014 and January 2015. Between September 1 and October 31 2014, EV-D68 infection was laboratory confirmed among 153/907 (16.9%) persons tested for the virus in Ontario, Canada, using real time RT-PCR and subsequent genotyping by sequencing of partial VP1 gene. In order to understand the evolutionary history of the 2014 North American EV-D68 outbreak, we conducted phylogenetic and phylodynamic analyses using available partial VP1 genes (n = 469) and NCBI available whole genome sequences (WGS) (n = 38). The global EV-D68 phylogenetic tree (n = 469) reconfirms the divergence of three distinct clades A, B, and C from the prototype EV-D68 Fermon strain as previously documented. Two sub-clades (B1 and B2) were identified, with most 2014 EV-D68 outbreak strains belonging to sub-cluster B2b2 (one of the two emerging clusters within sub-clade B2), with two signature substitutions T650A and M700V in BC and DE loops of VP1 gene, respectively. The close homology between WGS of strains from Ontario (n = 2) and USA (n = 21) in the recent EV-D68 outbreak suggests genetic relatedness and also a common source for the outbreak. The time of most recent common ancestor of EV-D68 and the 2014 EV-D68 outbreak strain suggest that the viruses possibly emerged during 1960–1961 and 2012–2013, respectively. We observed lower mean evolutionary rates of global EV-D68 using WGS data than estimated with partial VP1 gene sequences. Based on WGS data, the estimated mean rate of evolution of the EV-D68 B2b cluster was 9.75 × 10^-3 substitutions/site/year (95% BCI 4.11 × 10^-3 to 16 × 10^-3). PMID:28298902

  9. Genome Content and Phylogenomics Reveal both Ancestral and Lateral Evolutionary Pathways in Plant-Pathogenic Streptomyces Species

    PubMed Central

    Huguet-Tapia, Jose C.; Lefebure, Tristan; Badger, Jonathan H.; Guan, Dongli; Stanhope, Michael J.

    2016-01-01

    Streptomyces spp. are highly differentiated actinomycetes with large, linear chromosomes that encode an arsenal of biologically active molecules and catabolic enzymes. Members of this genus are well equipped for life in nutrient-limited environments and are common soil saprophytes. Out of the hundreds of species in the genus Streptomyces, a small group has evolved the ability to infect plants. The recent availability of Streptomyces genome sequences, including four genomes of pathogenic species, provided an opportunity to characterize the gene content specific to these pathogens and to study phylogenetic relationships among them. Genome sequencing, comparative genomics, and phylogenetic analysis enabled us to discriminate pathogenic from saprophytic Streptomyces strains; moreover, we calculated that the pathogen-specific genome contains 4,662 orthologs. Phylogenetic reconstruction suggested that Streptomyces scabies and S. ipomoeae share an ancestor but that their biosynthetic clusters encoding the required virulence factor thaxtomin have diverged. In contrast, S. turgidiscabies and S. acidiscabies, two relatively unrelated pathogens, possess highly similar thaxtomin biosynthesis clusters, which suggests that the acquisition of these genes was through lateral gene transfer. PMID:26826232

  10. Genomic analyses of bacterial porin-cytochrome gene clusters

    DOE PAGES

    Shi, Liang; Fredrickson, James K.; Zachara, John M.

    2014-11-26

    In this study, the porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in the microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with the substrates other than Fe(III) and Mn(IV) oxides.

  11. Activation and comparative analysis of cryptic xiamycin gene cluster from marine-derived Streptomyces sp. FXJ 7.388.

    PubMed

    Lü, Yuhong; Liu, Xiaoli; Wang, Miao; Li, Yuanyuan; Liu, Ning; Bao, Yuxin; Liu, Minghao; Li, Xiaoqian; Wang, Yinyin; Qian, Shenyan; Yue, Changwu; Huang, Ying

    2016-09-01

    In order to obtain the natural products synthesized by the three putative xiamycin biosynthesis gene clusters which were predicted via antiSMASH during the genome mining of marine Streptomyces sp. FXJ 7.388, Streptomyces sp. FXJ 8.012, and Streptomyces olivaceus FXJ 7.023. Sixteen genes involved in xiamycin assembly, modification, and regulation with higher identity than the newest reported xiamycin biosynthetic gene cluster from marine Streptomyces sp. SCSIO 02999, Streptomyces sp. HKI0576, and Streptomyces sp. FXJ 7.388 were discovered via gene cluster comparative analysis. A ribosome engineering strategy was adopted to activate such cryptic gene clusters with different final concentrations of antibiotics that act on the ribosome, and two indolosesquiterpenes were isolated from idlethaldose streptomycin-resistant Streptomyces sp. FXJ 7.388 strains. However, no such product was detected in Streptomyces sp. FXJ 8.012 and Streptomyces olivaceus FXJ 7.023 under the same treatment. This result suggested that these genes might hold the least gene content for xiamycin biosynthesis.

  12. Genes encoding cuticular proteins are components of the Nimrod gene cluster in Drosophila.

    PubMed

    Cinege, Gyöngyi; Zsámboki, János; Vidal-Quadras, Maite; Uv, Anne; Csordás, Gábor; Honti, Viktor; Gábor, Erika; Hegedűs, Zoltán; Varga, Gergely I B; Kovács, Attila L; Juhász, Gábor; Williams, Michael J; Andó, István; Kurucz, Éva

    2017-08-01

    The Nimrod gene cluster, located on the second chromosome of Drosophila melanogaster, is the largest syntenic unit of the Drosophila genome. Nimrod genes show blood cell specific expression and code for phagocytosis receptors that play a major role in fruit fly innate immune functions. We previously identified three homologous genes (vajk-1, vajk-2 and vajk-3) located within the Nimrod cluster, which are unrelated to the Nimrod genes, but are homologous to a fourth gene (vajk-4) located outside the cluster. Here we show that, unlike the Nimrod candidates, the Vajk proteins are expressed in cuticular structures of the late embryo and the late pupa, indicating that they contribute to cuticular barrier functions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    PubMed

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Variability among Cucurbitaceae species (melon, cucumber and watermelon) in a genomic region containing a cluster of NBS-LRR genes.

    PubMed

    Morata, Jordi; Puigdomènech, Pere

    2017-02-08

    Cucurbitaceae species contain a significantly lower number of genes coding for proteins with similarity to plant resistance genes belonging to the NBS-LRR family than other plant species of similar genome size. A large proportion of these genes are organized in clusters that appear to be hotspots of variability. The genomes of the Cucurbitaceae species measured until now are intermediate in size (between 350 and 450 Mb) and they apparently have not undergone any genome duplications beside those at the origin of eudicots. The cluster containing the largest number of NBS-LRR genes has previously been analyzed in melon and related species and showed a high degree of interspecific and intraspecific variability. It was of interest to study whether similar behavior occurred in other clusters of the same family of genes. The cluster of NBS-LRR genes located in melon chromosome 9 was analyzed and compared with the syntenic regions in other cucurbit genomes. This is the second cluster in number within this species and it contains nine sequences with a NBS-LRR annotation including two genes, Fom1 and Prv, providing resistance against Fusarium and Papaya ring-spot virus (PRSV). The variability within the melon species appears to consist essentially of single nucleotide polymorphisms. Clusters of similar genes are present in the syntenic regions of the two species of Cucurbitaceae that were sequenced, cucumber and watermelon. Most of the genes in the syntenic clusters can be aligned between species and a hypothesis of generation of the cluster is proposed. The number of genes in the watermelon cluster is similar to that in melon while a higher number of genes (12) is present in cucumber, a species with a smaller genome than melon. After comparing genome resequencing data of 115 cucumber varieties, deletion of a group of genes is observed in a group of varieties of Indian origin. 
Clusters of genes coding for NBS-LRR proteins in cucurbits appear to have specific variability in different regions of the genome and between different species. This observation is in favour of considering that the adaptation of plant species to changing environments is based upon the variability that may occur at any location in the genome and that has been produced by specific mechanisms of sequence variation acting on plant genomes. This information could be useful both to understand the evolution of species and for plant breeding.

  15. Functional grouping of similar genes using eigenanalysis on minimum spanning tree based neighborhood graph.

    PubMed

    Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita

    2016-04-01

    Gene expression data clustering is an important biological process in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
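
    The abstract above does not spell out how the rounds of MST are combined, so the following is only a minimal sketch under stated assumptions: each round computes a minimum spanning tree (Prim's algorithm, Euclidean distance) over the edges not used in earlier rounds, and the neighborhood graph is the union of the trees. The function names `mst_edges` and `k_mst_graph` are hypothetical, and the eigenanalysis step of E-MST is not reproduced here.

```python
import math

def mst_edges(points, excluded):
    """One round of Prim's algorithm on the complete Euclidean graph,
    ignoring any edge (i, j) with i < j that appears in `excluded`."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost per vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        if best[u] == math.inf:
            break  # remaining vertices are unreachable without excluded edges
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((min(u, parent[u]), max(u, parent[u])))
        for v in range(n):
            if not in_tree[v] and (min(u, v), max(u, v)) not in excluded:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def k_mst_graph(points, k):
    """Union of the edges from k successive, edge-disjoint MST rounds (assumed scheme)."""
    used = set()
    for _ in range(k):
        used.update(mst_edges(points, used))
    return used

# Two well-separated groups of three points each (toy data, not from the paper).
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
graph_1 = k_mst_graph(points, 1)
graph_2 = k_mst_graph(points, 2)
print(len(graph_1), len(graph_2))
```

    With one round the graph is an ordinary spanning tree; each further round adds edges, so points gain extra links to nearby neighbors. A spectral method could then be applied to the adjacency matrix of this graph, but that step depends on details not given in the abstract.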

  16. Plant polycistronic precursors containing non-homologous microRNAs target transcripts encoding functionally related proteins

    PubMed Central

    2009-01-01

    Background MicroRNAs (miRNAs) are endogenous single-stranded small RNAs that regulate the expression of specific mRNAs involved in diverse biological processes. In plants, miRNAs are generally encoded as a single species in independent transcriptional units, referred to as MIRNA genes, in contrast to animal miRNAs, which are frequently clustered. Results We performed a comparative genomic analysis in three model plants (rice, poplar and Arabidopsis) and characterized miRNA clusters containing two to eight miRNA species. These clusters usually encode miRNAs of the same family and certain share a common evolutionary origin across monocot and dicot lineages. In addition, we identified miRNA clusters harboring miRNAs with unrelated sequences that are usually not evolutionarily conserved. Strikingly, non-homologous miRNAs from the same cluster were predicted to target transcripts encoding related proteins. At least four Arabidopsis non-homologous clusters were expressed as single transcriptional units. Overexpression of one of these polycistronic precursors, producing Ath-miR859 and Ath-miR774, led to the DCL1-dependent accumulation of both miRNAs and down-regulation of their different mRNA targets encoding F-box proteins. Conclusions In addition to polycistronic precursors carrying related miRNAs, plants also contain precursors allowing coordinated expression of non-homologous miRNAs to co-regulate functionally related target transcripts. This mechanism paves the way for using polycistronic MIRNA precursors as a new molecular tool for plant biologists to simultaneously control the expression of different genes. PMID:19951405

  17. A conserved gene cluster as a putative functional unit in insect innate immunity.

    PubMed

    Somogyi, Kálmán; Sipos, Botond; Pénzes, Zsolt; Andó, István

    2010-11-05

    The Nimrod gene superfamily is an important component of the innate immune response. The majority of its member genes are located in close proximity within the Drosophila melanogaster genome and they lie in a larger conserved cluster ("Nimrod cluster"), made up of non-related groups (families, superfamilies) of genes. This cluster has been a part of the Arthropod genomes for about 300-350 million years. The available data suggest that the Nimrod cluster is a functional module of the insect innate immune response. Copyright © 2010 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  18. Cloning and Characterization of the Pyrrolomycin Biosynthetic Gene Clusters from Actinosporangium vitaminophilum ATCC 31673 and Streptomyces sp. Strain UC 11065

    PubMed Central

    Zhang, Xiujun; Parry, Ronald J.

    2007-01-01

    The pyrrolomycins are a family of polyketide antibiotics, some of which contain a nitro group. To gain insight into the nitration mechanism associated with the formation of these antibiotics, the pyrrolomycin biosynthetic gene cluster from Actinosporangium vitaminophilum was cloned. Sequencing of ca. 56 kb of A. vitaminophilum DNA revealed 35 open reading frames (ORFs). Sequence analysis revealed a clear relationship between some of these ORFs and the biosynthetic gene cluster for pyoluteorin, a structurally related antibiotic. Since a gene transfer system could not be devised for A. vitaminophilum, additional proof for the identity of the cloned gene cluster was sought by cloning the pyrrolomycin gene cluster from Streptomyces sp. strain UC 11065, a transformable pyrrolomycin producer. Sequencing of ca. 26 kb of UC 11065 DNA revealed the presence of 17 ORFs, 15 of which exhibit strong similarity to ORFs in the A. vitaminophilum cluster as well as a nearly identical organization. Single-crossover disruption of two genes in the UC 11065 cluster abolished pyrrolomycin production in both cases. These results confirm that the genetic locus cloned from UC 11065 is essential for pyrrolomycin production, and they also confirm that the highly similar locus in A. vitaminophilum encodes pyrrolomycin biosynthetic genes. Sequence analysis revealed that both clusters contain genes encoding the two components of an assimilatory nitrate reductase. This finding suggests that nitrite is required for the formation of the nitrated pyrrolomycins. However, sequence analysis did not provide additional insights into the nitration process, suggesting the operation of a novel nitration mechanism. PMID:17158935

  19. Clusters of antibiotic resistance genes enriched together stay together in swine agriculture

    DOE PAGES

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong; ...

    2016-04-12

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. 
The use of a single antibiotic could select for an entire suite of resistance genes if they are genetically linked. No links to bacterial membership were observed for these clusters of resistance genes. These findings urge deeper understanding of colocalization of resistance genes and mobile genetic elements in resistance islands and their distribution throughout antibiotic-exposed microbiomes. In addition, as governments seek to combat the rise in antibiotic resistance, a balance is sought between ensuring proper animal health and welfare and preserving medically important antibiotics for therapeutic use. Metagenomic and genomic monitoring will be critical to determine if resistance genes can be reduced in animal microbiomes, or if these gene clusters will continue to be coselected by antibiotics not deemed medically important for human health but used for growth promotion or by medically important antibiotics used therapeutically.

  20. Clusters of antibiotic resistance genes enriched together stay together in swine agriculture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Timothy A.; Stedtfeld, Robert D.; Wang, Qiong

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. 
The use of a single antibiotic could select for an entire suite of resistance genes if they are genetically linked. No links to bacterial membership were observed for these clusters of resistance genes. These findings urge deeper understanding of colocalization of resistance genes and mobile genetic elements in resistance islands and their distribution throughout antibiotic-exposed microbiomes. In addition, as governments seek to combat the rise in antibiotic resistance, a balance is sought between ensuring proper animal health and welfare and preserving medically important antibiotics for therapeutic use. Metagenomic and genomic monitoring will be critical to determine if resistance genes can be reduced in animal microbiomes, or if these gene clusters will continue to be coselected by antibiotics not deemed medically important for human health but used for growth promotion or by medically important antibiotics used therapeutically.

  1. Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture.

    PubMed

    Johnson, Timothy A; Stedtfeld, Robert D; Wang, Qiong; Cole, James R; Hashsham, Syed A; Looft, Torey; Zhu, Yong-Guan; Tiedje, James M

    2016-04-12

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. 
The use of a single antibiotic could select for an entire suite of resistance genes if they are genetically linked. No links to bacterial membership were observed for these clusters of resistance genes. These findings urge deeper understanding of colocalization of resistance genes and mobile genetic elements in resistance islands and their distribution throughout antibiotic-exposed microbiomes. As governments seek to combat the rise in antibiotic resistance, a balance is sought between ensuring proper animal health and welfare and preserving medically important antibiotics for therapeutic use. Metagenomic and genomic monitoring will be critical to determine if resistance genes can be reduced in animal microbiomes, or if these gene clusters will continue to be coselected by antibiotics not deemed medically important for human health but used for growth promotion or by medically important antibiotics used therapeutically. Copyright © 2016 Johnson et al.

  2. Production of High Amounts of Hepatotoxin Nodularin and New Protease Inhibitors Pseudospumigins by the Brazilian Benthic Nostoc sp. CENA543

    PubMed Central

    Jokela, Jouni; Heinilä, Lassi M. P.; Shishido, Tânia K.; Wahlsten, Matti; Fewer, David P.; Fiore, Marli F.; Wang, Hao; Haapaniemi, Esa; Permi, Perttu; Sivonen, Kaarina

    2017-01-01

    Nostoc is a cyanobacterial genus, common in soils and a prolific producer of natural products. This research project aimed to explore and characterize Brazilian cyanobacteria for new bioactive compounds. Here we report the production of hepatotoxins and new protease inhibitors from benthic Nostoc sp. CENA543 isolated from a small, shallow, saline-alkaline lake in the Nhecolândia, Pantanal wetland area in Brazil. Nostoc sp. CENA543 produces exceptionally high amounts of nodularin-R. This is the first free-living Nostoc that produces nodularin at comparable levels as the toxic, bloom-forming, Nodularia spumigena. We also characterized pseudospumigins A–F, which are a novel family of linear tetrapeptides. Pseudospumigins are structurally related to linear tetrapeptide spumigins and aeruginosins both present in N. spumigena but differ in respect to their diagnostic amino acid, which is Ile/Leu/Val in pseudospumigins, Pro/mPro in spumigins, and Choi in aeruginosins. The pseudospumigin gene cluster is more similar to the spumigin biosynthetic gene cluster than the aeruginosin gene cluster. Pseudospumigin A inhibited trypsin (IC50 4.5 μM after 1 h) in a similar manner as spumigin E from N. spumigena but was almost two orders of magnitude less potent. This study identifies another location and environment where the hepatotoxic nodularin has the potential to cause the death of eukaryotic organisms. PMID:29062311

  3. Classification and Clustering Methods for Multiple Environmental Factors in Gene-Environment Interaction: Application to the Multi-Ethnic Study of Atherosclerosis.

    PubMed

    Ko, Yi-An; Mukherjee, Bhramar; Smith, Jennifer A; Kardia, Sharon L R; Allison, Matthew; Diez Roux, Ana V

    2016-11-01

    There has been an increased interest in identifying gene-environment interaction (G × E) in the context of multiple environmental exposures. Most G × E studies analyze one exposure at a time, but we are exposed to multiple exposures in reality. Efficient analysis strategies for complex G × E with multiple environmental factors in a single model are still lacking. Using the data from the Multiethnic Study of Atherosclerosis, we illustrate a two-step approach for modeling G × E with multiple environmental factors. First, we utilize common clustering and classification strategies (e.g., k-means, latent class analysis, classification and regression trees, Bayesian clustering using Dirichlet Process) to define subgroups corresponding to distinct environmental exposure profiles. Second, we illustrate the use of an additive main effects and multiplicative interaction model, instead of the conventional saturated interaction model using product terms of factors, to study G × E with the data-driven exposure subgroups defined in the first step. We demonstrate useful analytical approaches to translate multiple environmental exposures into one summary class. These tools not only allow researchers to consider several environmental exposures in G × E analysis but also provide some insight into how genes modify the effect of a comprehensive exposure profile instead of examining effect modification for each exposure in isolation.

  4. Detection of emerging rotavirus G12P[8] in Sonora, México.

    PubMed

    González-Ochoa, G; J, G de; Calleja-García, P M; Rosas-Rodríguez, J A; Virgen-Ortíz, A; Tamez-Guerra, P

    2016-06-01

    Rotavirus is the most common cause of gastroenteritis in children up to five years of age worldwide. The aim of the present study was to analyze the genotypes of rotavirus strains isolated from children with gastroenteritis, after the introduction of the rotavirus vaccine in México. Rotavirus was detected in 14/100 (14%) fecal samples from children with gastroenteritis, using a commercial test kit. The viral genome was purified from these samples and used as a template in RT-PCR amplification of the VP4 and VP7 genes, followed by gene cloning and sequencing. Among the rotavirus strains, 4/14 (28.5%) were characterized as G12P[8], 2/14 (14.3%), as G12P (not typed), and 3/14 (21.42%) as G (not typed) P[8]. Phylogenetic analysis of the VP7 gene showed that G12 genotypes clustered in lineage III. Phylogenetic analysis revealed that VP4 genotype P[8] sequences clustered in lineage V, whereas other P[8] sequences previously reported in Mexico (2005-2008) clustered in different lineages. Rotavirus genotype G12 is currently recognized as a globally emerging rotavirus. To our knowledge, this is the first report of this emerging rotavirus strain G12P[8] in México. Ongoing surveillance is recommended to monitor the distribution of rotavirus genotypes and to continually reassess the suitability of currently available rotavirus vaccines.

  5. Molecular Characterization of Copper and Cadmium Resistance Determinants in the Biomining Thermoacidophilic Archaeon Sulfolobus metallicus

    PubMed Central

    Orell, Alvaro; Remonsellez, Francisco; Arancibia, Rafaela; Jerez, Carlos A.

    2013-01-01

    Sulfolobus metallicus is a thermoacidophilic crenarchaeon used in high-temperature bioleaching processes that is able to grow under stressing conditions such as high concentrations of heavy metals. Nevertheless, the genetic and biochemical mechanisms responsible for heavy metal resistance in S. metallicus remain uncharacterized. Proteomic analysis of S. metallicus cells exposed to 100 mM Cu revealed that 18 out of 30 upregulated proteins are related to the production and conversion of energy, amino acid biosynthesis, and stress responses. Ten of these last proteins were also upregulated in S. metallicus treated with 1 mM Cd, suggesting, at least in part, a common general response to these two heavy metals. The S. metallicus genome contained two complete cop gene clusters, each encoding a metallochaperone (CopM), a Cu-exporting ATPase (CopA), and a transcriptional regulator (CopT). Transcriptional expression analysis revealed that copM and copA from each cop gene cluster were cotranscribed and their transcript levels increased when S. metallicus was grown either in the presence of Cu or using chalcopyrite (CuFeS2) as oxidizable substrate. This study shows for the first time the presence of a duplicated version of the cop gene cluster in Archaea and characterizes some of the Cu and Cd resistance determinants in a thermophilic archaeon employed for industrial biomining. PMID:23509422

  6. Molecular characterization of copper and cadmium resistance determinants in the biomining thermoacidophilic archaeon Sulfolobus metallicus.

    PubMed

    Orell, Alvaro; Remonsellez, Francisco; Arancibia, Rafaela; Jerez, Carlos A

    2013-01-01

    Sulfolobus metallicus is a thermoacidophilic crenarchaeon used in high-temperature bioleaching processes that is able to grow under stressing conditions such as high concentrations of heavy metals. Nevertheless, the genetic and biochemical mechanisms responsible for heavy metal resistance in S. metallicus remain uncharacterized. Proteomic analysis of S. metallicus cells exposed to 100 mM Cu revealed that 18 out of 30 upregulated proteins are related to the production and conversion of energy, amino acid biosynthesis, and stress responses. Ten of these last proteins were also upregulated in S. metallicus treated with 1 mM Cd, suggesting, at least in part, a common general response to these two heavy metals. The S. metallicus genome contained two complete cop gene clusters, each encoding a metallochaperone (CopM), a Cu-exporting ATPase (CopA), and a transcriptional regulator (CopT). Transcriptional expression analysis revealed that copM and copA from each cop gene cluster were cotranscribed and their transcript levels increased when S. metallicus was grown either in the presence of Cu or using chalcopyrite (CuFeS2) as oxidizable substrate. This study shows for the first time the presence of a duplicated version of the cop gene cluster in Archaea and characterizes some of the Cu and Cd resistance determinants in a thermophilic archaeon employed for industrial biomining.

  7. Analysis of Copy Number Variation in the Abp Gene Regions of Two House Mouse Subspecies Suggests Divergence during the Gene Family Expansions.

    PubMed

    Pezer, Željka; Chung, Amanda G; Karn, Robert C; Laukaitis, Christina M

    2017-06-01

    The Androgen-binding protein (Abp) gene region of the mouse genome contains 64 genes, some encoding pheromones that influence assortative mating between mice from different subspecies. Using CNVnator and quantitative PCR, we explored copy number variation in this gene family in natural populations of Mus musculus domesticus (Mmd) and Mus musculus musculus (Mmm), two subspecies of house mice that form a narrow hybrid zone in Central Europe. We found that copy number variation in the center of the Abp gene region is very common in wild Mmd, primarily representing the presence/absence of the final duplications described for the mouse genome. Clustering of Mmd individuals based on this variation did not reflect their geographical origin, suggesting no population divergence in the Abp gene cluster. However, copy number variation patterns differ substantially between Mmd and other mouse taxa. Large blocks of Abp genes are absent in Mmm, Mus musculus castaneus and an outgroup, Mus spretus, although with differences in variation and breakpoint locations. Our analysis calls into question the reliance on a reference genome for interpreting the detailed organization of genes in taxa more distant from the Mmd reference genome. The polymorphic nature of the gene family expansion in all four taxa suggests that the number of Abp genes, especially in the central gene region, is not critical to the survival and reproduction of the mouse. However, Abp haplotypes of variable length may serve as a source of raw genetic material for new signals influencing reproductive communication and thus speciation of mice. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  8. TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

    PubMed

    Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

    2017-12-01

    Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. Contact: sunkim.bioinfo@snu.ac.kr. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
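
    Step (i) of the procedure described above, clustering of time-condition concatenated vectors, can be sketched roughly as follows. This is a minimal illustration, not the TimesVector implementation: the dimension-reduction part is omitted, a small spherical k-means with deterministic seeding is used as a stand-in clustering method, and the data layout, function names and toy expression values are all assumptions made for the example.

```python
import math

def concat_profile(expr, gene):
    """expr[condition][gene] is a list of time points; join the
    conditions (in sorted order) into one vector per gene."""
    vec = []
    for cond in sorted(expr):
        vec.extend(expr[cond][gene])
    return vec

def unit(vec):
    """Center a vector and scale it to unit length (assumed preprocessing)."""
    mean = sum(vec) / len(vec)
    centered = [x - mean for x in vec]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered] if norm > 0 else centered

def cosine(a, b):
    # For unit vectors the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def spherical_kmeans(vectors, k, iters=20):
    # Deterministic seeding: start from the first vector, then repeatedly
    # add the vector least similar to the centers chosen so far.
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(min(vectors,
                           key=lambda v: max(cosine(v, c) for c in centers)))
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [max(range(k), key=lambda j: cosine(v, centers[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean))
                if norm > 0:
                    centers[j] = [x / norm for x in mean]
    return labels

# Toy data: two conditions, three time points, four hypothetical genes.
expr = {
    "condA": {"g1": [1, 2, 3], "g2": [2, 4, 6], "g3": [3, 2, 1], "g4": [6, 4, 2]},
    "condB": {"g1": [1, 2, 3], "g2": [2, 4, 6], "g3": [1, 2, 3], "g4": [2, 4, 6]},
}
genes = ["g1", "g2", "g3", "g4"]
vectors = [unit(concat_profile(expr, g)) for g in genes]
labels = spherical_kmeans(vectors, k=2)
```

    In this toy example g1 and g2 rise in both conditions, whereas g3 and g4 fall in one condition and rise in the other, so the two pairs end up in different clusters. Telling "similar across conditions" apart from "different across conditions" within each cluster would be the job of a post-processing step such as step (ii).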

  9. ORGANIZATION OF THE nif GENES OF THE NONHETEROCYSTOUS CYANOBACTERIUM TRICHODESMIUM SP. IMS101.

    PubMed

    Dominic, Benny; Zani, Sabino; Chen, Yi-Bu; Mellon, Mark T; Zehr, Jonathan P

    2000-08-26

    An approximately 16-kb fragment of the Trichodesmium sp. IMS101 (a nonheterocystous filamentous cyanobacterium) "conventional" nif gene cluster was cloned and sequenced. The gene organization of the Trichodesmium and Anabaena variabilis vegetative (nif2) nitrogenase gene clusters spanning the region from nifB to nifW is similar except for the absence of two open reading frames (ORF3 and ORF1) in Trichodesmium. The Trichodesmium nifEN genes encode a fused NifEN polypeptide that does not appear to be processed into individual NifE and NifN polypeptides. Fused nifEN genes were previously found in the A. variabilis nif2 genes, but we have found that fused nifEN genes are widespread in the nonheterocystous cyanobacteria. Although the gene organization of the nonheterocystous filamentous Trichodesmium nif gene cluster is very similar to that of the A. variabilis vegetative nif2 gene cluster, phylogenetic analysis of nif sequences does not support close relatedness of Trichodesmium and A. variabilis vegetative (nif2) nitrogenase genes.

  10. Gene-trait matching across the Bifidobacterium longum pan-genome reveals considerable diversity in carbohydrate catabolism among human infant strains.

    PubMed

    Arboleya, Silvia; Bottacini, Francesca; O'Connell-Motherway, Mary; Ryan, C Anthony; Ross, R Paul; van Sinderen, Douwe; Stanton, Catherine

    2018-01-08

    Bifidobacterium longum is a common member of the human gut microbiota and is frequently present at high numbers in the gut microbiota of humans throughout life, thus indicative of a close symbiotic host-microbe relationship. Different mechanisms may be responsible for the high competitiveness of this taxon in its human host to allow stable establishment in the complex and dynamic intestinal microbiota environment. The objective of this study was to assess the genetic and metabolic diversity in a set of 20 B. longum strains, most of which had previously been isolated from infants, by performing whole genome sequencing and comparative analysis, and to analyse their carbohydrate utilization abilities using a gene-trait matching approach. We analysed their pan-genome and their phylogenetic relatedness. All strains clustered in the B. longum ssp. longum phylogenetic subgroup, except for one individual strain which was found to cluster in the B. longum ssp. suis phylogenetic group. The examined strains exhibit genomic diversity, while they also varied in their sugar utilization profiles. This allowed us to perform a gene-trait matching exercise enabling the identification of five gene clusters involved in the utilization of xylo-oligosaccharides, arabinan, arabinoxylan, galactan and fucosyllactose, the latter of which is an abundant human milk oligosaccharide (HMO). The results showed high diversity in terms of genes and predicted glycosyl-hydrolases, as well as the ability to metabolize a large range of sugars. Moreover, we corroborate the capability of B. longum ssp. longum to metabolise HMOs. Ultimately, their intraspecific genomic diversity and the ability to consume a wide assortment of carbohydrates, ranging from plant-derived carbohydrates to HMOs, may provide an explanation for the competitive advantage and persistence of B. longum in the human gut microbiome.

  11. Physiological oxygen prevents frequent silencing of the DLK1-DIO3 cluster during human embryonic stem cells culture.

    PubMed

    Xie, Pingyuan; Sun, Yi; Ouyang, Qi; Hu, Liang; Tan, Yueqiu; Zhou, Xiaoying; Xiong, Bo; Zhang, Qianjun; Yuan, Ding; Pan, Yi; Liu, Tiancheng; Liang, Ping; Lu, Guangxiu; Lin, Ge

    2014-02-01

    Genetic and epigenetic alterations are observed in long-term culture (>30 passages) of human embryonic stem cells (hESCs); however, little information is available in early cultures. Through a large-scale gene expression analysis between initial-passage hESCs (ihESCs, <10 passages) and early-passage hESCs (ehESCs, 20-30 passages) of 12 hESC lines, we found that the DLK1-DIO3 gene cluster was normally expressed and showed normal methylation pattern in ihESC, but was frequently silenced after 20 passages. Both the DLK1-DIO3 active status in ihESCs and the inactive status in ehESCs were inheritable during differentiation. Silencing of the DLK1-DIO3 cluster did not seem to compromise the multilineage differentiation ability of hESCs, but was associated with reduced DNA damage-induced apoptosis in ehESCs and their differentiated hepatocyte-like cell derivatives, possibly through attenuation of the expression and phosphorylation of p53. Furthermore, we demonstrated that 5% oxygen, instead of the commonly used 20% oxygen, is required for preserving the expression of the DLK1-DIO3 cluster. Overall, the data suggest that active expression of the DLK1-DIO3 cluster represents a new biomarker for epigenetic stability of hESCs and indicates the importance of using a proper physiological oxygen level during the derivation and culture of hESCs. © AlphaMed Press.

  12. Evolution of the mitochondrial genome in snakes: Gene rearrangements and phylogenetic relationships

    PubMed Central

    Yan, Jie; Li, Hongdan; Zhou, Kaiya

    2008-01-01

    Background Snakes as a major reptile group display a variety of morphological characteristics pertaining to their diverse behaviours. Despite abundant analyses of morphological characters, molecular studies using mitochondrial and nuclear genes are limited. As a result, the phylogeny of snakes remains controversial. Previous studies on mitochondrial genomes of snakes have demonstrated duplication of the control region and translocation of trnL to be two notable features of the alethinophidian (all serpents except blindsnakes and threadsnakes) mtDNAs. Our purpose is to further investigate the gene organizations, evolution of the snake mitochondrial genome, and phylogenetic relationships among several major snake families. Results The mitochondrial genomes were sequenced for four taxa representing four different families, and each had a different gene arrangement. Comparative analyses with other snake mitochondrial genomes allowed us to summarize six types of mitochondrial gene arrangement in snakes. Phylogenetic reconstruction with commonly used methods of phylogenetic inference (BI, ML, MP, NJ) arrived at a similar topology, which was used to reconstruct the evolution of mitochondrial gene arrangements in snakes. Conclusion The phylogenetic relationships among the major families of snakes are in accordance with the mitochondrial genomes in terms of gene arrangements. The gene arrangement in Ramphotyphlops braminus mtDNA is inferred to be ancestral for snakes. After the divergence of the early Ramphotyphlops lineage, three types of rearrangements occurred. These changes involve translocations within the IQM tRNA gene cluster and the duplication of the CR. All phylogenetic methods support the placement of Enhydris plumbea outside of the (Colubridae + Elapidae) cluster, providing mitochondrial genomic evidence for the familial rank of Homalopsidae. PMID:19038056

  13. Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster.

    PubMed

    Robertson, Hugh M; Warr, Coral G; Carlson, John R

    2003-11-25

    The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing. We include two previously undescribed Or genes and two previously undescribed Gr genes; two previously predicted Or genes are shown to be alternative splice forms. Three polymorphic pseudogenes and one highly defective pseudogene are recognized. Phylogenetic analysis reveals deep branches connecting multiple highly divergent clades within the Gr family, and the Or family appears to be a single highly expanded lineage within the superfamily. The genes are spread throughout the Drosophila genome, with some relatively recently diverged genes still clustered in the genome. The Gr5a gene on the X chromosome, which encodes a receptor for the sugar trehalose, has transposed from one such tandem cluster of six genes at cytological location 64, as has Gr61a, and all eight of these receptors might bind sugars. Analysis of intron evolution suggests that the common ancestor consisted of a long N-terminal exon encoding transmembrane domains 1-5 followed by three exons encoding transmembrane domains 6-7. As many as 57 additional introns have been acquired idiosyncratically during the evolution of the superfamily, whereas the ancestral introns and some of the older idiosyncratic introns have been lost at least 48 times independently. Altogether, these patterns of molecular evolution suggest that this is an ancient superfamily of chemoreceptors, probably dating back at least to the origin of the arthropods.

  14. Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster

    PubMed Central

    Robertson, Hugh M.; Warr, Coral G.; Carlson, John R.

    2003-01-01

    The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing. We include two previously undescribed Or genes and two previously undescribed Gr genes; two previously predicted Or genes are shown to be alternative splice forms. Three polymorphic pseudogenes and one highly defective pseudogene are recognized. Phylogenetic analysis reveals deep branches connecting multiple highly divergent clades within the Gr family, and the Or family appears to be a single highly expanded lineage within the superfamily. The genes are spread throughout the Drosophila genome, with some relatively recently diverged genes still clustered in the genome. The Gr5a gene on the X chromosome, which encodes a receptor for the sugar trehalose, has transposed from one such tandem cluster of six genes at cytological location 64, as has Gr61a, and all eight of these receptors might bind sugars. Analysis of intron evolution suggests that the common ancestor consisted of a long N-terminal exon encoding transmembrane domains 1-5 followed by three exons encoding transmembrane domains 6-7. As many as 57 additional introns have been acquired idiosyncratically during the evolution of the superfamily, whereas the ancestral introns and some of the older idiosyncratic introns have been lost at least 48 times independently. Altogether, these patterns of molecular evolution suggest that this is an ancient superfamily of chemoreceptors, probably dating back at least to the origin of the arthropods. PMID:14608037

  15. Differential Gene Expression in Normal Human Mammary Epithelial Cells Treated with Malathion Monitored by DNA Microarrays

    PubMed Central

    Gwinn, Maureen R.; Whipkey, Diana L.; Tennant, Lora B.; Weston, Ainsley

    2005-01-01

    Organophosphate pesticides are a major source of occupational exposure in the United States. Moreover, malathion has been sprayed over major urban populations in an effort to control mosquitoes carrying West Nile virus. Previous research, reviewed by the U.S. Environmental Protection Agency, on the genotoxicity and carcinogenicity of malathion has been inconclusive, although malathion is a known endocrine disruptor. Here, interindividual variations and commonality of gene expression signatures have been studied in normal human mammary epithelial cells from four women undergoing reduction mammoplasty. The cell strains were obtained from the discarded tissues through the Cooperative Human Tissue Network (sponsors: National Cancer Institute and National Disease Research Interchange). Interindividual variation of gene expression patterns in response to malathion was observed in various clustering patterns for the four cell strains. Further clustering identified three genes with increased expression after treatment in all four cell strains. These genes were two aldo–keto reductases (AKR1C1 and AKR1C2) and an estrogen-responsive gene (EBBP). Decreased expression of six RNA species was seen at various time points in all cell strains analyzed: plasminogen activator (PLAT), centromere protein F (CPF), replication factor C (RFC3), thymidylate synthetase (TYMS), a putative mitotic checkpoint kinase (BUB1), and a gene of unknown function (GenBank accession no. AI859865). Expression changes in all these genes, detected by DNA microarrays, have been verified by real-time polymerase chain reaction. Differential changes in expression of these genes may yield biomarkers that provide insight into interindividual variation in malathion toxicity. PMID:16079077

  16. A mixture model-based approach to the clustering of microarray expression data.

    PubMed

    McLachlan, G J; Bean, R W; Peel, D

    2002-03-01

    This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
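
    The gene-selection idea described above, ranking genes by the likelihood ratio statistic for one versus two mixture components, can be sketched roughly as follows. This is only an illustration under explicit assumptions: normal components with a shared variance are swapped in for the t distributions used by EMMIX-GENE, the EM loop is deliberately simple, and the function names and toy expression values are invented for the example.

```python
import math

def _logpdf(x, mu, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def _mix_logs(x, p, mu1, mu2, var):
    # Log of each weighted component density at x.
    return (math.log(p) + _logpdf(x, mu1, var),
            math.log(1 - p) + _logpdf(x, mu2, var))

def loglik_one(values):
    n = len(values)
    mu = sum(values) / n
    var = max(sum((x - mu) ** 2 for x in values) / n, 1e-6)
    return sum(_logpdf(x, mu, var) for x in values)

def loglik_two(values, iters=100):
    """EM for two normal components with a shared variance
    (a simplification of the t mixtures named in the abstract)."""
    n = len(values)
    ordered = sorted(values)
    half = n // 2
    mu1 = sum(ordered[:half]) / half
    mu2 = sum(ordered[half:]) / (n - half)
    mean = sum(values) / n
    var = max(sum((x - mean) ** 2 for x in values) / n, 1e-6)
    p = 0.5
    for _ in range(iters):
        resp = []
        for x in values:
            la, lb = _mix_logs(x, p, mu1, mu2, var)
            m = max(la, lb)
            ea, eb = math.exp(la - m), math.exp(lb - m)
            resp.append(ea / (ea + eb))
        s = sum(resp)
        p = min(max(s / n, 1e-6), 1 - 1e-6)
        mu1 = sum(r * x for r, x in zip(resp, values)) / max(s, 1e-12)
        mu2 = sum((1 - r) * x for r, x in zip(resp, values)) / max(n - s, 1e-12)
        var = max(sum(r * (x - mu1) ** 2 + (1 - r) * (x - mu2) ** 2
                      for r, x in zip(resp, values)) / n, 1e-6)
    total = 0.0
    for x in values:
        la, lb = _mix_logs(x, p, mu1, mu2, var)
        m = max(la, lb)
        total += m + math.log(math.exp(la - m) + math.exp(lb - m))
    return total

def lrt_statistic(values):
    # -2 log(lambda) for one versus two components.
    return 2.0 * (loglik_two(values) - loglik_one(values))

# Toy expression values across eight tissues for two hypothetical genes.
genes = {
    "bimodal_gene": [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 5.05],
    "unimodal_gene": [0.0, 0.1, -0.1, 0.05, 0.2, -0.2, 0.15, -0.05],
}
ranking = sorted(genes, key=lambda g: lrt_statistic(genes[g]), reverse=True)
```

    Genes near the top of `ranking` show the strongest evidence of two groups of tissues. A threshold on the statistic, combined with a minimum cluster size as the abstract describes, would then select the subset passed on to the tissue clustering.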

  17. Lampreys, the jawless vertebrates, contain only two ParaHox gene clusters.

    PubMed

    Zhang, Huixian; Ravi, Vydianathan; Tay, Boon-Hui; Tohari, Sumanty; Pillai, Nisha E; Prasad, Aravind; Lin, Qiang; Brenner, Sydney; Venkatesh, Byrappa

    2017-08-22

    ParaHox genes (Gsx, Pdx, and Cdx) are an ancient family of developmental genes closely related to the Hox genes. They play critical roles in the patterning of brain and gut. The basal chordate, amphioxus, contains a single ParaHox cluster comprising one member of each family, whereas nonteleost jawed vertebrates contain four ParaHox genomic loci with six or seven ParaHox genes. Teleosts, which have experienced an additional whole-genome duplication, contain six ParaHox genomic loci with six ParaHox genes. Jawless vertebrates, represented by lampreys and hagfish, are the most ancient group of vertebrates and are crucial for understanding the origin and evolution of vertebrate gene families. We have previously shown that lampreys contain six Hox gene loci. Here we report that lampreys contain only two ParaHox gene clusters (designated as α- and β-clusters) bearing five ParaHox genes (Gsxα, Pdxα, Cdxα, Gsxβ, and Cdxβ). The order and orientation of the three genes in the α-cluster are identical to that of the single cluster in amphioxus. However, the orientation of Gsxβ in the β-cluster is inverted. Interestingly, Gsxβ is expressed in the eye, unlike its homologs in jawed vertebrates, which are expressed mainly in the brain. The lamprey Pdxα is expressed in the pancreas similar to jawed vertebrate Pdx genes, indicating that the pancreatic expression of Pdx was acquired before the divergence of jawless and jawed vertebrate lineages. It is likely that the lamprey Pdxα plays a crucial role in pancreas specification and insulin production similar to the Pdx of jawed vertebrates.

  18. Clustered array of ochratoxin A biosynthetic genes in Aspergillus steynii and their expression patterns in permissive conditions.

    PubMed

    Gil-Serna, Jessica; Vázquez, Covadonga; González-Jaén, María Teresa; Patiño, Belén

    2015-12-02

    Aspergillus steynii is probably the most relevant species of section Circumdati producing ochratoxin A (OTA). This mycotoxin contaminates a wide range of commodities and is highly toxic for humans and animals. Little is known about the biosynthetic genes and their regulation in Aspergillus species. In this work, we identified and analysed three contiguous genes in A. steynii using 5'-RACE and genome walking approaches, which predicted a cytochrome P450 monooxygenase (p450ste), a non-ribosomal peptide synthetase (nrpsste) and a polyketide synthase (pksste). These three genes were contiguous within a 20742 bp long genomic DNA fragment. Their corresponding cDNAs were sequenced and their expression was analysed in three A. steynii strains using specific real-time RT-PCR assays in permissive conditions in in vitro cultures. OTA was also analysed in these cultures. Comparative analyses of predicted genomic, cDNA and amino acid sequences were performed with sequences of similar gene functions. All the results obtained in these analyses were consistent, pointed to the involvement of these three genes in OTA biosynthesis by A. steynii, and showed a co-ordinated expression pattern. This is the first time that a clustered organization of OTA biosynthetic genes has been reported in the genus Aspergillus. The results also suggested that this situation might be common in Aspergillus OTA-producing species and distinct from the one described for Penicillium species. Copyright © 2015 Elsevier B.V. All rights reserved.

  19. Comparative analyses identify molecular signature of MRI-classified SVZ-associated glioblastoma

    PubMed Central

    Lin, Chin-Hsing Annie; Rhodes, Christopher T.; Lin, ChenWei; Phillips, Joanna J.; Berger, Mitchel S.

    2017-01-01

    Glioblastoma (GBM) is a highly aggressive brain cancer with limited therapeutic options. While efforts to identify genes responsible for GBM have revealed mutations and aberrant gene expression associated with distinct types of GBM, patients with GBM are often diagnosed and classified based on MRI features. Therefore, we seek to identify molecular representatives in parallel with MRI classification for group I and group II primary GBM associated with the subventricular zone (SVZ). As group I and II GBM contain a stem-like signature, we compared gene expression profiles between these 2 groups of primary GBM and endogenous neural stem progenitor cells to reveal dysregulation of cell cycle, chromatin status, cellular morphogenesis, and signaling pathways in these 2 types of MRI-classified GBM. In the absence of IDH mutation, several genes associated with metabolism are differentially expressed in these subtypes of primary GBM, implying that metabolic reprogramming occurs in the tumor microenvironment. Furthermore, histone lysine methyltransferase EZH2 was upregulated while histone lysine demethylases KDM2 and KDM4 were downregulated in both group I and II primary GBM. Lastly, we identified 9 common genes across large data sets of gene expression profiles among MRI-classified group I/II GBM, a large cohort of GBM subtypes from TCGA, and glioma stem cells by unsupervised clustering comparison. These commonly upregulated genes have known functions in cell cycle, centromere assembly, chromosome segregation, and mitotic progression. Our findings highlight altered expression of genes important in chromosome integrity across all GBM, suggesting a common mechanism of disrupted fidelity of chromosome structure in GBM. PMID:28278055

  20. VRprofile: gene-cluster-detection-based profiling of virulence and antibiotic resistance traits encoded within genome sequences of pathogenic bacteria.

    PubMed

    Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu

    2017-01-10

    VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, as well as the genetic contexts related to the transfer of these traits, in newly sequenced pathogenic bacterial genomes. The backend database, MobilomeDB, was first built on sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. By integrating the homologous gene cluster search module with a sequence composition module, VRprofile has exhibited better performance for island-like region predictions than other widely used methods. In addition, VRprofile provides an integrated Web interface for aligning and visualizing identified gene clusters with MobilomeDB-archived gene clusters or a variety of bacterial genomes. VRprofile might help meet the increasing demand for re-annotation of bacterial variable regions and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press.

  1. Fragmentation of an aflatoxin-like gene cluster in a forest pathogen

    USDA-ARS's Scientific Manuscript database

    Secondary metabolic pathway genes are typically clustered in fungi. An exception to this paradigm is seen for genes required for the production of dothistromin, an aflatoxin-like virulence factor produced by the pine needle pathogen Dothistroma septosporum. In contrast to the tight clustering of gen...

  2. Genome mining-directed activation of a silent angucycline biosynthetic gene cluster in Streptomyces chattanoogensis.

    PubMed

    Zhou, Zhenxing; Xu, Qingqing; Bu, Qingting; Guo, Yuanyang; Liu, Shuiping; Liu, Yu; Du, Yiling; Li, Yongquan

    2015-02-09

    Genomic sequencing of actinomycetes has revealed the presence of numerous gene clusters seemingly capable of natural product biosynthesis, yet most clusters are cryptic under laboratory conditions. Bioinformatics analysis of the completely sequenced genome of Streptomyces chattanoogensis L10 (CGMCC 2644) revealed a silent angucycline biosynthetic gene cluster. The overexpression of a pathway-specific activator gene under the constitutive ermE* promoter successfully triggered the expression of the angucycline biosynthetic genes. Two novel members of the angucycline antibiotic family, chattamycins A and B, were further isolated and elucidated. Biological activity assays demonstrated that chattamycin B possesses good antitumor activities against human cancer cell lines and moderate antibacterial activities. The results presented here provide a feasible method to activate silent angucycline biosynthetic gene clusters to discover potential new drug leads. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. The ergot alkaloid gene cluster: functional analyses and evolutionary aspects.

    PubMed

    Lorenz, Nicole; Haarmann, Thomas; Pazoutová, Sylvie; Jung, Manfred; Tudzynski, Paul

    2009-01-01

    Ergot alkaloids and their derivatives have traditionally been used as therapeutic agents in migraine and blood pressure regulation and as aids in childbirth and abortion. Their production in submerged culture is a long-established biotechnological process. Ergot alkaloids are produced mainly by members of the genus Claviceps, with Claviceps purpurea the best-investigated species with respect to the biochemistry of ergot alkaloid synthesis (EAS). Genes encoding enzymes involved in EAS have been shown to be clustered; functional analyses of EAS cluster genes have allowed specific functions to be assigned to several gene products. Various Claviceps species differ with respect to their host specificity and their alkaloid content; comparison of the ergot alkaloid clusters in these species (and of clavine alkaloid clusters in other genera) yields interesting insights into the evolution of cluster structure. This review focuses on recently published and as yet unpublished data on the structure and evolution of the EAS gene cluster and on the function and regulation of cluster genes. These analyses also have significant biotechnological implications: the characterization of non-ribosomal peptide synthetases (NRPS) involved in the synthesis of the peptide moiety of ergopeptines has opened interesting perspectives for the synthesis of ergot alkaloids; in addition, defined mutants could be generated that produce interesting intermediates or only single peptide alkaloids (instead of the alkaloid mixtures usually produced by industrial strains).

  4. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

    PubMed

    Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S

    2016-12-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.
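
    The networking step described in this abstract (grouping individual gene clusters into gene cluster families) can be illustrated with a minimal sketch. It assumes each cluster is summarized as a set of domain labels, that similarity is the Jaccard index, and that a family is a connected component of the thresholded similarity graph. The cluster names, domain labels, and the 0.5 threshold are all hypothetical; the abstract does not state the similarity measure or cutoff actually used.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_families(clusters: dict, threshold: float = 0.5) -> list:
    """Group clusters into families: connected components of the graph
    whose edges join clusters with similarity >= threshold."""
    parent = {name: name for name in clusters}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(clusters, 2):
        if jaccard(clusters[a], clusters[b]) >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in clusters:
        families.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in families.values())


# Hypothetical clusters described by made-up domain labels (an assumption).
example = {
    "bgc1": {"KS", "AT", "ACP", "KR"},
    "bgc2": {"KS", "AT", "ACP"},
    "bgc3": {"C", "A", "PCP"},
    "bgc4": {"C", "A", "PCP", "TE"},
    "bgc5": {"terpene_synth"},
}
families = cluster_families(example)
print(families)
```

    In this toy input, a family that contains only clusters from newly sequenced genomes would count as novel relative to the reference database, which is the kind of estimate the abstract refers to.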

  5. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters

    PubMed Central

    Schorn, Michelle A.; Alanjary, Mohammad M.; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R.; Ziemert, Nadine

    2016-01-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites. PMID:27902408

  6. Unique Physiological and Transcriptional Shifts under Combinations of Salinity, Drought, and Heat.

    PubMed

    Shaar-Moshe, Lidor; Blumwald, Eduardo; Peleg, Zvi

    2017-05-01

    Climate-change-driven stresses such as extreme temperatures, water deficit, and ion imbalance are projected to intensify and jeopardize global food security. Under field conditions, these stresses usually occur simultaneously and cause damage that exceeds that of single stresses. Here, we investigated the transcriptional patterns and morpho-physiological acclimations of Brachypodium distachyon to single salinity, drought, and heat stresses, as well as their double and triple stress combinations. Hierarchical clustering analysis of morpho-physiological acclimations showed that several traits exhibited a gradually aggravating effect as plants were exposed to combined stresses. On the other hand, other morphological traits were dominated by salinity, while some physiological traits were shaped by heat stress. Response patterns of differentially expressed genes, under single and combined stresses (i.e. common stress genes), were maintained only among 37% of the genes, indicating a limited expression consistency among partially overlapping stresses. A comparison between common stress genes and genes that were uniquely expressed only under combined stresses (i.e. combination unique genes) revealed a significant shift from increased intensity to antagonistic responses, respectively. The different transcriptional signatures imply an alteration in the mode of action under combined stresses and limited ability to predict plant responses as different stresses are combined. Coexpression analysis coupled with enrichment analysis revealed that each gene subset was enriched with different biological processes. Common stress genes were enriched with known stress response pathways, while combination unique genes were enriched with unique processes and genes with unknown functions that hold the potential to improve stress tolerance and enhance cereal productivity under suboptimal field conditions. © 2017 American Society of Plant Biologists. All Rights Reserved.

  7. Scoring clustering solutions by their biological relevance.

    PubMed

    Gat-Viks, I; Sharan, R; Shamir, R

    2003-12-12

    A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.
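
    The scoring step described in this abstract (a non-parametric analysis of variance applied to projected attribute values) can be illustrated with a short sketch. It assumes the projection onto the real line has already been done and uses the Kruskal-Wallis H statistic as one example of a non-parametric ANOVA; the abstract does not name the test the authors used, and the data below are made up.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of groups of 1-D values.

    Ties receive mid-ranks; no further tie correction is applied.
    """
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1..j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += mid_rank
        i = j
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)


# Hypothetical projected attribute values under two clustering solutions.
solution_a = [[0.1, 0.2, 0.3], [1.1, 1.2, 1.3], [2.1, 2.2, 2.3]]  # well separated
solution_b = [[0.1, 1.2, 2.3], [1.1, 2.2, 0.3], [2.1, 0.2, 1.3]]  # mixed
h_a = kruskal_wallis_h(solution_a)
h_b = kruskal_wallis_h(solution_b)
print(h_a, h_b)
```

    A larger H means the clusters separate the projected values more strongly, so in this toy case solution A would be preferred over solution B.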

  8. Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure underpinning obesity

    PubMed Central

    Turcot, Valérie; Lu, Yingchang; Highland, Heather M; Schurmann, Claudia; Justice, Anne E; Fine, Rebecca S; Bradfield, Jonathan P; Esko, Tõnu; Giri, Ayush; Graff, Mariaelisa; Guo, Xiuqing; Hendricks, Audrey E; Karaderi, Tugce; Lempradl, Adelheid; Locke, Adam E; Mahajan, Anubha; Marouli, Eirini; Sivapalaratnam, Suthesh; Young, Kristin L; Alfred, Tamuno; Feitosa, Mary F; Masca, Nicholas GD; Manning, Alisa K; Medina-Gomez, Carolina; Mudgal, Poorva; Ng, Maggie CY; Reiner, Alex P; Vedantam, Sailaja; Willems, Sara M; Winkler, Thomas W; Abecasis, Goncalo; Aben, Katja K; Alam, Dewan S; Alharthi, Sameer E; Allison, Matthew; Amouyel, Philippe; Asselbergs, Folkert W; Auer, Paul L; Balkau, Beverley; Bang, Lia E; Barroso, Inês; Bastarache, Lisa; Benn, Marianne; Bergmann, Sven; Bielak, Lawrence F; Blüher, Matthias; Boehnke, Michael; Boeing, Heiner; Boerwinkle, Eric; Böger, Carsten A; Bork-Jensen, Jette; Bots, Michiel L; Bottinger, Erwin P; Bowden, Donald W; Brandslund, Ivan; Breen, Gerome; Brilliant, Murray H; Broer, Linda; Brumat, Marco; Burt, Amber A; Butterworth, Adam S; Campbell, Peter T; Cappellani, Stefania; Carey, David J; Catamo, Eulalia; Caulfield, Mark J; Chambers, John C; Chasman, Daniel I; Chen, Yii-Der Ida; Chowdhury, Rajiv; Christensen, Cramer; Chu, Audrey Y; Cocca, Massimiliano; Collins, Francis S; Cook, James P; Corley, Janie; Galbany, Jordi Corominas; Cox, Amanda J; Crosslin, David S; Cuellar-Partida, Gabriel; D'Eustacchio, Angela; Danesh, John; Davies, Gail; de Bakker, Paul IW; de Groot, Mark CH; de Mutsert, Renée; Deary, Ian J; Dedoussis, George; Demerath, Ellen W; den Heijer, Martin; den Hollander, Anneke I; den Ruijter, Hester M; Dennis, Joe G; Denny, Josh C; Di Angelantonio, Emanuele; Drenos, Fotios; Du, Mengmeng; Dubé, Marie-Pierre; Dunning, Alison M; Easton, Douglas F; Edwards, Todd L; Ellinghaus, David; Ellinor, Patrick T; Elliott, Paul; Evangelou, Evangelos; Farmaki, Aliki-Eleni; Farooqi, I. 
Sadaf; Faul, Jessica D; Fauser, Sascha; Feng, Shuang; Ferrannini, Ele; Ferrieres, Jean; Florez, Jose C; Ford, Ian; Fornage, Myriam; Franco, Oscar H; Franke, Andre; Franks, Paul W; Friedrich, Nele; Frikke-Schmidt, Ruth; Galesloot, Tessel E.; Gan, Wei; Gandin, Ilaria; Gasparini, Paolo; Gibson, Jane; Giedraitis, Vilmantas; Gjesing, Anette P; Gordon-Larsen, Penny; Gorski, Mathias; Grabe, Hans-Jörgen; Grant, Struan FA; Grarup, Niels; Griffiths, Helen L; Grove, Megan L; Gudnason, Vilmundur; Gustafsson, Stefan; Haessler, Jeff; Hakonarson, Hakon; Hammerschlag, Anke R; Hansen, Torben; Harris, Kathleen Mullan; Harris, Tamara B; Hattersley, Andrew T; Have, Christian T; Hayward, Caroline; He, Liang; Heard-Costa, Nancy L; Heath, Andrew C; Heid, Iris M; Helgeland, Øyvind; Hernesniemi, Jussi; Hewitt, Alex W; Holmen, Oddgeir L; Hovingh, G Kees; Howson, Joanna MM; Hu, Yao; Huang, Paul L; Huffman, Jennifer E; Ikram, M Arfan; Ingelsson, Erik; Jackson, Anne U; Jansson, Jan-Håkan; Jarvik, Gail P; Jensen, Gorm B; Jia, Yucheng; Johansson, Stefan; Jørgensen, Marit E; Jørgensen, Torben; Jukema, J Wouter; Kahali, Bratati; Kahn, René S; Kähönen, Mika; Kamstrup, Pia R; Kanoni, Stavroula; Kaprio, Jaakko; Karaleftheri, Maria; Kardia, Sharon LR; Karpe, Fredrik; Kathiresan, Sekar; Kee, Frank; Kiemeney, Lambertus A; Kim, Eric; Kitajima, Hidetoshi; Komulainen, Pirjo; Kooner, Jaspal S; Kooperberg, Charles; Korhonen, Tellervo; Kovacs, Peter; Kuivaniemi, Helena; Kutalik, Zoltán; Kuulasmaa, Kari; Kuusisto, Johanna; Laakso, Markku; Lakka, Timo A; Lamparter, David; Lange, Ethan M; Lange, Leslie A; Langenberg, Claudia; Larson, Eric B; Lee, Nanette R; Lehtimäki, Terho; Lewis, Cora E; Li, Huaixing; Li, Jin; Li-Gao, Ruifang; Lin, Honghuang; Lin, Keng-Hung; Lin, Li-An; Lin, Xu; Lind, Lars; Lindström, Jaana; Linneberg, Allan; Liu, Ching-Ti; Liu, Dajiang J; Liu, Yongmei; Lo, Ken Sin; Lophatananon, Artitaya; Lotery, Andrew J; Loukola, Anu; Luan, Jian'an; Lubitz, Steven A; Lyytikäinen, Leo-Pekka; Männistö, Satu; 
Marenne, Gaëlle; Mazul, Angela L; McCarthy, Mark I; McKean-Cowdin, Roberta; Medland, Sarah E; Meidtner, Karina; Milani, Lili; Mistry, Vanisha; Mitchell, Paul; Mohlke, Karen L; Moilanen, Leena; Moitry, Marie; Montgomery, Grant W; Mook-Kanamori, Dennis O; Moore, Carmel; Mori, Trevor A; Morris, Andrew D; Morris, Andrew P; Müller-Nurasyid, Martina; Munroe, Patricia B; Nalls, Mike A; Narisu, Narisu; Nelson, Christopher P; Neville, Matt; Nielsen, Sune F; Nikus, Kjell; Njølstad, Pål R; Nordestgaard, Børge G; Nyholt, Dale R; O'Connel, Jeffrey R; O’Donoghue, Michelle L.; Olde Loohuis, Loes M; Ophoff, Roel A; Owen, Katharine R; Packard, Chris J; Padmanabhan, Sandosh; Palmer, Colin NA; Palmer, Nicholette D; Pasterkamp, Gerard; Patel, Aniruddh P; Pattie, Alison; Pedersen, Oluf; Peissig, Peggy L; Peloso, Gina M; Pennell, Craig E; Perola, Markus; Perry, James A; Perry, John RB; Pers, Tune H; Person, Thomas N; Peters, Annette; Petersen, Eva RB; Peyser, Patricia A; Pirie, Ailith; Polasek, Ozren; Polderman, Tinca J; Puolijoki, Hannu; Raitakari, Olli T; Rasheed, Asif; Rauramaa, Rainer; Reilly, Dermot F; Renström, Frida; Rheinberger, Myriam; Ridker, Paul M; Rioux, John D; Rivas, Manuel A; Roberts, David J; Robertson, Neil R; Robino, Antonietta; Rolandsson, Olov; Rudan, Igor; Ruth, Katherine S; Saleheen, Danish; Salomaa, Veikko; Samani, Nilesh J; Sapkota, Yadav; Sattar, Naveed; Schoen, Robert E; Schreiner, Pamela J; Schulze, Matthias B; Scott, Robert A; Segura-Lepe, Marcelo P; Shah, Svati H; Sheu, Wayne H-H; Sim, Xueling; Slater, Andrew J; Small, Kerrin S; Smith, Albert Vernon; Southam, Lorraine; Spector, Timothy D; Speliotes, Elizabeth K; Starr, John M; Stefansson, Kari; Steinthorsdottir, Valgerdur; Stirrups, Kathleen E; Strauch, Konstantin; Stringham, Heather M; Stumvoll, Michael; Sun, Liang; Surendran, Praveen; Swift, Amy J; Tada, Hayato; Tansey, Katherine E; Tardif, Jean-Claude; Taylor, Kent D; Teumer, Alexander; Thompson, Deborah J; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; 
Thuesen, Betina H; Tönjes, Anke; Tromp, Gerard; Trompet, Stella; Tsafantakis, Emmanouil; Tuomilehto, Jaakko; Tybjaerg-Hansen, Anne; Tyrer, Jonathan P; Uher, Rudolf; Uitterlinden, André G; Uusitupa, Matti; van der Laan, Sander W; van Duijn, Cornelia M; van Leeuwen, Nienke; van Setten, Jessica; Vanhala, Mauno; Varbo, Anette; Varga, Tibor V; Varma, Rohit; Velez Edwards, Digna R; Vermeulen, Sita H; Veronesi, Giovanni; Vestergaard, Henrik; Vitart, Veronique; Vogt, Thomas F; Völker, Uwe; Vuckovic, Dragana; Wagenknecht, Lynne E; Walker, Mark; Wallentin, Lars; Wang, Feijie; Wang, Carol A; Wang, Shuai; Wang, Yiqin; Ware, Erin B; Wareham, Nicholas J; Warren, Helen R; Waterworth, Dawn M; Wessel, Jennifer; White, Harvey D; Willer, Cristen J; Wilson, James G; Witte, Daniel R; Wood, Andrew R; Wu, Ying; Yaghootkar, Hanieh; Yao, Jie; Yao, Pang; Yerges-Armstrong, Laura M; Young, Robin; Zeggini, Eleftheria; Zhan, Xiaowei; Zhang, Weihua; Zhao, Jing Hua; Zhao, Wei; Zhao, Wei; Zhou, Wei; Zondervan, Krina T; Rotter, Jerome I; Pospisilik, John A; Rivadeneira, Fernando; Borecki, Ingrid B; Deloukas, Panos; Frayling, Timothy M; Lettre, Guillaume; North, Kari E; Lindgren, Cecilia M; Hirschhorn, Joel N; Loos, Ruth JF

    2018-01-01

    Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, non-coding variants from which pinpointing causal genes remains challenging. Here, we combined data from 718,734 individuals to discover rare and low-frequency (MAF<5%) coding variants associated with BMI. We identified 14 coding variants in 13 genes, of which eight are in genes (ZBTB7B, ACHE, RAPGEF3, RAB21, ZFHX3, ENTPD6, ZFR2, ZNF169) newly implicated in human obesity, two are in genes (MC4R, KSR2) previously observed in extreme obesity, and two are variants in GIPR. Effect sizes of rare variants are ~10 times larger than those of common variants, with the largest effect observed in carriers of an MC4R stop-codon (p.Tyr35Ter, MAF=0.01%), who weigh ~7 kg more than non-carriers. Pathway analyses confirmed enrichment of neuronal genes and provide new evidence for adipocyte and energy expenditure biology, widening the potential of genetically-supported therapeutic targets to treat obesity. PMID:29273807

  9. Unraveling the efficiency of RAPD and SSR markers in diversity analysis and population structure estimation in common bean.

    PubMed

    Zargar, Sajad Majeed; Farhat, Sufia; Mahajan, Reetika; Bhakhri, Ayushi; Sharma, Arjun

    2016-01-01

    Increasing food production vis-à-vis food quality is important to feed the growing human population and to attain food as well as nutritional security. The availability of diverse germplasm of any crop is an important genetic resource for mining genes that may assist in attaining food as well as nutritional security. Here we used 15 RAPD and 23 SSR markers to elucidate diversity among 51 common bean genotypes, mostly landraces, collected from the Himalayan region of Jammu and Kashmir, India. We observed that both marker types are highly polymorphic. The discriminatory power of these markers was determined using various parameters such as percent polymorphism, PIC, resolving power and marker index. The 15 RAPDs produced 171 polymorphic bands, while the 23 SSRs produced 268 polymorphic bands. SSRs showed a higher PIC value (0.300) than RAPDs (0.243). Further, the resolving power of SSRs was 5.241 compared to 3.86 for RAPDs. However, RAPDs showed a higher marker index (2.69) than SSRs (1.279), which may be attributed to their higher multiplex ratio. The dendrograms generated with hierarchical UPGMA cluster analysis grouped genotypes into two main clusters with various degrees of sub-clustering within each cluster. Both marker systems showed comparable accuracy in grouping genotypes of common bean according to their area of cultivation. The model-based STRUCTURE analysis using 15 RAPD and 23 SSR markers identified a population with 3 sub-populations, which corresponds to the distance-based groupings. A high level of genetic diversity was observed within the population. These findings have further implications for common bean breeding as well as conservation programs.
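
    Two of the marker statistics named in this abstract can be sketched as follows. The sketch assumes the widely used definitions: polymorphism information content in the Botstein et al. (1980) form, and resolving power as the sum of band informativeness Ib = 1 - 2|0.5 - p| over the bands of a primer. The abstract does not state which formulas the authors applied, and the frequencies below are hypothetical.

```python
def pic(allele_freqs):
    """Polymorphism information content (Botstein et al. 1980 form):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    sq = [p * p for p in allele_freqs]
    homozygosity = sum(sq)
    cross = sum(
        2 * sq[i] * sq[j]
        for i in range(len(sq))
        for j in range(i + 1, len(sq))
    )
    return 1.0 - homozygosity - cross


def resolving_power(band_freqs):
    """Resolving power: sum of band informativeness Ib = 1 - 2*|0.5 - p|,
    where p is the fraction of genotypes carrying the band."""
    return sum(1.0 - 2.0 * abs(0.5 - p) for p in band_freqs)


# Hypothetical frequencies, not taken from the study.
pic_two_equal_alleles = pic([0.5, 0.5])
rp_example = resolving_power([0.5, 1.0, 0.25])
print(pic_two_equal_alleles, rp_example)
```

    A band present in half of the genotypes contributes the maximum informativeness of 1, while a band present in all genotypes contributes 0, which is why a primer with many intermediate-frequency bands gets a high resolving power.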

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liebhaber, S.A.; Weiss, I.; Cash, F.E.

    Synthesis of normal human hemoglobin A, α₂β₂, is based upon balanced expression of genes in the α-globin gene cluster on chromosome 16 and the β-globin gene cluster on chromosome 11. Full levels of erythroid-specific activation of the β-globin cluster depend on sequences located at a considerable distance 5′ to the β-globin gene, referred to as the locus-activating or dominant control region. The existence of an analogous element(s) upstream of the α-globin cluster has been suggested from observations on naturally occurring deletions and experimental studies. The authors have identified an individual with α-thalassemia in whom structurally normal α-globin genes have been inactivated in cis by a discrete de novo 35-kilobase deletion located ~30 kilobases 5′ from the α-globin gene cluster. They conclude that this deletion inactivates expression of the α-globin genes by removing one or more of the previously identified upstream regulatory sequences that are critical to expression of the α-globin genes.

  11. Discovery of a Phosphonoacetic Acid Derived Natural Product by Pathway Refactoring.

    PubMed

    Freestone, Todd S; Ju, Kou-San; Wang, Bin; Zhao, Huimin

    2017-02-17

    The activation of silent natural product gene clusters is a synthetic biology problem of great interest. As the rate at which gene clusters are identified outpaces the discovery rate of new molecules, this unknown chemical space is rapidly growing, as too are the rewards for developing technologies to exploit it. One class of natural products that has been underrepresented is phosphonic acids, which have important medical and agricultural uses. Hundreds of phosphonic acid biosynthetic gene clusters have been identified encoding for unknown molecules. Although methods exist to elicit secondary metabolite gene clusters in native hosts, they require the strain to be amenable to genetic manipulation. One method to circumvent this is pathway refactoring, which we implemented in an effort to discover new phosphonic acids from a gene cluster from Streptomyces sp. strain NRRL F-525. By reengineering this cluster for expression in the production host Streptomyces lividans, utility of refactoring is demonstrated with the isolation of a novel phosphonic acid, O-phosphonoacetic acid serine, and the characterization of its biosynthesis. In addition, a new biosynthetic branch point is identified with a phosphonoacetaldehyde dehydrogenase, which was used to identify additional phosphonic acid gene clusters that share phosphonoacetic acid as an intermediate.

  12. The intact dupA cluster is a more reliable Helicobacter pylori virulence marker than dupA alone.

    PubMed

    Jung, Sung Woo; Sugimoto, Mitsushige; Shiota, Seiji; Graham, David Y; Yamaoka, Yoshio

    2012-01-01

    The duodenal ulcer promoting (dupA) gene, located in the plasticity region of Helicobacter pylori, is associated with duodenal ulcer development. dupA was predicted to form a type IV secretory system (T4SS) with vir genes around dupA (dupA cluster). We investigated the prevalence of dupA and dupA clusters and clarified associations between the dupA cluster status and clinical outcomes in the U.S. population. In all, 245 H. pylori strains were examined using PCR to evaluate the status of dupA and the adjacent vir genes predicted to form T4SS, in addition to the status of the cag pathogenicity island (PAI). The associations between dupA cluster status and interleukin-8 (IL-8) and IL-12 production were also examined. The presence of dupA and all adjacent vir genes was defined as a complete dupA cluster. Many variations related to the status of dupA and dupA cluster genes were identified. H. pylori infection with a complete dupA cluster increased duodenal ulcer risk compared with H. pylori infection with an incomplete dupA cluster or without the dupA gene, independent of cag PAI status (adjusted odds ratio, 2.13; 95% confidence interval, 1.13 to 4.03). Gastric mucosal IL-8 levels were also significantly higher in the complete dupA cluster group than in other groups (P=0.01). In conclusion, although the causal relationship between the dupA cluster and duodenal ulcer development is not proved, the presence of a complete dupA cluster, but not dupA alone, is associated with duodenal ulcer development.
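
    The odds ratio and confidence interval reported in this abstract are adjusted estimates, presumably from a regression model. As a simpler illustration of what such figures express, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. The counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts (an assumption for illustration only).
or_, lo, hi = odds_ratio_ci(20, 30, 25, 80)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

    An interval that excludes 1, as in the abstract's 1.13 to 4.03, indicates an association that is statistically significant at the 5% level.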

  13. The Intact dupA Cluster Is a More Reliable Helicobacter pylori Virulence Marker than dupA Alone

    PubMed Central

    Jung, Sung Woo; Sugimoto, Mitsushige; Shiota, Seiji; Graham, David Y.

    2012-01-01

    The duodenal ulcer promoting (dupA) gene, located in the plasticity region of Helicobacter pylori, is associated with duodenal ulcer development. dupA was predicted to form a type IV secretory system (T4SS) with vir genes around dupA (dupA cluster). We investigated the prevalence of dupA and dupA clusters and clarified associations between the dupA cluster status and clinical outcomes in the U.S. population. In all, 245 H. pylori strains were examined using PCR to evaluate the status of dupA and the adjacent vir genes predicted to form T4SS, in addition to the status of the cag pathogenicity island (PAI). The associations between dupA cluster status and interleukin-8 (IL-8) and IL-12 production were also examined. The presence of dupA and all adjacent vir genes was defined as a complete dupA cluster. Many variations related to the status of dupA and dupA cluster genes were identified. H. pylori infection with a complete dupA cluster increased duodenal ulcer risk compared with H. pylori infection with an incomplete dupA cluster or without the dupA gene, independent of cag PAI status (adjusted odds ratio, 2.13; 95% confidence interval, 1.13 to 4.03). Gastric mucosal IL-8 levels were also significantly higher in the complete dupA cluster group than in other groups (P = 0.01). In conclusion, although the causal relationship between the dupA cluster and duodenal ulcer development is not proved, the presence of a complete dupA cluster, but not dupA alone, is associated with duodenal ulcer development. PMID:22038914

  14. Genetic analysis reveals the identity of the photoreceptor for phototaxis in hormogonium filaments of Nostoc punctiforme.

    PubMed

    Campbell, Elsie L; Hagen, Kari D; Chen, Rui; Risser, Douglas D; Ferreira, Daniela P; Meeks, John C

    2015-02-15

    In cyanobacterial Nostoc species, substratum-dependent gliding motility is confined to specialized nongrowing filaments called hormogonia, which differentiate from vegetative filaments as part of a conditional life cycle and function as dispersal units. Here we confirm that Nostoc punctiforme hormogonia are positively phototactic to white light over a wide range of intensities. N. punctiforme contains two gene clusters (clusters 2 and 2i), each of which encodes modular cyanobacteriochrome-methyl-accepting chemotaxis proteins (MCPs) and other proteins that putatively constitute a basic chemotaxis-like signal transduction complex. Transcriptional analysis established that all genes in clusters 2 and 2i, plus two additional clusters (clusters 1 and 3) with genes encoding MCPs lacking cyanobacteriochrome sensory domains, are upregulated during the differentiation of hormogonia. Mutational analysis determined that only genes in cluster 2i are essential for positive phototaxis in N. punctiforme hormogonia; here these genes are designated ptx (for phototaxis) genes. The cluster is unusual in containing complete or partial duplicates of genes encoding proteins homologous to the well-described chemotaxis elements CheY, CheW, MCP, and CheA. The cyanobacteriochrome-MCP gene (ptxD) lacks transmembrane domains and has 7 potential binding sites for bilins. The transcriptional start site of the ptx genes does not resemble a sigma 70 consensus recognition sequence; moreover, it is upstream of two genes encoding gas vesicle proteins (gvpA and gvpC), which also are expressed only in the hormogonium filaments of N. punctiforme. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  15. Identifying a gene expression signature of cluster headache in blood

    PubMed Central

    Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; ‘t Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.

    2017-01-01

    Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859

  16. Ancient genes establish stress-induced mutation as a hallmark of cancer.

    PubMed

    Cisneros, Luis; Bussey, Kimberly J; Orr, Adam J; Miočević, Milica; Lineweaver, Charles H; Davies, Paul

    2017-01-01

    Cancer is sometimes depicted as a reversion to single cell behavior in cells adapted to live in a multicellular assembly. If this is the case, one would expect that mutation in cancer disrupts functional mechanisms that suppress cell-level traits detrimental to multicellularity. Such mechanisms should have evolved with or after the emergence of multicellularity. This leads to two related, but distinct hypotheses: 1) Somatic mutations in cancer will occur in genes that are younger than the emergence of multicellularity (1000 million years [MY]); and 2) genes that are frequently mutated in cancer and whose mutations are functionally important for the emergence of the cancer phenotype evolved within the past 1000 million years, and thus would exhibit an age distribution that is skewed to younger genes. In order to investigate these hypotheses we estimated the evolutionary ages of all human genes and then studied the probability of mutation and their biological function in relation to their age and genomic location for both normal germline and cancer contexts. We observed that under a model of uniform random mutation across the genome, controlled for gene size, genes less than 500 MY were more frequently mutated in both cases. Paradoxically, causal genes, defined in the COSMIC Cancer Gene Census, were depleted in this age group. When we used functional enrichment analysis to explain this unexpected result we discovered that COSMIC genes with recessive disease phenotypes were enriched for DNA repair and cell cycle control. The non-mutated genes in these pathways are orthologous to those underlying stress-induced mutation in bacteria, which results in the clustering of single nucleotide variations. COSMIC genes were less common in regions where the probability of observing mutational clusters is high, although they are approximately 2-fold more likely to harbor mutational clusters compared to other human genes. 
Our results suggest this ancient mutational response to stress that evolved among prokaryotes was co-opted to maintain diversity in the germline and immune system, while the original phenotype is restored in cancer. Reversion to a stress-induced mutational response is a hallmark of cancer that allows for effectively searching "protected" genome space where genes causally implicated in cancer are located and underlies the high adaptive potential and concomitant therapeutic resistance that is characteristic of cancer.

  17. Ancient genes establish stress-induced mutation as a hallmark of cancer

    PubMed Central

    Orr, Adam J.; Miočević, Milica; Lineweaver, Charles H.; Davies, Paul

    2017-01-01

    Cancer is sometimes depicted as a reversion to single cell behavior in cells adapted to live in a multicellular assembly. If this is the case, one would expect that mutation in cancer disrupts functional mechanisms that suppress cell-level traits detrimental to multicellularity. Such mechanisms should have evolved with or after the emergence of multicellularity. This leads to two related, but distinct hypotheses: 1) Somatic mutations in cancer will occur in genes that are younger than the emergence of multicellularity (1000 million years [MY]); and 2) genes that are frequently mutated in cancer and whose mutations are functionally important for the emergence of the cancer phenotype evolved within the past 1000 million years, and thus would exhibit an age distribution that is skewed to younger genes. In order to investigate these hypotheses we estimated the evolutionary ages of all human genes and then studied the probability of mutation and their biological function in relation to their age and genomic location for both normal germline and cancer contexts. We observed that under a model of uniform random mutation across the genome, controlled for gene size, genes less than 500 MY were more frequently mutated in both cases. Paradoxically, causal genes, defined in the COSMIC Cancer Gene Census, were depleted in this age group. When we used functional enrichment analysis to explain this unexpected result we discovered that COSMIC genes with recessive disease phenotypes were enriched for DNA repair and cell cycle control. The non-mutated genes in these pathways are orthologous to those underlying stress-induced mutation in bacteria, which results in the clustering of single nucleotide variations. COSMIC genes were less common in regions where the probability of observing mutational clusters is high, although they are approximately 2-fold more likely to harbor mutational clusters compared to other human genes. 
Our results suggest this ancient mutational response to stress that evolved among prokaryotes was co-opted to maintain diversity in the germline and immune system, while the original phenotype is restored in cancer. Reversion to a stress-induced mutational response is a hallmark of cancer that allows for effectively searching “protected” genome space where genes causally implicated in cancer are located and underlies the high adaptive potential and concomitant therapeutic resistance that is characteristic of cancer. PMID:28441401

  18. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands

    PubMed Central

    de Cambiaire, Jean-Charles; Otis, Christian; Lemieux, Claude; Turmel, Monique

    2006-01-01

    Background The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. While the basal position of the Prasinophyceae is well established, the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae (UTC) remains uncertain. The five complete chloroplast DNA (cpDNA) sequences currently available for representatives of these classes display considerable variability in overall structure, gene content, gene density, intron content and gene order. Among these genomes, that of the chlorophycean green alga Chlamydomonas reinhardtii has retained the least ancestral features. The two single-copy regions, which are separated from one another by the large inverted repeat (IR), have similar sizes, rather than unequal sizes, and differ radically in both gene contents and gene organizations relative to the single-copy regions of prasinophyte and ulvophyte cpDNAs. To gain insights into the various changes that the chloroplast genome underwent during the evolution of chlorophycean green algae, we have sequenced the cpDNA of Scenedesmus obliquus, a member of a distinct chlorophycean lineage. Results The 161,452 bp IR-containing genome of Scenedesmus features single-copy regions of similar sizes, encodes 96 genes, i.e. only two additional genes (infA and rpl12) relative to its Chlamydomonas homologue and contains seven group I and two group II introns. It is clearly more compact than the four UTC algal cpDNAs that have been examined so far, displays the lowest proportion of short repeats among these algae and shows a stronger bias in clustering of genes on the same DNA strand compared to Chlamydomonas cpDNA. Like the latter genome, Scenedesmus cpDNA displays only a few ancestral gene clusters. The two chlorophycean genomes share 11 gene clusters that are not found in previously sequenced trebouxiophyte and ulvophyte cpDNAs as well as a few genes that have an unusual structure; however, their single-copy regions differ considerably in gene content. Conclusion Our results underscore the remarkable plasticity of the chlorophycean chloroplast genome. Owing to this plasticity, only a sketchy portrait could be drawn for the chloroplast genome of the last common ancestor of Scenedesmus and Chlamydomonas. PMID:16638149

  19. Genes with a spike expression are clustered in chromosome (sub)bands and spike (sub)bands have a powerful prognostic value in patients with multiple myeloma

    PubMed Central

    Kassambara, Alboukadel; Hose, Dirk; Moreaux, Jérôme; Walker, Brian A.; Protopopov, Alexei; Reme, Thierry; Pellestor, Franck; Pantesco, Véronique; Jauch, Anna; Morgan, Gareth; Goldschmidt, Hartmut; Klein, Bernard

    2012-01-01

    Background Genetic abnormalities are common in patients with multiple myeloma, and may deregulate gene products involved in tumor survival, proliferation, metabolism and drug resistance. In particular, translocations may result in a high expression of targeted genes (termed spike expression) in tumor cells. We identified spike genes in multiple myeloma cells of patients with newly-diagnosed myeloma and investigated their prognostic value. Design and Methods Genes with a spike expression in multiple myeloma cells were picked up using box plot probe set signal distribution and two selection filters. Results In a cohort of 206 newly diagnosed patients with multiple myeloma, 2587 genes/expressed sequence tags with a spike expression were identified. Some spike genes were associated with some transcription factors such as MAF or MMSET and with known recurrent translocations as expected. Spike genes were not associated with increased DNA copy number and for a majority of them, involved unknown mechanisms. Of spiked genes, 36.7% clustered significantly in 149 out of 862 documented chromosome (sub)bands, of which 53 had prognostic value (35 bad, 18 good). Their prognostic value was summarized with a spike band score that delineated 23.8% of patients with a poor median overall survival (27.4 months versus not reached, P<0.001) using the training cohort of 206 patients. The spike band score was independent of other gene expression profiling-based risk scores, t(4;14), or del17p in an independent validation cohort of 345 patients. Conclusions We present a new approach to identify spike genes and their relationship to patients’ survival. PMID:22102711
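
    As a rough illustration of the box plot step mentioned in the abstract above, the sketch below flags samples whose probe set signal lies above the upper box plot fence. The function name, the fence multiplier k = 3 and the toy signal are assumptions made for illustration; the abstract does not state the thresholds used in the study, and its two selection filters are not reproduced here.

```python
import numpy as np

def spike_samples(signal, k=3.0):
    """Return indices of samples whose signal exceeds Q3 + k * IQR.

    signal: 1-D sequence of probe set signals, one value per patient.
    k: fence multiplier (assumed value, not taken from the study).
    """
    signal = np.asarray(signal, dtype=float)
    q1, q3 = np.percentile(signal, [25, 75])   # lower and upper quartiles
    upper_fence = q3 + k * (q3 - q1)           # upper box plot fence
    return np.flatnonzero(signal > upper_fence)

# Toy data: eight patients with background signal and one with a spike.
toy_signal = [10, 11, 12, 10, 11, 12, 10, 11, 200]
spikes = spike_samples(toy_signal)
```

    In this toy example only the last patient would be reported as showing spike expression for the probe set.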

  20. Identification and characterization of the ergochrome gene cluster in the plant pathogenic fungus Claviceps purpurea.

    PubMed

    Neubauer, Lisa; Dopstadt, Julian; Humpf, Hans-Ulrich; Tudzynski, Paul

    2016-01-01

    Claviceps purpurea is a phytopathogenic fungus infecting a broad range of grasses including economically important cereal crop plants. The infection cycle ends with the formation of the typical purple-black pigmented sclerotia containing the toxic ergot alkaloids. Besides these ergot alkaloids little is known about the secondary metabolism of the fungus. Red anthraquinone derivatives and yellow xanthone dimers (ergochromes) have been isolated from sclerotia and described as ergot pigments, but the corresponding gene cluster has remained unknown. Fungal pigments are gaining increasing interest, for example as environmentally friendly alternatives to existing dyes. Furthermore, several pigments show biological activities and may have some pharmaceutical value. This study identified the gene cluster responsible for the synthesis of the ergot pigments. Overexpression of the cluster-specific transcription factor led to activation of the gene cluster and to the production of several known ergot pigments. Knockout of the cluster key enzyme, a nonreducing polyketide synthase, clearly showed that this cluster is responsible for the production of red anthraquinones as well as yellow ergochromes. Furthermore, a tentative biosynthetic pathway for the ergot pigments is proposed. By changing the culture conditions, pigment production was activated in axenic culture: a high concentration of phosphate and a low concentration of sucrose induced pigment synthesis. This is the first functional analysis of a secondary metabolite gene cluster in the ergot fungus besides that for the classical ergot alkaloids. We demonstrated that this gene cluster is responsible for the typical purple-black color of the ergot sclerotia and showed that the red and yellow ergot pigments are products of the same biosynthetic pathway. Activation of the gene cluster in axenic culture opened up new possibilities for biotechnological applications such as dye production or the development of new pharmaceuticals.

  1. Evidence for Horizontal Gene Transfer in Evolution of Elongation Factor Tu in Enterococci

    PubMed Central

    Ke, Danbing; Boissinot, Maurice; Huletsky, Ann; Picard, François J.; Frenette, Johanne; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.

    2000-01-01

    The elongation factor Tu, encoded by tuf genes, is a GTP binding protein that plays a central role in protein synthesis. One to three tuf genes per genome are present, depending on the bacterial species. Most low-G+C-content gram-positive bacteria carry only one tuf gene. We have designed degenerate PCR primers derived from consensus sequences of the tuf gene to amplify partial tuf sequences from 17 enterococcal species and other phylogenetically related species. The amplified DNA fragments were sequenced either by direct sequencing or by sequencing cloned inserts containing putative amplicons. Two different tuf genes (tufA and tufB) were found in 11 enterococcal species, including Enterococcus avium, Enterococcus casseliflavus, Enterococcus dispar, Enterococcus durans, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Enterococcus malodoratus, Enterococcus mundtii, Enterococcus pseudoavium, and Enterococcus raffinosus. For the other six enterococcal species (Enterococcus cecorum, Enterococcus columbae, Enterococcus faecalis, Enterococcus sulfureus, Enterococcus saccharolyticus, and Enterococcus solitarius), only the tufA gene was present. Based on 16S rRNA gene sequence analysis, the 11 species having two tuf genes all have a common ancestor, while the six species having only one copy diverged from the enterococcal lineage before that common ancestor. The presence of one or two copies of the tuf gene in enterococci was confirmed by Southern hybridization. Phylogenetic analysis of tuf sequences demonstrated that the enterococcal tufA gene branches with the Bacillus, Listeria, and Staphylococcus genera, while the enterococcal tufB gene clusters with the genera Streptococcus and Lactococcus. Primary structure analysis showed that four amino acid residues encoded within the sequenced regions are conserved and unique to the enterococcal tufB genes and the tuf genes of streptococci and Lactococcus lactis. 
The data suggest that an ancestral streptococcus or a streptococcus-related species may have horizontally transferred a tuf gene to the common ancestor of the 11 enterococcal species which now carry two tuf genes. PMID:11092850

  2. Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

    PubMed Central

    2012-01-01

    Background Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. Results We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. Conclusions Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enables improved modelling and consequent clustering of time-course data. PMID:23151154
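
    To make the phrase "autoregressive random effects of the first order" more concrete, the sketch below builds the covariance matrix of a stationary AR(1) process over equally spaced time points, in which the correlation between two measurements decays as rho^|i-j|. This is the textbook AR(1) form, not code from the EMMIX-WIRE package; the function name and parameterization are assumptions, and the published model may parameterize the random effects differently.

```python
import numpy as np

def ar1_covariance(n_times, rho, sigma2=1.0):
    """Covariance of a stationary AR(1) process at n_times equally spaced points.

    rho: lag-1 autocorrelation, with |rho| < 1.
    sigma2: innovation variance (hypothetical parameter name).
    """
    t = np.arange(n_times)
    lags = np.abs(np.subtract.outer(t, t))      # |i - j| for every pair of time points
    return sigma2 / (1.0 - rho ** 2) * rho ** lags

# Example: five time points with moderate positive autocorrelation.
cov = ar1_covariance(5, 0.5)
```

    Measurements that are close in time receive a higher covariance than distant ones, which is the dependence that a plain Fourier series fit ignores.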

  3. New natural products isolated from Metarhizium robertsii ARSEF 23 by chemical screening and identification of the gene cluster through engineered biosynthesis in Aspergillus nidulans A1145.

    PubMed

    Kato, Hiroki; Tsunematsu, Yuta; Yamamoto, Tsuyoshi; Namiki, Takuya; Kishimoto, Shinji; Noguchi, Hiroshi; Watanabe, Kenji

    2016-07-01

    To rapidly identify novel natural products and their associated biosynthetic genes from underutilized and genetically difficult-to-manipulate microbes, we developed a method that uses (1) chemical screening to isolate novel microbial secondary metabolites, (2) bioinformatic analyses to identify a potential biosynthetic gene cluster and (3) heterologous expression of the genes in a convenient host to confirm the identity of the gene cluster and the proposed biosynthetic mechanism. The chemical screen was achieved by searching known natural product databases with data from liquid chromatographic and high-resolution mass spectrometric analyses collected on the extract from a target microbe culture. Using this method, we were able to isolate two new meroterpenes, subglutinols C (1) and D (2), from an entomopathogenic filamentous fungus Metarhizium robertsii ARSEF 23. Bioinformatics analysis of the genome allowed us to identify a gene cluster likely to be responsible for the formation of subglutinols. Heterologous expression of three genes from the gene cluster encoding a polyketide synthase, a prenyltransferase and a geranylgeranyl pyrophosphate synthase in Aspergillus nidulans A1145 afforded an α-pyrone-fused uncyclized diterpene, the expected intermediate of the subglutinol biosynthesis, thereby confirming the gene cluster to be responsible for the subglutinol biosynthesis. These results indicate the usefulness of our methodology in isolating new natural products and identifying their associated biosynthetic gene cluster from microbes that are not amenable to genetic manipulation. Our method should facilitate the natural product discovery efforts by expediting the identification of new secondary metabolites and their associated biosynthetic genes from a wider source of microbes.

  4. The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster

    PubMed Central

    Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.

    2013-01-01

    The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role of secondary metabolite gene clusters and their metabolites in fungal biology. PMID:23818858

  5. Evidence against the selfish operon theory.

    PubMed

    Pál, Csaba; Hurst, Laurence D

    2004-06-01

    According to the selfish operon hypothesis, the clustering of genes and their subsequent organization into operons is beneficial for the constituent genes because it enables the horizontal gene transfer of weakly selected, functionally coupled genes. The majority of these are expected to be non-essential genes. From our analysis of the Escherichia coli genome, we conclude that the selfish operon hypothesis is unlikely to provide a general explanation for clustering nor can it account for the gene composition of operons. Contrary to expectations, essential genes with related functions have an especially strong tendency to cluster, even if they are not in operons. Moreover, essential genes are particularly abundant in operons.

  6. Missing link in the evolution of Hox clusters.

    PubMed

    Ogishima, Soichi; Tanaka, Hiroshi

    2007-01-31

    The Hox cluster plays key roles in regulating the patterning of the antero-posterior axis in a metazoan embryo. It consists of the anterior, central and posterior genes; the central genes have been identified only in bilaterians, not in cnidarians, and are responsible for achieving morphological complexity in bilaterian development. However, their evolutionary history has not been revealed; that is, there has been a "missing link". Here we show the evolutionary history of Hox clusters of 18 bilaterians and 2 cnidarians by using a new method, "motif-based reconstruction", examining the gain/loss processes of evolutionarily conserved sequences, "motifs", outside the homeodomain. We successfully identified the missing link in the evolution of Hox clusters between the cnidarian-bilaterian ancestor and the bilaterians as the ancestor of the central genes, which we call the proto-central gene. Searching for the gene that corresponds to the proto-central gene, we found that one of the acoel Hox genes has the same motif repertoire as that of the proto-central gene. This interesting finding suggests that the acoel Hox cluster corresponds to the missing link in the evolution of the Hox cluster between the cnidarian-bilaterian ancestor and the bilaterians. Our findings suggest that motif gains/diversifications led to the explosive diversity of the bilaterian body plan.

  7. Heterochromatin influences the secondary metabolite profile in the plant pathogen Fusarium graminearum

    PubMed Central

    Reyes-Dominguez, Yazmid; Boedi, Stefan; Sulyok, Michael; Wiesenberger, Gerlinde; Stoppacher, Norbert; Krska, Rudolf; Strauss, Joseph

    2012-01-01

    Chromatin modifications and heterochromatic marks have been shown to be involved in the regulation of secondary metabolism gene clusters in the fungal model system Aspergillus nidulans. We examine here the role of HEP1, the heterochromatin protein homolog of Fusarium graminearum, for the production of secondary metabolites. Deletion of Hep1 in a PH-1 background strongly influences expression of genes required for the production of aurofusarin and the main trichothecene metabolite DON. In the Hep1 deletion strains, AUR genes are highly up-regulated and aurofusarin production is greatly enhanced, suggesting a repressive role for heterochromatin on gene expression of this cluster. Unexpectedly, gene expression and metabolites are lower for the trichothecene cluster, suggesting a positive function of Hep1 for DON biosynthesis. However, analysis of histone modifications in chromatin of AUR and DON gene promoters reveals that in both gene clusters the H3K9me3 heterochromatic mark is strongly reduced in the Hep1 deletion strain. This, and the finding that a DON-cluster flanking gene is up-regulated, suggests that the DON biosynthetic cluster is repressed by HEP1 directly and indirectly. Results from this study point to a conserved mode of secondary metabolite (SM) biosynthesis regulation in fungi by chromatin modifications and the formation of facultative heterochromatin. PMID:22100541

  8. Whole-Genome Duplication and the Functional Diversification of Teleost Fish Hemoglobins

    PubMed Central

    Opazo, Juan C.; Butts, G. Tyler; Nery, Mariana F.; Storz, Jay F.; Hoffmann, Federico G.

    2013-01-01

    Subsequent to the two rounds of whole-genome duplication that occurred in the common ancestor of vertebrates, a third genome duplication occurred in the stem lineage of teleost fishes. This teleost-specific genome duplication (TGD) is thought to have provided genetic raw materials for the physiological, morphological, and behavioral diversification of this highly speciose group. The extreme physiological versatility of teleost fish is manifest in their diversity of blood–gas transport traits, which reflects the myriad solutions that have evolved to maintain tissue O2 delivery in the face of changing metabolic demands and environmental O2 availability during different ontogenetic stages. During the course of development, regulatory changes in blood–O2 transport are mediated by the expression of multiple, functionally distinct hemoglobin (Hb) isoforms that meet the particular O2-transport challenges encountered by the developing embryo or fetus (in viviparous or oviparous species) and in free-swimming larvae and adults. The main objective of the present study was to assess the relative contributions of whole-genome duplication, large-scale segmental duplication, and small-scale gene duplication in producing the extraordinary functional diversity of teleost Hbs. To accomplish this, we integrated phylogenetic reconstructions with analyses of conserved synteny to characterize the genomic organization and evolutionary history of the globin gene clusters of teleosts. These results were then integrated with available experimental data on functional properties and developmental patterns of stage-specific gene expression. Our results indicate that multiple α- and β-globin genes were present in the common ancestor of gars (order Lepisosteiformes) and teleosts. The comparative genomic analysis revealed that teleosts possess a dual set of TGD-derived globin gene clusters, each of which has undergone lineage-specific changes in gene content via repeated duplication and deletion events. Phylogenetic reconstructions revealed that paralogous genes convergently evolved similar functional properties in different teleost lineages. Consistent with other recent studies of globin gene family evolution in vertebrates, our results revealed evidence for repeated evolutionary transitions in the developmental regulation of Hb synthesis. PMID:22949522

  9. Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features

    PubMed Central

    2011-01-01

    Background Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Results Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. 
Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer. PMID:22044755

  10. Globin gene structure in a reptile supports the transpositional model for amniote α- and β-globin gene evolution.

    PubMed

    Patel, Vidushi S; Ezaz, Tariq; Deakin, Janine E; Graves, Jennifer A Marshall

    2010-12-01

    The haemoglobin protein, required for oxygen transportation in the body, is encoded by α- and β-globin genes that are arranged in clusters. The transpositional model for the evolution of distinct α-globin and β-globin clusters in amniotes is much simpler than the previously proposed whole genome duplication model. According to this model, all jawed vertebrates share one ancient region containing α- and β-globin genes and several flanking genes in the order MPG-C16orf35-(α-β)-GBY-LUC7L that has been conserved for more than 410 million years, whereas amniotes evolved a distinct β-globin cluster by insertion of a transposed β-globin gene from this ancient region into a cluster of olfactory receptors flanked by CCKBR and RRM1. It could not be determined whether this organisation is conserved in all amniotes because of the paucity of information from non-avian reptiles. To fill in this gap, we examined globin gene organisation in a squamate reptile, the Australian bearded dragon lizard, Pogona vitticeps (Agamidae). We report here that the α-globin cluster (HBK, HBA) is flanked by C16orf35 and GBY and is located on a pair of microchromosomes, whereas the β-globin cluster is flanked by RRM1 on the 3' end and is located on the long arm of chromosome 3. However, the CCKBR gene that flanks the β-globin cluster on the 5' end in other amniotes is located on the short arm of chromosome 5 in P. vitticeps, indicating that a chromosomal break between the β-globin cluster and CCKBR occurred at least in the agamid lineage. Our data from a reptile species provide further evidence to support the transpositional model for the evolution of β-globin gene cluster in amniotes.

  11. ExprAlign - the identification of ESTs in non-model species by alignment of cDNA microarray expression profiles

    PubMed Central

    2009-01-01

    Background Sequence identification of ESTs from non-model species offers distinct challenges, particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286
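
    The core idea of the abstract above, using expression correlation as evidence of identity, can be sketched in a few lines. The example below is not the ExprAlign implementation: it replaces the network "mountain" clustering with a simple best-correlate lookup, and the function name, the threshold min_r = 0.9 and the toy profiles are all assumptions made for illustration.

```python
import numpy as np

def suggest_identities(expr, names, known, min_r=0.9):
    """Suggest an identity for each unknown EST from its best-correlated known gene.

    expr: genes x arrays matrix of expression values.
    names: one label per row of expr.
    known: boolean mask, True where the row already has an identity.
    min_r: minimum Pearson correlation to report (assumed threshold).
    """
    expr = np.asarray(expr, dtype=float)
    known = np.asarray(known, dtype=bool)
    r = np.corrcoef(expr)                    # Pearson r between every pair of rows
    known_idx = np.flatnonzero(known)
    suggestions = {}
    for i in np.flatnonzero(~known):
        j = known_idx[np.argmax(r[i, known_idx])]   # best-correlated known gene
        if r[i, j] >= min_r:
            suggestions[names[i]] = (names[j], float(r[i, j]))
    return suggestions

# Toy profiles across five arrays (hypothetical data).
names = ["geneA", "geneB", "est1", "est2"]
expr = [
    [1, 2, 3, 4, 5],      # geneA, known
    [5, 3, 4, 1, 2],      # geneB, known
    [3, 5, 7, 9, 11],     # est1, tracks geneA exactly
    [1, -1, 1, -1, 1],    # est2, no strong correlate
]
result = suggest_identities(expr, names, [True, True, False, False])
```

    Here est1 would be assigned the identity of geneA, while est2 is left unidentified because no known gene correlates with it above the threshold.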

  12. Heterologous Production of a Novel Cyclic Peptide Compound, KK-1, in Aspergillus oryzae.

    PubMed

    Yoshimi, Akira; Yamaguchi, Sigenari; Fujioka, Tomonori; Kawai, Kiyoshi; Gomi, Katsuya; Machida, Masayuki; Abe, Keietsu

    2018-01-01

    A novel cyclic peptide compound, KK-1, was originally isolated from the plant-pathogenic fungus Curvularia clavata. It consists of 10 amino acid residues, including five N-methylated amino acid residues, and has potent antifungal activity. Recently, the genome-sequencing analysis of C. clavata was completed, and the biosynthetic genes involved in KK-1 production were predicted by using a novel gene cluster mining tool, MIDDAS-M. These genes form an approximately 75-kb cluster, which includes nine open reading frames, containing a non-ribosomal peptide synthetase (NRPS) gene. To determine whether the predicted genes were responsible for the biosynthesis of KK-1, we performed heterologous production of KK-1 in Aspergillus oryzae by introduction of the cluster genes into the genome of A. oryzae. The NRPS gene was split in two fragments and then reconstructed in the A. oryzae genome, because the gene was quite large (approximately 40 kb). The remaining seven genes in the cluster, excluding the regulatory gene kkR, were simultaneously introduced into the strain of A. oryzae in which NRPS had already been incorporated. To evaluate the heterologous production of KK-1 in A. oryzae, gene expression was analyzed by RT-PCR and KK-1 productivity was quantified by HPLC. KK-1 was produced in variable quantities by a number of transformed strains, along with expression of the cluster genes. The amount of KK-1 produced by the strain with the greatest expression of all genes was lower than that produced by the original producer, C. clavata. Therefore, expression of the cluster genes is necessary and sufficient for the heterologous production of KK-1 in A. oryzae, although there may be unknown factors limiting productivity in this species.

  13. Heterologous Production of a Novel Cyclic Peptide Compound, KK-1, in Aspergillus oryzae

    PubMed Central

    Yoshimi, Akira; Yamaguchi, Sigenari; Fujioka, Tomonori; Kawai, Kiyoshi; Gomi, Katsuya; Machida, Masayuki; Abe, Keietsu

    2018-01-01

    A novel cyclic peptide compound, KK-1, was originally isolated from the plant-pathogenic fungus Curvularia clavata. It consists of 10 amino acid residues, including five N-methylated amino acid residues, and has potent antifungal activity. Recently, the genome-sequencing analysis of C. clavata was completed, and the biosynthetic genes involved in KK-1 production were predicted by using a novel gene cluster mining tool, MIDDAS-M. These genes form an approximately 75-kb cluster, which includes nine open reading frames, containing a non-ribosomal peptide synthetase (NRPS) gene. To determine whether the predicted genes were responsible for the biosynthesis of KK-1, we performed heterologous production of KK-1 in Aspergillus oryzae by introduction of the cluster genes into the genome of A. oryzae. The NRPS gene was split in two fragments and then reconstructed in the A. oryzae genome, because the gene was quite large (approximately 40 kb). The remaining seven genes in the cluster, excluding the regulatory gene kkR, were simultaneously introduced into the strain of A. oryzae in which NRPS had already been incorporated. To evaluate the heterologous production of KK-1 in A. oryzae, gene expression was analyzed by RT-PCR and KK-1 productivity was quantified by HPLC. KK-1 was produced in variable quantities by a number of transformed strains, along with expression of the cluster genes. The amount of KK-1 produced by the strain with the greatest expression of all genes was lower than that produced by the original producer, C. clavata. Therefore, expression of the cluster genes is necessary and sufficient for the heterologous production of KK-1 in A. oryzae, although there may be unknown factors limiting productivity in this species. PMID:29686660

  14. The biosynthetic gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor contains its co-expressed vacuolar MATE transporter

    PubMed Central

    Darbani, Behrooz; Motawia, Mohammed Saddik; Olsen, Carl Erik; Nour-Eldin, Hussam H.; Møller, Birger Lindberg; Rook, Fred

    2016-01-01

    Genomic gene clusters for the biosynthesis of chemical defence compounds are increasingly identified in plant genomes. We previously reported the independent evolution of biosynthetic gene clusters for cyanogenic glucoside biosynthesis in three plant lineages. Here we report that the gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor additionally contains a gene, SbMATE2, encoding a transporter of the multidrug and toxic compound extrusion (MATE) family, which is co-expressed with the biosynthetic genes. The predicted localisation of SbMATE2 to the vacuolar membrane was demonstrated experimentally by transient expression of a SbMATE2-YFP fusion protein and confocal microscopy. Transport studies in Xenopus laevis oocytes demonstrate that SbMATE2 is able to transport dhurrin. In addition, SbMATE2 was able to transport non-endogenous cyanogenic glucosides, but not the anthocyanin cyanidin 3-O-glucoside or the glucosinolate indol-3-yl-methyl glucosinolate. The genomic co-localisation of a transporter gene with the biosynthetic genes producing the transported compound is discussed in relation to the role self-toxicity of chemical defence compounds may play in the formation of gene clusters. PMID:27841372

  15. Multi-gene phylogenetic analysis reveals that shochu-fermenting Saccharomyces cerevisiae strains form a distinct sub-clade of the Japanese sake cluster.

    PubMed

    Futagami, Taiki; Kadooka, Chihiro; Ando, Yoshinori; Okutsu, Kayu; Yoshizaki, Yumiko; Setoguchi, Shinji; Takamine, Kazunori; Kawai, Mikihiko; Tamaki, Hisanori

    2017-10-01

    Shochu is a traditional Japanese distilled spirit. The formation of the distinguishing flavour of shochu produced in individual distilleries is attributed to putative indigenous yeast strains. In this study, we performed the first (to our knowledge) phylogenetic classification of shochu strains based on nucleotide gene sequences. We performed phylogenetic classification of 21 putative indigenous shochu yeast strains isolated from 11 distilleries. All of these strains were shown or confirmed to be Saccharomyces cerevisiae, sharing species identification with 34 known S. cerevisiae strains (including commonly used shochu, sake, ale, whisky, bakery, bioethanol and laboratory yeast strains and clinical isolate) that were tested in parallel. Our analysis used five genes that reflect genome-level phylogeny for the strain-level classification. In a first step, we demonstrated that partial regions of the ZAP1, THI7, PXL1, YRR1 and GLG1 genes were sufficient to reproduce previous sub-species classifications. In a second step, these five analysed regions from each of 25 strains (four commonly used shochu strains and the 21 putative indigenous shochu strains) were concatenated and used to generate a phylogenetic tree. Further analysis revealed that the putative indigenous shochu yeast strains form a monophyletic group that includes both the shochu yeasts and a subset of the sake group strains; this cluster is a sister group to other sake yeast strains, together comprising a sake-shochu group. Differences among shochu strains were small, suggesting that it may be possible to correlate subtle phenotypic differences among shochu flavours with specific differences in genome sequences. Copyright © 2017 John Wiley & Sons, Ltd.

  16. Characterization and Evolution of Cell Division and Cell Wall Synthesis Genes in the Bacterial Phyla Verrucomicrobia, Lentisphaerae, Chlamydiae, and Planctomycetes and Phylogenetic Comparison with rRNA Genes

    PubMed Central

    Pilhofer, Martin; Rappl, Kristina; Eckl, Christina; Bauer, Andreas Peter; Ludwig, Wolfgang; Schleifer, Karl-Heinz; Petroni, Giulio

    2008-01-01

    In the past, studies on the relationships of the bacterial phyla Planctomycetes, Chlamydiae, Lentisphaerae, and Verrucomicrobia using different phylogenetic markers have been controversial. Investigations based on 16S rRNA sequence analyses suggested a relationship of the four phyla, showing the branching order Planctomycetes, Chlamydiae, Verrucomicrobia/Lentisphaerae. Phylogenetic analyses of 23S rRNA genes in this study also support a monophyletic grouping and their branching order. This grouping is significant for understanding cell division, since the major bacterial cell division protein FtsZ is absent from members of two of the phyla, Chlamydiae and Planctomycetes. In Verrucomicrobia, knowledge about cell division is mainly restricted to the recent report of ftsZ in the closely related genera Prosthecobacter and Verrucomicrobium. In this study, genes of the conserved division and cell wall (dcw) cluster (ddl, ftsQ, ftsA, and ftsZ) were characterized in all verrucomicrobial subdivisions (1 to 4) with cultivable representatives. Sequence analyses and transcriptional analyses in Verrucomicrobia and genome data analyses in Lentisphaerae suggested that cell division is based on FtsZ in all verrucomicrobial subdivisions and possibly also in the sister phylum Lentisphaerae. Comprehensive sequence analyses of available genome data for representatives of Verrucomicrobia, Lentisphaerae, Chlamydiae, and Planctomycetes strongly indicate that their last common ancestor possessed a conserved, ancestral type of dcw gene cluster and an FtsZ-based cell division mechanism. This implies that Planctomycetes and Chlamydiae may have shifted independently to a non-FtsZ-based cell division mechanism after their separate branchings from their last common ancestor with Verrucomicrobia. PMID:18310338

  17. MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma

    PubMed Central

    Sass, Steffen; Pitea, Adriana; Unger, Kristian; Hess, Julia; Mueller, Nikola S.; Theis, Fabian J.

    2015-01-01

    MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method “miRlastic”, which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. 
Finally, the local enrichment step identified two functional clusters of miRNAs that were predicted to mediate HPV-associated dysregulation in HNSCC. Our novel approach was able to characterize distinct pathway regulations from matched miRNA and mRNA data. An R package of miRlastic was made available through: http://icb.helmholtz-muenchen.de/mirlastic. PMID:26694379
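
    The network-inference step described above (regressing a target gene's expression on candidate miRNAs with elastic net regularization, so that only supported interactions keep a non-zero weight) can be illustrated with a small sketch. This is not the miRlastic R package: the coordinate-descent solver, the penalty settings, the miRNA names and the expression values below are all made-up assumptions chosen only to show the idea.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.2, l1_ratio=0.5, n_iter=100):
    """Plain coordinate descent for the objective
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
                        + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    X is a list of sample rows; no intercept is fitted."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                other = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - other)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1.0 - l1_ratio))
    return w


# Hypothetical centred expression of three candidate miRNAs in six samples.
# In the spirit of the abstract, only miRNAs proposed by sequence-based
# target prediction (the prior knowledge) would be offered as features.
candidates = ["miR-a", "miR-b", "miR-c"]
mir_a = [-1.5, -0.5, 0.5, 1.5, -1.0, 1.0]
mir_b = [1.0, -1.0, -1.0, 1.0, 0.0, 0.0]
mir_c = [1.0, 1.0, 1.0, 1.0, -2.0, -2.0]
X = [list(row) for row in zip(mir_a, mir_b, mir_c)]

# Toy target mRNA: strongly repressed by miR-a, barely related to miR-b.
target = [-2.0 * a + 0.1 * b for a, b in zip(mir_a, mir_b)]

weights = elastic_net(X, target)
selected = [name for name, w in zip(candidates, weights) if w != 0.0]
print(selected, weights)
```

    In this toy setting the weak miR-b effect and the unrelated miR-c are shrunk to exactly zero, so only miR-a survives feature selection with a negative weight.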

  18. MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma.

    PubMed

    Sass, Steffen; Pitea, Adriana; Unger, Kristian; Hess, Julia; Mueller, Nikola S; Theis, Fabian J

    2015-12-18

    MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method "miRlastic", which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. 
Finally, the local enrichment step identified two functional clusters of miRNAs that were predicted to mediate HPV-associated dysregulation in HNSCC. Our novel approach was able to characterize distinct pathway regulations from matched miRNA and mRNA data. An R package of miRlastic was made available through: http://icb.helmholtz-muenchen.de/mirlastic.
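
    The local enrichment idea above rests on shortest paths between genes in the inferred network. A minimal sketch of that building block, assuming an unweighted, undirected network and hypothetical node names (the published procedure may weight edges or score similarity differently), is a breadth-first search:

```python
from collections import deque


def build_graph(edges):
    """Turn a list of undirected edges into an adjacency dictionary."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_lengths(graph, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


# Hypothetical miRNA-target edges; all names are illustrative only.
edges = [
    ("miR-a", "geneA"),
    ("miR-a", "geneB"),
    ("miR-b", "geneB"),
    ("miR-b", "geneC"),
    ("miR-c", "geneD"),
]
network = build_graph(edges)
dist = shortest_path_lengths(network, "geneA")
print(dist)
```

    Genes that share a regulator (geneA and geneB here) sit two edges apart, genes linked only through a chain of regulators are further away, and genes in a separate component (geneD) are never reached; a local enrichment score could then be restricted to genes within a chosen distance.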

  19. Quantitative assessment of Hox complex expression in the indirect development of the polychaete annelid Chaetopterus sp

    NASA Technical Reports Server (NTRS)

    Peterson, K. J.; Irvine, S. Q.; Cameron, R. A.; Davidson, E. H.

    2000-01-01

    A prediction from the set-aside theory of bilaterian origins is that pattern formation processes such as those controlled by the Hox cluster genes are required specifically for adult body plan formation. This prediction can be tested in animals that use maximal indirect development, in which the embryonic formation of the larva and the postembryonic formation of the adult body plan are temporally and spatially distinct. To this end, we quantitatively measured the amount of transcripts for five Hox genes in embryos of a lophotrochozoan, the polychaete annelid Chaetopterus sp. The polychaete Hox complex is shown not to be expressed during embryogenesis, but transcripts of all measured Hox complex genes are detected at significant levels during the initial stages of adult body plan formation. Temporal colinearity in the sequence of their activation is observed, so that activation follows the 3'-5' arrangement of the genes. Moreover, Hox gene expression is spatially localized to the region of teloblastic set-aside cells of the later-stage embryos. This study shows that an indirectly developing lophotrochozoan shares with an indirectly developing deuterostome, the sea urchin, a common mode of Hox complex utilization: construction of the larva, whether a trochophore or dipleurula, does not involve Hox cluster expression, but in both forms the complex is expressed in the set-aside cells from which the adult body plan derives.

  20. Recombination-Mediated Host Adaptation by Avian Staphylococcus aureus

    PubMed Central

    Murray, Susan; Pascoe, Ben; Méric, Guillaume; Mageiros, Leonardos; Yahara, Koji; Hitchings, Matthew D.; Friedmann, Yasmin; Wilkinson, Thomas S.; Gormley, Fraser J.; Mack, Dietrich; Bray, James E.; Lamble, Sarah; Bowden, Rory; Jolley, Keith A.; Maiden, Martin C.J.; Wendlandt, Sarah; Schwarz, Stefan; Corander, Jukka; Fitzgerald, J. Ross

    2017-01-01

    Staphylococcus aureus are globally disseminated among farmed chickens causing skeletal muscle infections, dermatitis, and septicaemia. The emergence of poultry-associated lineages has involved zoonotic transmission from humans to chickens but questions remain about the specific adaptations that promote proliferation of chicken pathogens. We characterized genetic variation in a population of genome-sequenced S. aureus isolates of poultry and human origin. Genealogical analysis identified a dominant poultry-associated sequence cluster within the CC5 clonal complex. Poultry and human CC5 isolates were significantly distinct from each other and more recombination events were detected in the poultry isolates. We identified 44 recombination events in 33 genes along the branch extending to the poultry-specific CC5 cluster, and 47 genes were found more often in CC5 poultry isolates compared with those from humans. Many of these gene sequences were common in chicken isolates from other clonal complexes suggesting horizontal gene transfer among poultry associated lineages. Consistent with functional predictions for putative poultry-associated genes, poultry isolates showed enhanced growth at 42 °C and greater erythrocyte lysis on chicken blood agar in comparison with human isolates. By combining phenotype information with evolutionary analyses of staphylococcal genomes, we provide evidence of adaptation, following a human-to-poultry host transition. This has important implications for the emergence and dissemination of new pathogenic clones associated with modern agriculture. PMID:28338786

  1. Overproduction of Ristomycin A by Activation of a Silent Gene Cluster in Amycolatopsis japonicum MG417-CF17

    PubMed Central

    Spohn, Marius; Kirchner, Norbert; Kulik, Andreas; Jochim, Angelika; Wolf, Felix; Muenzer, Patrick; Borst, Oliver; Gross, Harald; Wohlleben, Wolfgang

    2014-01-01

    The emergence of antibiotic-resistant pathogenic bacteria within the last decades is one reason for the urgent need for new antibacterial agents. A strategy to discover new anti-infective compounds is the evaluation of the genetic capacity of secondary metabolite producers and the activation of cryptic gene clusters (genome mining). One genus known for its potential to synthesize medically important products is Amycolatopsis. However, Amycolatopsis japonicum does not produce an antibiotic under standard laboratory conditions. In contrast to most Amycolatopsis strains, A. japonicum is genetically tractable with different methods. In order to activate a possible silent glycopeptide cluster, we introduced a gene encoding the transcriptional activator of balhimycin biosynthesis, the bbr gene from Amycolatopsis balhimycina (bbrAba), into A. japonicum. This resulted in the production of an antibiotically active compound. Following whole-genome sequencing of A. japonicum, 29 cryptic gene clusters were identified by genome mining. One of these gene clusters is a putative glycopeptide biosynthesis gene cluster. Using bioinformatic tools, ristomycin (syn. ristocetin), a type III glycopeptide, which has antibacterial activity and which is used for the diagnosis of von Willebrand disease and Bernard-Soulier syndrome, was deduced as a possible product of the gene cluster. Chemical analyses by high-performance liquid chromatography and mass spectrometry (HPLC-MS), tandem mass spectrometry (MS/MS), and nuclear magnetic resonance (NMR) spectroscopy confirmed the in silico prediction that the recombinant A. japonicum/pRM4-bbrAba synthesizes ristomycin A. PMID:25114137

  2. Clustering of two genes putatively involved in cyanate detoxification evolved recently and independently in multiple fungal lineages.

    PubMed

    Elmore, M Holly; McGary, Kriston L; Wisecaver, Jennifer H; Slot, Jason C; Geiser, David M; Sink, Stacy; O'Donnell, Kerry; Rokas, Antonis

    2015-02-06

    Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trace its evolution across Ascomycetes, and examine the evolutionary dynamics of its spread among lineages of the Fusarium oxysporum species complex (hereafter referred to as the FOSC), a cosmopolitan clade of purportedly clonal vascular wilt plant pathogens. Phylogenetic analysis of fungal cyanase and carbonic anhydrase genes reveals that the CCA gene cluster arose independently at least twice and is now present in three lineages, namely Cochliobolus lunatus, Oidiodendron maius, and the FOSC. Genome-wide surveys within the FOSC indicate that the CCA gene cluster varies in copy number across isolates, is always located on accessory chromosomes, and is absent in FOSC's closest relatives. Phylogenetic reconstruction of the CCA gene cluster in 163 FOSC strains from a wide variety of hosts suggests a recent history of rampant transfers between isolates. We hypothesize that the independent formation of the CCA gene cluster in different fungal lineages and its spread across FOSC strains may be associated with resistance to plant-produced cyanates or to use of cyanate fungicides in agriculture. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. dbWFA: a web-based database for functional annotation of Triticum aestivum transcripts

    PubMed Central

    Vincent, Jonathan; Dai, Zhanwu; Ravel, Catherine; Choulet, Frédéric; Mouzeyar, Said; Bouzidi, M. Fouad; Agier, Marie; Martre, Pierre

    2013-01-01

    The functional annotation of genes based on sequence homology with genes from model species genomes is time-consuming because it is necessary to mine several unrelated databases. The aim of the present work was to develop a functional annotation database for common wheat Triticum aestivum (L.). The database, named dbWFA, is based on the reference NCBI UniGene set, an expressed gene catalogue built by expressed sequence tag clustering, and on full-length coding sequences retrieved from the TriFLDB database. Information from good-quality heterogeneous sources, including annotations for model plant species Arabidopsis thaliana (L.) Heynh. and Oryza sativa L., was gathered and linked to T. aestivum sequences through BLAST-based homology searches. Even though the complexity of the transcriptome cannot yet be fully appreciated, we developed a tool to easily and promptly obtain information from multiple functional annotation systems (Gene Ontology, MapMan bin codes, MIPS Functional Categories, PlantCyc pathway reactions and TAIR gene families). The use of dbWFA is illustrated here with several query examples. We were able to assign a putative function to 45% of the UniGenes and 81% of the full-length coding sequences from TriFLDB. Moreover, comparison of the annotation of the whole T. aestivum UniGene set along with curated annotations of the two model species assessed the accuracy of the annotation provided by dbWFA. To further illustrate the use of dbWFA, genes specifically expressed during the early cell division or late storage polymer accumulation phases of T. aestivum grain development were identified using a clustering analysis and then annotated using dbWFA. The annotation of these two sets of genes was consistent with previous analyses of T. aestivum grain transcriptomes and proteomes. Database URL: urgi.versailles.inra.fr/dbWFA/ PMID:23660284

  4. Sequence analyses reveal that a TPR–DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR–DP domains and prokaryotic GerD proteins

    PubMed Central

    Papandreou, Nikolaos; Chomilier, Jacques

    2008-01-01

    The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR–DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR–DP domains. Electronic supplementary material: The online version of this article (doi:10.1007/s12192-008-0083-8) contains supplementary material, which is available to authorized users. PMID:18987995

  5. Gene cluster conservation provides insight into cercosporin biosynthesis and extends production to the genus Colletotrichum.

    PubMed

    de Jonge, Ronnie; Ebert, Malaika K; Huitt-Roehl, Callie R; Pal, Paramita; Suttle, Jeffrey C; Spanner, Rebecca E; Neubauer, Jonathan D; Jurick, Wayne M; Stott, Karina A; Secor, Gary A; Thomma, Bart P H J; Van de Peer, Yves; Townsend, Craig A; Bolton, Melvin D

    2018-06-12

    Species in the genus Cercospora cause economically devastating diseases in sugar beet, maize, rice, soy bean, and other major food crops. Here, we sequenced the genome of the sugar beet pathogen Cercospora beticola and found it encodes 63 putative secondary metabolite gene clusters, including the cercosporin toxin biosynthesis (CTB) cluster. We show that the CTB gene cluster has experienced multiple duplications and horizontal transfers across a spectrum of plant pathogenic fungi, including the wide-host range Colletotrichum genus as well as the rice pathogen Magnaporthe oryzae. Although cercosporin biosynthesis has been thought to rely on an eight-gene CTB cluster, our phylogenomic analysis revealed gene collinearity adjacent to the established cluster in all CTB cluster-harboring species. We demonstrate that the CTB cluster is larger than previously recognized and includes cercosporin facilitator protein, previously shown to be involved with cercosporin autoresistance, and four additional genes required for cercosporin biosynthesis, including the final pathway enzymes that install the unusual cercosporin methylenedioxy bridge. Lastly, we demonstrate production of cercosporin by Colletotrichum fioriniae, the first known cercosporin producer within this agriculturally important genus. Thus, our results provide insight into the intricate evolution and biology of a toxin critical to agriculture and broaden the production of cercosporin to another fungal genus containing many plant pathogens of important crops worldwide. Copyright © 2018 the Author(s). Published by PNAS.

  6. Bioinformatic analysis of the nucleotide binding site-encoding disease-resistance genes in foxtail millet (Setaria italica (L.) Beauv.).

    PubMed

    Zhu, Y B; Xie, X Q; Li, Z Y; Bai, H; Dong, L; Dong, Z P; Dong, J G

    2014-08-28

    The nucleotide-binding site (NBS) disease-resistance genes are the largest category of plant disease-resistance gene analogs. The complete set of disease-resistant candidate genes, which encode the NBS sequence, was filtered in the genomes of two varieties of foxtail millet (Yugu1 and 'Zhang gu'). This study investigated a number of characteristics of the putative NBS genes, such as structural diversity and phylogenetic relationships. A total of 269 and 281 NBS-coding sequences were identified in Yugu1 and 'Zhang gu', respectively. When the two databases were compared, 72 genes were found to be identical and 164 genes showed more than 90% similarity. Physical positioning and gene family analysis of the NBS disease-resistance genes in the genome revealed that the number of genes on each chromosome was similar in both varieties. The eighth chromosome contained the largest number of genes and the ninth chromosome contained the lowest number of genes. Exactly 34 gene clusters containing the 161 genes were found in the Yugu1 genome, with each cluster containing 4.7 genes on average. In comparison, the 'Zhang gu' genome possessed 28 gene clusters, which had 151 genes, with an average of 5.4 genes in each cluster. The largest gene cluster, located on the eighth chromosome, contained 12 genes in the Yugu1 database, whereas it contained 16 genes in the 'Zhang gu' database. The classification results showed that the CC-NBS-LRR gene made up the largest part of each chromosome in the two databases. Two TIR-NBS genes were also found in the Yugu1 genome.
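
    The per-cluster averages quoted above follow directly from the reported counts of clusters and clustered genes. A short arithmetic check, using only the figures given in the abstract (the dictionary layout is just an illustrative choice):

```python
# Counts reported in the abstract for the two foxtail millet varieties.
reported = {
    "Yugu1": {"clusters": 34, "clustered_genes": 161},
    "Zhang gu": {"clusters": 28, "clustered_genes": 151},
}

# Mean number of NBS genes per cluster, rounded to one decimal place.
mean_genes = {
    variety: round(counts["clustered_genes"] / counts["clusters"], 1)
    for variety, counts in reported.items()
}
print(mean_genes)
```

    The rounded means agree with the 4.7 and 5.4 genes per cluster stated in the text.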

  7. SPINE: SParse eIgengene NEtwork linking gene expression clusters in Dehalococcoides mccartyi to perturbations in experimental conditions

    DOE PAGES

    Mansfeldt, Cresten B.; Logsdon, Benjamin A.; Debs, Garrett E.; ...

    2015-02-25

    We present a statistical model designed to identify the effect of experimental perturbations on the aggregate behavior of the transcriptome expressed by the bacterium Dehalococcoides mccartyi strain 195. Strains of Dehalococcoides are used in sub-surface bioremediation applications because they organohalorespire tetrachloroethene and trichloroethene (common chlorinated solvents that contaminate the environment) to non-toxic ethene. However, the biochemical mechanism of this process remains incompletely described. Additionally, the response of Dehalococcoides to stress-inducing conditions that may be encountered at field-sites is not well understood. The constructed statistical model captured the aggregate behavior of gene expression phenotypes by modeling the distinct eigengenes of 100 transcript clusters, determining stable relationships among these clusters of gene transcripts with a sparse network-inference algorithm, and directly modeling the effect of changes in experimental conditions by constructing networks conditioned on the experimental state. Based on the model predictions, we discovered new response mechanisms for DMC, notably when the bacterium is exposed to solvent toxicity. The network identified a cluster containing thirteen gene transcripts directly connected to the solvent toxicity condition. Transcripts in this cluster include an iron-dependent regulator (DET0096-97) and a methylglyoxal synthase (DET0137). To validate these predictions, additional experiments were performed. Continuously fed cultures were exposed to saturating levels of tetrachloroethene, thereby causing solvent toxicity, and transcripts that were predicted to be linked to solvent toxicity were monitored by quantitative reverse-transcription polymerase chain reaction. Twelve hours after being shocked with saturating levels of tetrachloroethene, the control transcripts (encoding for a key hydrogenase and the 16S rRNA) did not significantly change. By contrast, transcripts for DET0137 and DET0097 displayed a 46.8±11.5 and 14.6±9.3 fold up-regulation, respectively, supporting the model. This is the first study to identify transcripts in Dehalococcoides that potentially respond to tetrachloroethene solvent-toxicity conditions that may be encountered near contamination source zones in sub-surface environments.

  8. Acquisition and evolution of plant pathogenesis-associated gene clusters and candidate determinants of tissue-specificity in Xanthomonas.

    PubMed

    Lu, Hong; Patil, Prabhu; Van Sluys, Marie-Anne; White, Frank F; Ryan, Robert P; Dow, J Maxwell; Rabinowicz, Pablo; Salzberg, Steven L; Leach, Jan E; Sonti, Ramesh; Brendel, Volker; Bogdanove, Adam J

    2008-01-01

    Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. 
Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small number of genes or in non-coding sequences, and/or differences outside the clusters, potentially among regulatory targets or secretory substrates.

  9. The emergence of overlapping scale-free genetic architecture in digital organisms.

    PubMed

    Gerlee, P; Lundh, T

    2008-01-01

    We have studied the evolution of genetic architecture in digital organisms and found that the gene overlap follows a scale-free distribution, which is commonly found in metabolic networks of many organisms. Our results show that the slope of the scale-free distribution depends on the mutation rate and that the gene development is driven by expansion of already existing genes, which is in direct correspondence to the preferential growth algorithm that gives rise to scale-free networks. To further validate our results we have constructed a simple model of gene development, which recapitulates the results from the evolutionary process and shows that the mutation rate affects the tendency of genes to cluster. In addition we could relate the slope of the scale-free distribution to the genetic complexity of the organisms and show that a high mutation rate gives rise to a more complex genetic architecture.
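
    The abstract compares gene growth to the preferential growth algorithm behind scale-free networks. As a generic illustration of that algorithm only (not the authors' digital-organism model or their gene-development model), the sketch below grows a network in which each new node attaches to an existing node with probability proportional to its degree. The function name, the one-edge-per-node rule and the seed are assumptions.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a network node by node; each new node links to one existing node
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}  # start from two nodes joined by one edge
    endpoints = [0, 1]     # every node appears once per edge it touches
    for new_node in range(2, n_nodes):
        # Sampling uniformly from the endpoint list is degree-proportional.
        target = rng.choice(endpoints)
        degree[target] += 1
        degree[new_node] = 1
        endpoints.extend([target, new_node])
    return degree


degree = preferential_attachment(2000)
histogram = Counter(degree.values())
for k in sorted(histogram)[:5]:
    print(k, histogram[k])
```

    Because well-connected nodes are more likely to gain further links, most nodes stay at low degree while a few hubs emerge, which is the heavy-tailed pattern that the term scale-free refers to.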

  10. Metabolic and spatio-taxonomic response of uncultivated seafloor bacteria following the Deepwater Horizon oil spill

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Handley, K. M.; Piceno, Y. M.; Hu, P.

    The release of 700 million liters of oil into the Gulf of Mexico over a few months in 2010 produced dramatic changes in the microbial ecology of the water and sediment. Here, we reconstructed the genomes of 57 widespread uncultivated bacteria from post-spill deep-sea sediments, and recovered their gene expression pattern across the seafloor. These genomes comprised a common collection of bacteria that were enriched in heavily affected sediments around the wellhead. Although rare in distal sediments, some members were still detectable at sites up to 60 km away. Many of these genomes exhibited phylogenetic clustering indicative of common trait selection by the environment, and within half we identified 264 genes associated with hydrocarbon degradation. Alkane degradation ability was near ubiquitous among candidate hydrocarbon degraders, whereas just three harbored elaborate gene inventories for the degradation of alkanes and aromatic and polycyclic aromatic hydrocarbons (PAHs). Differential gene expression profiles revealed a spill-promoted microbial sulfur cycle alongside gene upregulation associated with PAH degradation. Gene expression associated with alkane degradation was widespread, although active alkane degrader identities changed along the pollution gradient. Analyses suggest that a broad metabolic capacity to respond to oil inputs exists across a large array of usually rare indigenous deep-sea bacteria.

  11. [Chromosomal large fragment deletion induced by CRISPR/Cas9 gene editing system].

    PubMed

    Cheng, L H; Liu, Y; Niu, T

    2017-05-14

    Objective: To use CRISPR-Cas9 gene editing technology to co-delete several genes located on the same chromosome. Methods: A CRISPR-Cas9 lentiviral plasmid that could induce deletion of the Aloxe3-Alox12b-Alox8 cluster genes located on mouse chromosome 11B3 was constructed by molecular cloning. HEK293T cells were transfected to package lentivirus of CRISPR or Cas9 cDNA, then mouse NIH3T3 cells were infected with the lentivirus and genomic DNA of these cells was extracted. The deleted fragment was amplified by PCR, and TA cloning, Sanger sequencing and other techniques were used to confirm the deletion of the Aloxe3-Alox12b-Alox8 cluster genes. Results: The CRISPR-Cas9 lentiviral plasmid, which could induce deletion of the Aloxe3-Alox12b-Alox8 cluster genes, was successfully constructed. Deletion of the target chromosome fragment (Aloxe3-Alox12b-Alox8 cluster genes) was verified by PCR. The deletion was confirmed by TA cloning and Sanger sequencing; the breakpoint junctions of the CRISPR-Cas9-mediated cutting events were accurately rejoined, and no insertion mutation occurred between the two cleavage sites. Conclusion: Large fragment deletion of the Aloxe3-Alox12b-Alox8 cluster genes located on mouse chromosome 11B3 was successfully induced by the CRISPR-Cas9 gene editing system.

  12. Genomics-driven discovery of the pneumocandin biosynthetic gene cluster in the fungus Glarea lozoyensis

    PubMed Central

    2013-01-01

    Background The antifungal therapy caspofungin is a semi-synthetic derivative of pneumocandin B0, a lipohexapeptide produced by the fungus Glarea lozoyensis, and was the first member of the echinocandin class approved for human therapy. The nonribosomal peptide synthetase (NRPS)-polyketide synthases (PKS) gene cluster responsible for pneumocandin biosynthesis from G. lozoyensis has not been elucidated to date. In this study, we report the elucidation of the pneumocandin biosynthetic gene cluster by whole genome sequencing of the G. lozoyensis wild-type strain ATCC 20868. Results The pneumocandin biosynthetic gene cluster contains a NRPS (GLNRPS4) and a PKS (GLPKS4) arranged in tandem, two cytochrome P450 monooxygenases, seven other modifying enzymes, and genes for L-homotyrosine biosynthesis, a component of the peptide core. Thus, the pneumocandin biosynthetic gene cluster is significantly more autonomous and organized than that of the recently characterized echinocandin B gene cluster. Disruption mutants of GLNRPS4 and GLPKS4 no longer produced the pneumocandins (A0 and B0), and the Δglnrps4 and Δglpks4 mutants lost antifungal activity against the human pathogenic fungus Candida albicans. In addition to pneumocandins, the G. lozoyensis genome encodes a rich repertoire of natural product-encoding genes including 24 PKSs, six NRPSs, five PKS-NRPS hybrids, two dimethylallyl tryptophan synthases, and 14 terpene synthases. Conclusions Characterization of the gene cluster provides a blueprint for engineering new pneumocandin derivatives with improved pharmacological properties. Whole genome estimation of the secondary metabolite-encoding genes from G. lozoyensis provides yet another example of the huge potential for drug discovery from natural products from the fungal kingdom. PMID:23688303

  13. Gene/QTL discovery for Anthracnose in common bean (Phaseolus vulgaris L.) from North-western Himalayas

    PubMed Central

    Choudhary, Neeraj; Bawa, Vanya; Paliwal, Rajneesh; Singh, Bikram; Bhat, Mohd. Ashraf; Mir, Javid Iqbal; Gupta, Moni; Sofi, Parvaze A.; Thudi, Mahendar; Varshney, Rajeev K.

    2018-01-01

    Common bean (Phaseolus vulgaris L.) is one of the most important grain legume crops in the world. The beans grown in the north-western Himalayas possess huge diversity in seed color, shape and size but are mostly susceptible to Anthracnose disease caused by the seed-borne fungus Colletotrichum lindemuthianum. Dozens of QTLs/genes have already been identified for this disease in common bean world-wide. However, this is the first report of gene/QTL discovery for Anthracnose using bean germplasm from the north-western Himalayas of the state of Jammu & Kashmir, India. A core set of 96 bean lines comprising 54 indigenous local landraces from 11 hot-spots and 42 exotic lines from 10 different countries was phenotyped at two locations (SKUAST-Jammu and Bhaderwah, Jammu) for Anthracnose resistance. The core set was also genotyped with genome-wide (91) random and trait-linked SSR markers. The study of marker-trait associations (MTAs) led to the identification of 10 QTLs/genes for Anthracnose resistance. Among the 10 QTLs/genes identified, two MTAs are stable (BM45 & BM211), two MTAs (PVctt1 & BM211) are major, explaining more than 20% of the phenotypic variation for Anthracnose, and one MTA (BM211) is both stable and major. Six (06) genomic regions are reported for the first time, whereas four (04) genomic regions validated the already known QTL/gene regions/clusters for Anthracnose. The major, stable and validated markers reported during the present study associated with Anthracnose resistance will prove useful in common bean molecular breeding programs aimed at enhancing Anthracnose resistance of local bean landraces grown in the north-western Himalayas of the state of Jammu and Kashmir. PMID:29389971

  14. Gene/QTL discovery for Anthracnose in common bean (Phaseolus vulgaris L.) from North-western Himalayas.

    PubMed

    Choudhary, Neeraj; Bawa, Vanya; Paliwal, Rajneesh; Singh, Bikram; Bhat, Mohd Ashraf; Mir, Javid Iqbal; Gupta, Moni; Sofi, Parvaze A; Thudi, Mahendar; Varshney, Rajeev K; Mir, Reyazul Rouf

    2018-01-01

    Common bean (Phaseolus vulgaris L.) is one of the most important grain legume crops in the world. The beans grown in the north-western Himalayas possess huge diversity in seed color, shape and size but are mostly susceptible to Anthracnose disease caused by the seed-borne fungus Colletotrichum lindemuthianum. Dozens of QTLs/genes have already been identified for this disease in common bean world-wide. However, this is the first report of gene/QTL discovery for Anthracnose using bean germplasm from the north-western Himalayas of the state of Jammu & Kashmir, India. A core set of 96 bean lines comprising 54 indigenous local landraces from 11 hot-spots and 42 exotic lines from 10 different countries was phenotyped at two locations (SKUAST-Jammu and Bhaderwah, Jammu) for Anthracnose resistance. The core set was also genotyped with genome-wide (91) random and trait-linked SSR markers. The study of marker-trait associations (MTAs) led to the identification of 10 QTLs/genes for Anthracnose resistance. Among the 10 QTLs/genes identified, two MTAs are stable (BM45 & BM211), two MTAs (PVctt1 & BM211) are major, explaining more than 20% of the phenotypic variation for Anthracnose, and one MTA (BM211) is both stable and major. Six (06) genomic regions are reported for the first time, whereas four (04) genomic regions validated the already known QTL/gene regions/clusters for Anthracnose. The major, stable and validated markers reported during the present study associated with Anthracnose resistance will prove useful in common bean molecular breeding programs aimed at enhancing Anthracnose resistance of local bean landraces grown in the north-western Himalayas of the state of Jammu and Kashmir.

  15. Clustering of two genes putatively involved in cyanate detoxification evolved recently and independently in multiple fungal lineages

    USDA-ARS's Scientific Manuscript database

    Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trac...

  16. Network Analysis Reveals a Common Host-Pathogen Interaction Pattern in Arabidopsis Immune Responses.

    PubMed

    Li, Hong; Zhou, Yuan; Zhang, Ziding

    2017-01-01

    Many plant pathogens secrete virulence effectors into host cells to target important proteins in the host cellular network. However, the dynamic interactions between effectors and the host cellular network have not been fully understood. Here, an integrative network analysis was conducted by combining the Arabidopsis thaliana protein-protein interaction network, known targets of Pseudomonas syringae and Hyaloperonospora arabidopsidis effectors, and gene expression profiles in the immune response. In particular, we focused on the characteristic network topology of the effector targets and differentially expressed genes (DEGs). We found that effectors tended to manipulate key network positions with higher betweenness centrality. The effector targets, especially those that are common targets of an individual effector, tended to be clustered together in the network. Moreover, the distances between the effector targets and DEGs increased over time during infection. In line with this observation, pathogen-susceptible mutants tended to have more DEGs surrounding the effector targets compared with resistant mutants. Our results suggest a common plant-pathogen interaction pattern at the cellular network level, where pathogens employ a potent local impact mode to interfere with key positions in the host network, and the plant organizes an in-depth defense by sequentially activating genes distal to the effector targets.
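    Betweenness centrality, the topological measure this abstract relies on, counts how often a node lies on shortest paths between other nodes. The following is a minimal sketch using Brandes' algorithm on an unweighted, undirected graph; the toy network and node names are hypothetical and are not the Arabidopsis interaction data, and the abstract does not state which implementation the authors used.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph given
    as {node: set_of_neighbors} (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in score.items()}


# Hypothetical toy network: a hub "H" bridging two small groups.
toy = {
    "A": {"B", "H"}, "B": {"A", "H"},
    "H": {"A", "B", "C", "D"},
    "C": {"H", "D"}, "D": {"H", "C"},
}
centrality = betweenness_centrality(toy)
```

    In this toy graph every shortest path between the two groups passes through H, so H is the only node with non-zero betweenness, which is the kind of "key network position" the abstract describes.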

  17. Function and Regulation of the Formate Dehydrogenase Genes of the Methanogenic Archaeon Methanococcus maripaludis

    PubMed Central

    Wood, Gwendolyn E.; Haydock, Andrew K.; Leigh, John A.

    2003-01-01

    Methanococcus maripaludis is a mesophilic species of Archaea capable of producing methane from two substrates: hydrogen plus carbon dioxide and formate. To study the latter, we identified the formate dehydrogenase genes of M. maripaludis and found that the genome contains two gene clusters important for formate utilization. Phylogenetic analysis suggested that the two formate dehydrogenase gene sets arose from duplication events within the methanococcal lineage. The first gene cluster encodes homologs of formate dehydrogenase α (FdhA) and β (FdhB) subunits and a putative formate transporter (FdhC) as well as a carbonic anhydrase analog. The second gene cluster encodes only FdhA and FdhB homologs. Mutants lacking either fdhA gene exhibited a partial growth defect on formate, whereas a double mutant was completely unable to grow on formate as a sole methanogenic substrate. Investigation of fdh gene expression revealed that transcription of both gene clusters is controlled by the presence of H2 and not by the presence of formate. PMID:12670979

  18. Genetic and serological typing of European infectious haematopoietic necrosis virus (IHNV) isolates

    USGS Publications Warehouse

    Johansson, T.; Einer-Jensen, K.; Batts, W.; Ahrens, P.; Bjorkblom, C.; Kurath, G.; Bjorklund, H.; Lorenzen, N.

    2009-01-01

    Infectious haematopoietic necrosis virus (IHNV) causes the lethal disease infectious haematopoietic necrosis (IHN) in juvenile salmon and trout. The nucleocapsid (N) protein gene and partial glycoprotein (G) gene (nucleotides 457 to 1061) of the European isolates IT-217A, FR-32/87, DE-DF 13/98 11621, DE-DF 4/99-8/99, AU-9695338 and RU-FR1 were sequenced and compared with IHNV isolates from the North American genogroups U, M and L. In phylogenetic studies the N gene of the Italian, French, German and Austrian isolates clustered in the M genogroup, though in a different subgroup than the isolates from the USA. Analyses of the partial G gene of these European isolates clustered them in the M genogroup close to the root while the Russian isolate clustered in the U genogroup. The European isolates together with US-WRAC and US-Col-80 were also tested in an enzyme-linked immunosorbent assay (ELISA) using monoclonal antibodies (MAbs) against the N protein. MAbs 136-1 and 136-3 reacted equally at all concentrations with the isolates tested, indicating that these antibodies identify a common epitope. MAb 34D3 separated the M and L genogroup isolates from the U genogroup isolate. MAb 1DW14D divided the European isolates into 2 groups. MAb 1DW14D reacted more strongly with DE-DF 13/98 11621 and RU-FR1 than with IT-217A, FR-32/87, DE-DF 4/99-8/99 and AU-9695338. In the phylogenetic studies, the Italian, French, German and Austrian isolates clustered in the M genogroup, whereas in the serological studies using MAbs, the European M genogroup isolates could not be placed in the same specific group. These results indicate that genotypic and serotypic classification do not correlate. © 2009 Inter-Research.

  19. Reconsideration of the seven discrete typing units within the species Trypanosoma cruzi, a new proposal of three reliable mitochondrial clades.

    PubMed

    Barnabé, Christian; Mobarec, Hugo Ignacio; Jurado, Marcelo Roman; Cortez, Jacqueline Andrea; Brenière, Simone Frédérique

    2016-04-01

    It is generally acknowledged that Trypanosoma cruzi, responsible for Chagas disease, is structured into six or seven distinct discrete typing units (DTUs), and termed TcI through TcVI and TcBat for the seventh, by a collective of researchers. However, such structuring can be validated only when the species is analyzed over its entire distribution area with the same genetic markers. Many works have dealt with several DTUs in limited areas, generally one country, others have dealt with only one DTU over the endemic area, but no work has reported data of all DTUs over the entire endemic area. Hence, the aim of this minireview was to analyze three gene sequences, already deposited in GenBank by others, over the entire geographical distribution of Chagas disease. Two mitochondrial (CytB and COII) and one nuclear gene (Gpi) were selected (i) among those most widely used in the field, (ii) of single copy for the nuclear one, and (iii) presenting common sequences of sufficient size for applying phylogenetic tools. They were analyzed using maximum likelihood trees and phylogenetic networks. Remarkably, only three significant clusters instead of seven were found with the mitochondrial genes. With the nuclear gene, surprisingly, all seven expected clusters did not have significant bootstrap values. Moreover, DTUs TcV and TcVI were indistinguishable as were TcIII and TcIV. Additionally, we have undertaken a minireview of seventy-five publications presenting phylogenetic trees with identifiable DTUs that allowed us, together with our own results, to seriously question the structuring of T. cruzi into six or seven separated DTUs. We propose that mitochondrial typing in three clusters currently named mtTcI, mtTcII, and mtTcIII is robust whereas nuclear typing may lead to a questionable clustering but it is valuable for detecting mitochondrial introgression, heterozygous states and allelic composition. Copyright © 2016 Elsevier B.V. All rights reserved.

  20. Innate responses to gene knockouts impact overlapping gene networks and vary with respect to resistance to viral infection.

    PubMed

    Liu, Yonghong; Liu, Yuanyuan; Wu, Jiaming; Roizman, Bernard; Zhou, Grace Guoying

    2018-04-03

    Analyses of the levels of mRNAs encoding IFIT1, IFI16, RIG-1, MDA5, CXCL10, LGP2, PUM1, LSD1, STING, and IFNβ in cell lines from which the gene encoding LGP2, LSD1, PML, HDAC4, IFI16, PUM1, STING, MDA5, IRF3, or HDAC 1 had been knocked out, as well as the ability of these cell lines to support the replication of HSV-1, revealed the following: (i) Cell lines lacking the gene encoding LGP2, PML, or HDAC4 (cluster 1) exhibited increased levels of expression of partially overlapping gene networks. Concurrently, these cell lines produced 5-fold to 12-fold lower yields of HSV-1 than the parental cells. (ii) Cell lines lacking the genes encoding STING, LSD1, MDA5, IRF3, or HDAC 1 (cluster 2) exhibited decreased levels of mRNAs of partially overlapping gene networks. Concurrently, these cell lines produced virus yields that did not differ from those produced by the parental cell line. The genes up-regulated in cell lines forming cluster 1 overlapped in part with genes down-regulated in cluster 2. The key conclusions are that gene knockouts and subsequent selection for growth cause changes in expression of multiple genes, and hence the phenotype of the cell lines cannot be ascribed to a single gene; the patterns of gene expression may be shared by multiple knockouts; and the enhanced immunity to viral replication by cluster 1 knockout cell lines but not by cluster 2 cell lines suggests that in parental cells, the expression of innate resistance to infection is specifically repressed.

  1. CRAWview: for viewing splicing variation, gene families, and polymorphism in clusters of ESTs and full-length sequences.

    PubMed

    Chou, A; Burke, J

    1999-05-01

    DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. We present CRAWview, a Java-based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color-coded index. The reporting format of CRAWview gives a brief, high-level summary report to display overlap and divergence within clusters of sequences as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).

  2. Identification of an unusual type II thioesterase in the dithiolopyrrolone antibiotics biosynthetic pathway.

    PubMed

    Zhai, Ying; Bai, Silei; Liu, Jingjing; Yang, Liyuan; Han, Li; Huang, Xueshi; He, Jing

    2016-04-22

    Dithiolopyrrolone group antibiotics characterized by an electronically unique dithiolopyrrolone heterobicyclic core are known for their antibacterial, antifungal, insecticidal and antitumor activities. Recently the biosynthetic gene clusters for two dithiolopyrrolone compounds, holomycin and thiomarinol, have been identified respectively in different bacterial species. Here, we report a novel dithiolopyrrolone biosynthetic gene cluster (aut) isolated from Streptomyces thioluteus DSM 40027 which produces two pyrrothine derivatives, aureothricin and thiolutin. By comparison with other characterized dithiolopyrrolone clusters, eight genes in the aut cluster were verified to be responsible for the assembly of dithiolopyrrolone core. The aut cluster was further confirmed by heterologous expression and in-frame gene deletion experiments. Intriguingly, we found that the heterogenetic thioesterase HlmK derived from the holomycin (hlm) gene cluster in Streptomyces clavuligerus significantly improved heterologous biosynthesis of dithiolopyrrolones in Streptomyces albus through coexpression with the aut cluster. In the previous studies, HlmK was considered invalid because it has a Ser to Gly point mutation within the canonical Ser-His-Asp catalytic triad of thioesterases. However, gene inactivation and complementation experiments in our study unequivocally demonstrated that HlmK is an active distinctive type II thioesterase that plays a beneficial role in dithiolopyrrolone biosynthesis. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Post-genome research on the biosynthesis of ergot alkaloids.

    PubMed

    Li, Shu-Ming; Unsöld, Inge A

    2006-10-01

    Genome sequencing provides new opportunities and challenges for identifying genes for the biosynthesis of secondary metabolites. A putative biosynthetic gene cluster of fumigaclavine C, an ergot alkaloid of the clavine type, was identified in the genome sequence of Aspergillus fumigatus by a bioinformatic approach. This cluster spans 22 kb of genomic DNA and comprises at least 11 open reading frames (ORFs). Seven of them are orthologous to genes from the biosynthetic gene cluster of ergot alkaloids in Claviceps purpurea. Experimental evidence of the identified cluster was provided by heterologous expression and biochemical characterization of two ORFs, FgaPT1 and FgaPT2, in the cluster of A. fumigatus, which show remarkable similarities to dimethylallyltryptophan synthase from C. purpurea and function as prenyltransferases. FgaPT2 converts L-tryptophan to dimethylallyltryptophan and thereby catalyzes the first step of ergot alkaloid biosynthesis, whilst FgaPT1 catalyzes the last step of the fumigaclavine C biosynthesis, i.e., the prenylation of fumigaclavine A at C-2 position of the indole nucleus. In addition to information obtained from the gene cluster of ergot alkaloids from C. purpurea, the identification of the biosynthetic gene cluster of fumigaclavine C in A. fumigatus opens an alternative way to study the biosynthesis of ergot alkaloids in fungi.

  4. Fast clustering using adaptive density peak detection.

    PubMed

    Wang, Xiao-Feng; Xu, Yifan

    2017-12-01

    Common limitations of clustering methods include slow algorithm convergence, instability with respect to the pre-specification of a number of intrinsic parameters, and lack of robustness to outliers. A recent clustering approach proposed a fast search algorithm for cluster centers based on their local densities. However, the selection of the key intrinsic parameters in the algorithm was not systematically investigated. It is relatively difficult to estimate the "optimal" parameters since the original definition of the local density in the algorithm is based on a truncated counting measure. In this paper, we propose a clustering procedure with adaptive density peak detection, where the local density is estimated through nonparametric multivariate kernel estimation. The model parameter can then be calculated from equations with statistical theoretical justification. We also develop an automatic cluster centroid selection method that maximizes an average silhouette index. The advantages and flexibility of the proposed method are demonstrated through simulation studies and the analysis of a few benchmark gene expression data sets. The method runs in a single step without any iteration and is thus fast, with great potential for big data analysis. A user-friendly R package, ADPclust, is available for public use.
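    The density-peak idea summarized in this abstract can be sketched in a few lines: estimate a local density for every point, measure each point's distance to the nearest denser point, and treat points that score high on both as cluster centers. The sketch below is not the ADPclust package and does not reproduce its adaptive bandwidth or silhouette-based center selection; the Gaussian kernel, the fixed bandwidth and the hand-picked n_clusters are assumptions made to keep the example short.

```python
import math


def density_peak_clusters(points, bandwidth, n_clusters):
    """Cluster points by density peaks: centers have high local density
    (rho) and lie far from any denser point (delta).
    bandwidth and n_clusters are fixed by hand here (an assumption)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Gaussian kernel density estimate, up to a constant factor.
    rho = [
        sum(math.exp(-(d / bandwidth) ** 2 / 2.0) for d in row)
        for row in dist
    ]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    parent = [-1] * n  # nearest point with higher density
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # convention for the global peak
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    # Centers: the n_clusters largest values of rho * delta. The sort is
    # stable, so the global peak (first in order) is always selected.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_clusters]
    labels = [-1] * n
    for label, c in enumerate(centers):
        labels[c] = label
    # Every other point inherits the label of its nearest denser point,
    # which has already been labeled because of the visiting order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Hypothetical data: two well-separated square blobs with a middle point.
blob_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
blob_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels, centers = density_peak_clusters(blob_a + blob_b, 1.0, 2)
```

    The assignment step is a single pass with no iterative refinement, which matches the abstract's point that the procedure needs only one step.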

  5. A comprehensive analysis of Helicobacter pylori plasticity zones reveals that they are integrating conjugative elements with intermediate integration specificity.

    PubMed

    Fischer, Wolfgang; Breithaupt, Ute; Kern, Beate; Smith, Stella I; Spicher, Carolin; Haas, Rainer

    2014-04-27

    The human gastric pathogen Helicobacter pylori is a paradigm for chronic bacterial infections. Its persistence in the stomach mucosa is facilitated by several mechanisms of immune evasion and immune modulation, but also by an unusual genetic variability which might account for the capability to adapt to changing environmental conditions during long-term colonization. This variability is reflected by the fact that almost each infected individual is colonized by a genetically unique strain. Strain-specific genes are dispersed throughout the genome, but clusters of genes organized as genomic islands may also collectively be present or absent. We have comparatively analysed such clusters, which are commonly termed plasticity zones, in a high number of H. pylori strains of varying geographical origin. We show that these regions contain fixed gene sets, rather than being true regions of genome plasticity, but two different types and several subtypes with partly diverging gene content can be distinguished. Their genetic diversity is incongruent with variations in the rest of the genome, suggesting that they are subject to horizontal gene transfer within H. pylori populations. We identified 40 distinct integration sites in 45 genome sequences, with a conserved heptanucleotide motif that seems to be the minimal requirement for integration. The significant number of possible integration sites, together with the requirement for a short conserved integration motif and the high level of gene conservation, indicates that these elements are best described as integrating conjugative elements (ICEs) with an intermediate integration site specificity.

  6. Genomics of Sponge-Associated Streptomyces spp. Closely Related to Streptomyces albus J1074: Insights into Marine Adaptation and Secondary Metabolite Biosynthesis Potential

    PubMed Central

    Ian, Elena; Malko, Dmitry B.; Sekurova, Olga N.; Bredholt, Harald; Rückert, Christian; Borisova, Marina E.; Albersmeier, Andreas; Kalinowski, Jörn; Gelfand, Mikhail S.; Zotchev, Sergey B.

    2014-01-01

    A total of 74 actinomycete isolates were cultivated from two marine sponges, Geodia barretti and Phakellia ventilabrum, collected at the same spot at the bottom of the Trondheim fjord (Norway). Phylogenetic analyses of sponge-associated actinomycetes based on the 16S rRNA gene sequences demonstrated the presence of species belonging to the genera Streptomyces, Nocardiopsis, Rhodococcus, Pseudonocardia and Micromonospora. Most isolates required sea water for growth, suggesting that they are adapted to the marine environment. Phylogenetic analysis of Streptomyces spp. revealed two isolates that originated from different sponges and had 99.7% identity in their 16S rRNA gene sequences, indicating that they represent very closely related strains. Sequencing, annotation, and analyses of the genomes of these Streptomyces isolates demonstrated that they are sister organisms closely related to terrestrial Streptomyces albus J1074. Unlike S. albus J1074, the two sponge streptomycetes grew and differentiated faster on the medium containing sea water. Comparative genomics revealed several genes presumably responsible for partial marine adaptation of these isolates. Genome mining targeting secondary metabolite biosynthesis gene clusters identified several that were not present in S. albus J1074 and are likely to have been retained from a common ancestor or acquired from other actinomycetes. Certain genes and gene clusters were shown to be differentially acquired or lost, supporting the hypothesis of divergent evolution of the two Streptomyces species in different sponge hosts. PMID:24819608

  7. Genomics of sponge-associated Streptomyces spp. closely related to Streptomyces albus J1074: insights into marine adaptation and secondary metabolite biosynthesis potential.

    PubMed

    Ian, Elena; Malko, Dmitry B; Sekurova, Olga N; Bredholt, Harald; Rückert, Christian; Borisova, Marina E; Albersmeier, Andreas; Kalinowski, Jörn; Gelfand, Mikhail S; Zotchev, Sergey B

    2014-01-01

    A total of 74 actinomycete isolates were cultivated from two marine sponges, Geodia barretti and Phakellia ventilabrum, collected at the same spot at the bottom of the Trondheim fjord (Norway). Phylogenetic analyses of sponge-associated actinomycetes based on the 16S rRNA gene sequences demonstrated the presence of species belonging to the genera Streptomyces, Nocardiopsis, Rhodococcus, Pseudonocardia and Micromonospora. Most isolates required sea water for growth, suggesting that they are adapted to the marine environment. Phylogenetic analysis of Streptomyces spp. revealed two isolates that originated from different sponges and had 99.7% identity in their 16S rRNA gene sequences, indicating that they represent very closely related strains. Sequencing, annotation, and analyses of the genomes of these Streptomyces isolates demonstrated that they are sister organisms closely related to terrestrial Streptomyces albus J1074. Unlike S. albus J1074, the two sponge streptomycetes grew and differentiated faster on the medium containing sea water. Comparative genomics revealed several genes presumably responsible for partial marine adaptation of these isolates. Genome mining targeting secondary metabolite biosynthesis gene clusters identified several that were not present in S. albus J1074 and are likely to have been retained from a common ancestor or acquired from other actinomycetes. Certain genes and gene clusters were shown to be differentially acquired or lost, supporting the hypothesis of divergent evolution of the two Streptomyces species in different sponge hosts.

  8. Statistical indicators of collective behavior and functional clusters in gene networks of yeast

    NASA Astrophysics Data System (ADS)

    Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

    2006-03-01

    We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
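    The network construction described in this abstract (correlation coefficients, giant components, minimum spanning trees) can be sketched as follows. The threshold on |r|, the distance 1 - |r| used for the spanning tree, and the toy expression profiles are all assumptions for illustration; the abstract does not give the authors' exact definitions, and the q-exponential and percolation analyses are not covered here.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def find(parent, v):
    """Union-find root lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def correlation_network(profiles, threshold):
    """Return (giant component size, spanning tree edges) for a gene
    network built from pairwise |Pearson r| values."""
    genes = sorted(profiles)
    weights = {
        (g, h): abs(pearson(profiles[g], profiles[h]))
        for g, h in combinations(genes, 2)
    }
    # Giant component: link genes whose |r| reaches the threshold.
    parent = {g: g for g in genes}
    for (g, h), w in weights.items():
        if w >= threshold:
            parent[find(parent, g)] = find(parent, h)
    sizes = {}
    for g in genes:
        root = find(parent, g)
        sizes[root] = sizes.get(root, 0) + 1
    giant = max(sizes.values())
    # Minimum spanning tree (Kruskal) on the distance 1 - |r|.
    parent = {g: g for g in genes}
    mst = []
    for (g, h), w in sorted(weights.items(), key=lambda kv: 1.0 - kv[1]):
        rg, rh = find(parent, g), find(parent, h)
        if rg != rh:
            parent[rg] = rh
            mst.append((g, h))
    return giant, mst


# Hypothetical expression time series for five genes.
profiles = {
    "g1": [1, 2, 3, 4, 5, 6],
    "g2": [2, 4, 6, 8, 10, 12],
    "g3": [6, 5, 4, 3, 2, 1],
    "g4": [1, -1, 1, -1, 1, -1],
    "g5": [3, 1, 3, 1, 3, 1],
}
giant, mst = correlation_network(profiles, threshold=0.8)
```

    Sweeping threshold from high to low and recording giant at each value is one simple way to watch a giant component form, in the spirit of the percolation point the abstract mentions.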

  9. Genome Engineering and Modification Toward Synthetic Biology for the Production of Antibiotics.

    PubMed

    Zou, Xuan; Wang, Lianrong; Li, Zhiqiang; Luo, Jie; Wang, Yunfu; Deng, Zixin; Du, Shiming; Chen, Shi

    2018-01-01

    Antibiotic production is often governed by large gene clusters composed of genes related to antibiotic scaffold synthesis, tailoring, regulation, and resistance. With the expansion of genome sequencing, a considerable number of antibiotic gene clusters have been isolated and characterized. Emerging genome engineering techniques make more efficient engineering of antibiotics possible. In addition to genomic editing, multiple synthetic biology approaches have been developed for the exploration and improvement of antibiotic natural products. Here, we review the progress in the development of these genome editing techniques used to engineer new antibiotics, focusing on three aspects of genome engineering: direct cloning of large genomic fragments, genome engineering of gene clusters, and regulation of gene cluster expression. This review will not only summarize the current uses of genomic engineering techniques for cloning and assembly of antibiotic gene clusters or for altering antibiotic synthetic pathways but will also provide perspectives on the future directions of rebuilding biological systems for the design of novel antibiotics. © 2017 Wiley Periodicals, Inc.

  10. Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering.

    PubMed

    Deveci, Mehmet; Küçüktunç, Onur; Eren, Kemal; Bozdağ, Doruk; Kaya, Kamer; Çatalyürek, Ümit V

    2016-01-01

    Rapid development and increasing popularity of gene expression microarrays have resulted in a number of studies on the discovery of co-regulated genes. One important way of discovering such co-regulations is the query-based search since gene co-expressions may indicate a shared role in a biological process. Although there exist promising query-driven search methods adapting clustering, they fail to capture many genes that function in the same biological pathway because microarray datasets are fraught with spurious samples or samples of diverse origin, or the pathways might be regulated under only a subset of samples. On the other hand, a class of clustering algorithms known as biclustering algorithms which simultaneously cluster both the items and their features are useful while analyzing gene expression data, or any data in which items are related in only a subset of their samples. This means that genes need not be related in all samples to be clustered together. Because many genes only interact under specific circumstances, biclustering may recover the relationships that traditional clustering algorithms can easily miss. In this chapter, we briefly summarize the literature using biclustering for querying co-regulated genes. Then we present a novel biclustering approach and evaluate its performance by a thorough experimental analysis.
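
    To make concrete the idea that genes may be related in only a subset of samples, the following sketch scores a candidate bicluster with the mean squared residue of Cheng and Church. This is a well-known biclustering criterion swapped in for illustration; it is not the novel approach the chapter presents, and the function name and toy matrix are assumptions. A score near zero means the selected rows follow a shared additive pattern across the selected columns.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix given by
    row indices `rows` and column indices `cols` (lower is more coherent)."""
    rows, cols = list(rows), list(cols)
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    overall = sum(row_means) / n_r
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_r)
        for j in range(n_c)
    )
    return total / (n_r * n_c)

# Hypothetical toy matrix: genes 0-2 share an additive pattern
# in samples 0-2 only; sample 3 and gene 3 break the pattern.
expression = [
    [1, 2, 3, 9],
    [2, 3, 4, 0],
    [4, 5, 6, 5],
    [7, 1, 8, 2],
]

print(mean_squared_residue(expression, [0, 1, 2], [0, 1, 2]))  # coherent block
print(mean_squared_residue(expression, range(4), range(4)))    # whole matrix
```

    Clustering on all four samples would dilute the relationship among genes 0 to 2, while restricting attention to the first three samples recovers it, which is the behaviour the abstract attributes to biclustering.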

  11. Novel genomic island modifies DNA with 7-deazaguanine derivatives

    PubMed Central

    Thiaville, Jennifer J.; Kellner, Stefanie M.; Yuan, Yifeng; Hutinet, Geoffrey; Thiaville, Patrick C.; Jumpathong, Watthanachai; Mohapatra, Susovan; Brochier-Armanet, Celine; Letarov, Andrey V.; Hillebrand, Roman; Malik, Chanchal K.; Rizzo, Carmelo J.; Dedon, Peter C.; de Crécy-Lagard, Valérie

    2016-01-01

    The discovery of ∼20-kb gene clusters containing a family of paralogs of tRNA guanosine transglycosylase genes, called tgtA5, alongside 7-cyano-7-deazaguanine (preQ0) synthesis and DNA metabolism genes, led to the hypothesis that 7-deazaguanine derivatives are inserted in DNA. This was established by detecting 2’-deoxy-preQ0 and 2’-deoxy-7-amido-7-deazaguanosine in enzymatic hydrolysates of DNA extracted from the pathogenic, Gram-negative bacteria Salmonella enterica serovar Montevideo. These modifications were absent in the closely related S. enterica serovar Typhimurium LT2 and from a mutant of S. Montevideo, each lacking the gene cluster. This led us to rename the genes of the S. Montevideo cluster as dpdA-K for 7-deazapurine in DNA. Similar gene clusters were analyzed in ∼150 phylogenetically diverse bacteria, and the modifications were detected in DNA from other organisms containing these clusters, including Kineococcus radiotolerans, Comamonas testosteroni, and Sphingopyxis alaskensis. Comparative genomic analysis shows that, in Enterobacteriaceae, the cluster is a genomic island integrated at the leuX locus, and the phylogenetic analysis of the TgtA5 family is consistent with widespread horizontal gene transfer. Comparison of transformation efficiencies of modified or unmodified plasmids into isogenic S. Montevideo strains containing or lacking the cluster strongly suggests a restriction–modification role for the cluster in Enterobacteriaceae. Another preQ0 derivative, 2’-deoxy-7-formamidino-7-deazaguanosine, was found in the Escherichia coli bacteriophage 9g, as predicted from the presence of homologs of genes involved in the synthesis of the archaeosine tRNA modification. These results illustrate a deep and unexpected evolutionary connection between DNA and tRNA metabolism. PMID:26929322

  12. Analysis of genetic association in Listeria and Diabetes using Hierarchical Clustering and Silhouette Index

    NASA Astrophysics Data System (ADS)

    Pagnuco, Inti A.; Pastore, Juan I.; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L.

    2016-04-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, where significant groups of genes are defined based on some criteria. This task is usually performed by clustering algorithms, where the whole family of genes, or a subset of them, is clustered into meaningful groups based on their expression values in a set of experiments. In this work we used a methodology based on the Silhouette index as a measure of cluster quality for individual gene groups, and a combination of several variants of hierarchical clustering to generate the candidate groups, to obtain sets of co-expressed genes for two real data examples. We analyzed the quality of the best-ranked groups obtained by the algorithm, using an online bioinformatics tool that provides network information for the selected genes. Moreover, to verify the performance of the algorithm, given that it does not examine all possible subsets, we compared its results against a full search to determine how many good co-regulated sets go undetected.
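
    As an illustration of using the Silhouette index to rate individual groups, the sketch below computes the mean silhouette of each group under Euclidean distance. The distance choice, the function name, and the toy points are assumptions; the abstract does not specify how the authors compute or combine the scores.

```python
from math import dist

def silhouette_per_cluster(points, labels):
    """Return {label: mean silhouette of that group's members}.
    Uses Euclidean distance; a singleton member scores 0 by convention.
    Requires at least two distinct labels."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)

    scores = {}
    for lab, members in groups.items():
        values = []
        for i in members:
            if len(members) == 1:
                values.append(0.0)
                continue
            # a: mean distance to the other members of the same group.
            a = sum(dist(points[i], points[j]) for j in members if j != i)
            a /= len(members) - 1
            # b: smallest mean distance to the members of any other group.
            b = min(
                sum(dist(points[i], points[j]) for j in other) / len(other)
                for other_lab, other in groups.items()
                if other_lab != lab
            )
            values.append((b - a) / max(a, b))
        scores[lab] = sum(values) / len(values)
    return scores

# Hypothetical one-dimensional "expression" values for four genes.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]

good = silhouette_per_cluster(toy_points, [0, 0, 1, 1])
bad = silhouette_per_cluster(toy_points, [0, 1, 0, 1])
print(good)
print(bad)
```

    Groups whose mean silhouette is close to 1 are compact and well separated; negative values indicate members that sit closer to another group, so ranking candidate groups by this score favours the tight ones.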

  13. The Fdb3 transcription factor of the Fusarium Detoxification of Benzoxazolinone gene cluster is required for MBOA but not BOA degradation in Fusarium pseudograminearum.

    PubMed

    Kettle, Andrew J; Carere, Jason; Batley, Jacqueline; Manners, John M; Kazan, Kemal; Gardiner, Donald M

    2016-03-01

    A number of cereals produce the benzoxazolinone class of phytoalexins. Fusarium species pathogenic towards these hosts can typically degrade these compounds via an aminophenol intermediate, and the ability to do so is encoded by a group of genes found in the Fusarium Detoxification of Benzoxazolinone (FDB) cluster. A zinc finger transcription factor encoded by one of the FDB cluster genes (FDB3) has been proposed to regulate the expression of other genes in the cluster and hence is potentially involved in benzoxazolinone degradation. Herein we show that Fdb3 is essential for the ability of Fusarium pseudograminearum to efficiently detoxify the predominant wheat benzoxazolinone, 6-methoxy-benzoxazolin-2-one (MBOA), but not benzoxazoline-2-one (BOA). Furthermore, additional genes thought to be part of the FDB gene cluster, based upon transcriptional response to benzoxazolinones, are regulated by Fdb3. However, deletion mutants for these latter genes remain capable of benzoxazolinone degradation, suggesting that they are not essential for this process. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.

  14. Hierarchical Dirichlet process model for gene expression clustering

    PubMed Central

    2013-01-01

    Clustering is an important data processing tool for interpreting microarray data and genomic network inference. In this article, we propose a clustering algorithm based on the hierarchical Dirichlet process (HDP). The HDP clustering introduces a hierarchical structure in the statistical model which captures the hierarchical features prevalent in biological data such as gene expression data. We develop a Gibbs sampling algorithm based on the Chinese restaurant metaphor for the HDP clustering. We apply the proposed HDP algorithm to both regulatory network segmentation and gene expression clustering. The HDP algorithm is shown to outperform several popular clustering algorithms by revealing the underlying hierarchical structure of the data. For the yeast cell cycle data, we compare the HDP result to the standard result and show that the HDP algorithm provides more information and reduces unnecessary cluster fragments. PMID:23587447
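
    The Chinese restaurant metaphor mentioned above can be sketched for the simplest, single-level case. The function below returns the prior probability that a new item joins each existing cluster ("table") or opens a new one, given a concentration parameter alpha. It is only the basic Chinese restaurant process prior: the Gibbs sampler in the paper would additionally weight these terms by data likelihoods and share clusters across groups, neither of which is shown here. The function name and example counts are assumptions.

```python
def crp_table_probabilities(table_counts, alpha):
    """Prior seating probabilities under a Chinese restaurant process.

    table_counts: number of items already in each existing cluster.
    alpha: concentration parameter (> 0); larger values favour new clusters.
    Returns one probability per existing cluster, followed by the
    probability of starting a new cluster.
    """
    n = sum(table_counts)
    denom = n + alpha
    probs = [count / denom for count in table_counts]
    probs.append(alpha / denom)
    return probs

# Hypothetical state: two clusters holding 3 and 1 genes, alpha = 1.
print(crp_table_probabilities([3, 1], 1.0))
```

    Larger clusters attract new members in proportion to their size, and the number of clusters is not fixed in advance, which is why this family of models can infer the cluster count from the data.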

  15. Allelic recombination between distinct genomic locations generates copy number diversity in human β-defensins

    PubMed Central

    Bakar, Suhaili Abu; Hollox, Edward J.; Armour, John A. L.

    2009-01-01

    β-Defensins are small secreted antimicrobial and signaling peptides involved in the innate immune response of vertebrates. In humans, a cluster of at least 7 of these genes shows extensive copy number variation, with a diploid copy number commonly ranging between 2 and 7. Using a genetic mapping approach, we show that this cluster is located at not 1 but 2 distinct genomic loci ≈5 Mb apart on chromosome band 8p23.1, contradicting the most recent genome assembly. We also demonstrate that the predominant mechanism of change in β-defensin copy number is simple allelic recombination occurring in the interval between the 2 distinct genomic loci for these genes. In 416 meiotic transmissions, we observe 3 events creating a haplotype copy number not found in the parent, equivalent to a germ-line rate of copy number change of ≈0.7% per gamete. This places it among the fastest-changing copy number variants currently known. PMID:19131514
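
    The quoted rate follows directly from the counts in the abstract. A small check, assuming each meiotic transmission corresponds to one gamete (the function name is illustrative):

```python
def per_gamete_rate(events, transmissions):
    """Fraction of transmissions that carry a new copy number."""
    return events / transmissions

# Counts taken from the abstract: 3 events in 416 meiotic transmissions.
rate = per_gamete_rate(3, 416)
print(f"{rate:.2%}")
```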

  16. Identification of the Coumermycin A1 Biosynthetic Gene Cluster of Streptomyces rishiriensis DSM 40489

    PubMed Central

    Wang, Zhao-Xin; Li, Shu-Ming; Heide, Lutz

    2000-01-01

    The biosynthetic gene cluster of the aminocoumarin antibiotic coumermycin A1 was cloned by screening of a cosmid library of Streptomyces rishiriensis DSM 40489 with heterologous probes from a dTDP-glucose 4,6-dehydratase gene, involved in deoxysugar biosynthesis, and from the aminocoumarin resistance gyrase gene gyrBr. Sequence analysis of a 30.8-kb region upstream of gyrBr revealed the presence of 28 complete open reading frames (ORFs). Fifteen of the identified ORFs showed, on average, 84% identity to corresponding ORFs in the biosynthetic gene cluster of novobiocin, another aminocoumarin antibiotic. Possible functions of 17 ORFs in the biosynthesis of coumermycin A1 could be assigned by comparison with sequences in GenBank. Experimental proof for the function of the identified gene cluster was provided by an insertional gene inactivation experiment, which resulted in an abolishment of coumermycin A1 production. PMID:11036020

  17. Whole Blood Gene Expression Profiling Predicts Severe Morbidity and Mortality in Cystic Fibrosis: A 5-Year Follow-Up Study.

    PubMed

    Saavedra, Milene T; Quon, Bradley S; Faino, Anna; Caceres, Silvia M; Poch, Katie R; Sanders, Linda A; Malcolm, Kenneth C; Nichols, David P; Sagel, Scott D; Taylor-Cousar, Jennifer L; Leach, Sonia M; Strand, Matthew; Nick, Jerry A

    2018-05-01

    Cystic fibrosis pulmonary exacerbations accelerate pulmonary decline and increase mortality. Previously, we identified a 10-gene leukocyte panel measured directly from whole blood, which indicates response to exacerbation treatment. We hypothesized that molecular characteristics of exacerbations could also predict future disease severity. We tested whether a 10-gene panel measured from whole blood could identify patient cohorts at increased risk for severe morbidity and mortality, beyond standard clinical measures. Transcript abundance for the 10-gene panel was measured from whole blood at the beginning of exacerbation treatment (n = 57). A hierarchical cluster analysis of subjects based on their gene expression was performed, yielding four molecular clusters. An analysis of cluster membership and outcomes incorporating an independent cohort (n = 21) was completed to evaluate robustness of cluster partitioning of genes to predict severe morbidity and mortality. The four molecular clusters were analyzed for differences in forced expiratory volume in 1 second, C-reactive protein, return to baseline forced expiratory volume in 1 second after treatment, time to next exacerbation, and time to morbidity or mortality events (defined as lung transplant referral, lung transplant, intensive care unit admission for respiratory insufficiency, or death). Clustering based on gene expression discriminated between patient groups with significant differences in forced expiratory volume in 1 second, admission frequency, and overall morbidity and mortality. At 5 years, all subjects in cluster 1 (very low risk) were alive and well, whereas 90% of subjects in cluster 4 (high risk) had suffered a major event (P = 0.0001). In multivariable analysis, the ability of gene expression to predict clinical outcomes remained significant, despite adjustment for forced expiratory volume in 1 second, sex, and admission frequency. The robustness of gene clustering to categorize patients appropriately in terms of clinical characteristics, and short- and long-term clinical outcomes, remained consistent, even when adding in a secondary population with significantly different clinical outcomes. Whole blood gene expression profiling allows molecular classification of acute pulmonary exacerbations, beyond standard clinical measures, providing a predictive tool for identifying subjects at increased risk for mortality and disease progression.

  18. An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data.

    PubMed

    Hsu, Arthur L; Tang, Sen-Lin; Halgamuge, Saman K

    2003-11-01

    Current Self-Organizing Map (SOM) approaches to gene expression pattern clustering require the user to predefine the number of clusters likely to be expected. Hierarchical clustering methods used in this area do not provide unique partitioning of data. We describe an unsupervised dynamic hierarchical self-organizing approach, which suggests an appropriate number of clusters, to perform class discovery and marker gene identification in microarray data. In the process of class discovery, the proposed algorithm identifies corresponding sets of predictor genes that best distinguish one class from other classes. The approach integrates merits of hierarchical clustering with robustness against noise known from self-organizing approaches. The proposed algorithm applied to DNA microarray data sets of two types of cancers has demonstrated its ability to produce the most suitable number of clusters. Further, the corresponding marker genes identified through the unsupervised algorithm also have a strong biological relationship to the specific cancer class. The algorithm tested on leukemia microarray data, which contains three leukemia types, was able to determine three major clusters and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, so it is labelled as an uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided further, and the subdivisions are related to two of the original clusters. Another test performed using colon cancer microarray data has automatically derived two clusters, which is consistent with the number of classes in the data (cancerous and normal). Java software for the dynamic SOM tree algorithm is available upon request for academic use. A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf

  19. The type VI secretion system of Vibrio cholerae fosters horizontal gene transfer.

    PubMed

    Borgeaud, Sandrine; Metzger, Lisa C; Scrignari, Tiziana; Blokesch, Melanie

    2015-01-02

    Natural competence for transformation is a common mode of horizontal gene transfer and contributes to bacterial evolution. Transformation occurs through the uptake of external DNA and its integration into the genome. Here we show that the type VI secretion system (T6SS), which serves as a predatory killing device, is part of the competence regulon in the naturally transformable pathogen Vibrio cholerae. The T6SS-encoding gene cluster is under the positive control of the competence regulators TfoX and QstR and is induced by growth on chitinous surfaces. Live-cell imaging revealed that deliberate killing of nonimmune cells via competence-mediated induction of T6SS releases DNA and makes it accessible for horizontal gene transfer in V. cholerae. Copyright © 2015, American Association for the Advancement of Science.

  20. Transcription factor clusters regulate genes in eukaryotic cells

    PubMed Central

    Hedlund, Erik G; Friemann, Rosmarie; Hohmann, Stefan

    2017-01-01

    Transcription is regulated through the binding of factors to gene promoters to activate or repress expression; however, the mechanisms by which factors find targets remain unclear. Using single-molecule fluorescence microscopy, we determined in vivo stoichiometry and spatiotemporal dynamics of a GFP-tagged repressor, Mig1, from a paradigm signaling pathway of Saccharomyces cerevisiae. We find the repressor operates in clusters, which upon extracellular signal detection, translocate from the cytoplasm, bind to nuclear targets and turn over. Simulations of Mig1 configuration within a 3D yeast genome model combined with a promoter-specific, fluorescent translation reporter confirmed clusters are the functional unit of gene regulation. In vitro and structural analysis on reconstituted Mig1 suggests that clusters are stabilized by depletion forces between intrinsically disordered sequences. We observed similar clusters of a co-regulatory activator from a different pathway, supporting a generalized cluster model for transcription factors that reduces promoter search times through intersegment transfer while stabilizing gene expression. PMID:28841133
