gene expression clusters: Topics by Science.gov

Sample records for gene expression clusters

Clustering cancer gene expression data by projective clustering ensemble

PubMed Central

Yu, Xianxue; Yu, Guoxian

2017-01-01

Gene expression data analysis has paramount implications for gene treatments, cancer diagnosis and other domains. Clustering is an important and promising tool for analyzing gene expression data. Gene expression data are often characterized by a large number of genes but limited samples; thus various projective clustering techniques and ensemble techniques have been suggested to combat these challenges. However, it is rather challenging to combine these two kinds of techniques to avoid the curse of dimensionality and to boost the performance of gene expression data clustering. In this paper, we employ a projective clustering ensemble (PCE) to integrate the advantages of projective clustering and ensemble clustering, and to avoid the dilemma of combining multiple projective clusterings. Our experimental results on publicly available cancer gene expression data show that PCE can improve the quality of clustering gene expression data by at least 4.5% (on average) over other related techniques, including dimensionality reduction based single clustering and ensemble approaches. The empirical study demonstrates that, to further boost the performance of clustering cancer gene expression data, it is necessary and promising to combine projective clustering with ensemble clustering. PCE can serve as an effective alternative technique for clustering gene expression data. PMID:28234920
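The abstract above does not describe how PCE combines its base clusterings. As a generic illustration of ensemble clustering only (not the PCE method itself), the sketch below builds a co-association matrix: the fraction of base clusterings in which two samples share a cluster. The function name and the toy partitions are hypothetical.

```python
def co_association(partitions):
    """Fraction of base clusterings that place samples i and j together.

    `partitions` is a list of label lists, one per base clustering.
    Label values need not agree across clusterings, only co-membership matters.
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


# Three hypothetical base clusterings of four samples.
base = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
matrix = co_association(base)
```

A consensus partition can then be obtained by clustering this matrix, for example with hierarchical clustering on `1 - matrix`.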
Functional clustering of time series gene expression data by Granger causality

PubMed Central

2012-01-01

Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
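To make the idea of pairwise Granger causality concrete, here is a minimal sketch that tests whether the past of one series improves prediction of another. It assumes a single lag, no intercept, and two individual series; the study itself works between and within sets of series, which this sketch does not attempt. The function name and example data are hypothetical.

```python
def granger_f_lag1(x, y):
    """F statistic for "x Granger-causes y" with one lag and no intercept.

    Restricted model:   y[t] = a * y[t-1]
    Unrestricted model: y[t] = a * y[t-1] + b * x[t-1]
    """
    target, y_lag, x_lag = y[1:], y[:-1], x[:-1]
    n = len(target)

    # Restricted fit: one coefficient by least squares.
    syy = sum(p * p for p in y_lag)
    sy_t = sum(p * t for p, t in zip(y_lag, target))
    a_r = sy_t / syy
    rss_r = sum((t - a_r * p) ** 2 for p, t in zip(y_lag, target))

    # Unrestricted fit: solve the 2x2 normal equations directly.
    sxx = sum(q * q for q in x_lag)
    sxy = sum(p * q for p, q in zip(y_lag, x_lag))
    sx_t = sum(q * t for q, t in zip(x_lag, target))
    det = syy * sxx - sxy * sxy
    a_u = (sy_t * sxx - sx_t * sxy) / det
    b_u = (syy * sx_t - sxy * sy_t) / det
    rss_u = sum((t - a_u * p - b_u * q) ** 2
                for p, q, t in zip(y_lag, x_lag, target))

    # One added parameter; n - 2 residual degrees of freedom.
    return (rss_r - rss_u) / (rss_u / (n - 2))


# Hypothetical data: y follows x with a one-step delay plus small noise.
x = [1.0, -0.5, 2.0, 0.3, -1.2, 0.8, 1.5, -0.7, 0.1, 1.1]
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.04, -0.01]
y = [0.0] + [0.8 * x[t] + noise[t] for t in range(9)]
f_stat = granger_f_lag1(x, y)
```

A large F statistic suggests that lagged `x` carries predictive information about `y` beyond what `y`'s own past provides; the resulting pairwise scores could serve as edge weights when grouping genes.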
Constrained clusters of gene expression profiles with pathological features.

PubMed

Sese, Jun; Kurokawa, Yukinori; Monden, Morito; Kato, Kikuya; Morishita, Shinichi

2004-11-22

Gene expression profiles should be useful in distinguishing variations in disease, since they reflect accurately the status of cells. The primary clustering of gene expression reveals the genotypes that are responsible for the proximity of members within each cluster, while further clustering elucidates the pathological features of the individual members of each cluster. However, since the first clustering process and the second classification step, in which the features are associated with clusters, are performed independently, the initial set of clusters may omit genes that are associated with pathologically meaningful features. Therefore, it is important to devise a way of identifying gene expression clusters that are associated with pathological features. We present the novel technique of 'itemset constrained clustering' (IC-Clustering), which computes the optimal cluster that maximizes the interclass variance of gene expression between groups, which are divided according to the restriction that only divisions that can be expressed using common features are allowed. This constraint automatically labels each cluster with a set of pathological features which characterize that cluster. When applied to liver cancer datasets, IC-Clustering revealed informative gene expression clusters, which could be annotated with various pathological features, such as 'tumor' and 'man', or 'except tumor' and 'normal liver function'. In contrast, the k-means method overlooked these clusters.
Analysis of multiplex gene expression maps obtained by voxelation.

PubMed

An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios

2009-04-29

Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. 
The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
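The similarity measure described above (wavelet features compared by Euclidean distance) can be sketched as follows. This is a simplified illustration that assumes a one-dimensional Haar transform on a flattened map whose length is a power of two; the abstract does not state which wavelet family or dimensionality the authors used, and all names here are hypothetical.

```python
import math


def haar_level(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    averages = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return averages, details


def haar_features(signal):
    """Full Haar decomposition, coarsest coefficients first."""
    coeffs = []
    current = list(signal)
    while len(current) > 1:
        current, details = haar_level(current)
        coeffs = details + coeffs
    return current + coeffs


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_similar(query, candidates, n_coeffs):
    """Index of the candidate closest to the query on the first n_coeffs features."""
    q = haar_features(query)[:n_coeffs]
    distances = [euclidean(q, haar_features(c)[:n_coeffs]) for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)
```

Truncating to the first few coefficients compares maps on their coarse spatial structure, which is the usual reason for working in a wavelet basis rather than on raw voxel values.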
Computational gene expression profiling under salt stress reveals patterns of co-expression

PubMed Central

Sanchita; Sharma, Ashok

2016-01-01

Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken forward for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411
Fractal Clustering and Knowledge-driven Validation Assessment for Gene Expression Profiling.

PubMed

Wang, Lu-Yong; Balasubramanian, Ammaiappan; Chakraborty, Amit; Comaniciu, Dorin

2005-01-01

DNA microarray experiments generate a substantial amount of information about global gene expression. Gene expression profiles can be represented as points in multi-dimensional space. It is essential to identify relevant groups of genes in biomedical research. Clustering is helpful for pattern recognition in gene expression profiles. A number of clustering techniques have been introduced. However, these traditional methods mainly use shape-based assumptions or some distance metric to cluster the points in multi-dimensional linear Euclidean space. Their results show poor consistency with the functional annotation of genes in a previous validation study. From a different perspective, we propose a fractal clustering method that clusters genes using the intrinsic (fractal) dimension from modern geometry. This method clusters points in such a way that points in the same cluster are more self-affine among themselves than to the points in other clusters. We assess this method using annotation-based validation assessment for gene clusters. The assessment shows that this method is superior to other traditional methods in identifying functionally related gene groups.
Diametrical clustering for identifying anti-correlated gene clusters.

PubMed

Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

2003-09-01

Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, with each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
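The two alternating steps named in the abstract can be sketched in a few lines. This is a minimal illustration, assuming each expression vector has already been centered, that genes are assigned by squared projection onto a prototype (so the sign of the correlation is ignored), and that initial prototypes are supplied by the caller. The abstract does not specify normalization, initialization, or stopping rules, so those choices and all names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def dominant_direction(rows, iterations=100):
    """Power iteration for the dominant right singular vector of `rows`."""
    v = normalize(rows[0])  # nonzero start vector taken from the data
    for _ in range(iterations):
        # Apply X^T X: w = sum_i (x_i . v) x_i
        w = [0.0] * len(v)
        for r in rows:
            c = dot(r, v)
            for j in range(len(v)):
                w[j] += c * r[j]
        v = normalize(w)
    return v


def diametrical_clustering(genes, prototypes, rounds=10):
    prototypes = [normalize(p) for p in prototypes]
    labels = [0] * len(genes)
    for _ in range(rounds):
        # (i) Re-partition: squared projection treats x and -x alike.
        labels = [max(range(len(prototypes)),
                      key=lambda k: dot(g, prototypes[k]) ** 2)
                  for g in genes]
        # (ii) Update each prototype to its cluster's dominant singular vector.
        for k in range(len(prototypes)):
            members = [g for g, lab in zip(genes, labels) if lab == k]
            if members:
                prototypes[k] = dominant_direction(members)
    return labels, prototypes


# Two pairs of perfectly anti-correlated toy profiles.
genes = [[1.0, -1.0, 0.0, 0.0], [-1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, -1.0], [0.0, 0.0, -1.0, 1.0]]
labels, _ = diametrical_clustering(genes, [genes[0], genes[2]])
# labels -> [0, 0, 1, 1]: each profile is grouped with its mirror image
```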
Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters.

PubMed

Lukashin, A V; Fuchs, R

2001-05-01

Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. We describe a simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm is guaranteed to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that serves to evaluate quantitatively the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to organize the search for the optimal number of clusters simultaneously with the optimization of the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. This statistically rigorous test has shown that our algorithm places more than 90% of genes into the correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.
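A generic simulated-annealing clusterer is sketched below to illustrate the procedure the abstract refers to. It assumes a within-cluster sum-of-squares cost, single-gene reassignment moves, and a geometric cooling schedule; none of these details are given in the abstract, and the scheme for choosing the number of clusters is not covered. With a finite number of steps this sketch carries no optimality guarantee.

```python
import math
import random


def within_cluster_cost(profiles, labels, k):
    """Sum of squared distances from each profile to its cluster mean."""
    dim = len(profiles[0])
    cost = 0.0
    for c in range(k):
        members = [p for p, lab in zip(profiles, labels) if lab == c]
        if not members:
            continue
        mean = [sum(m[j] for m in members) / len(members) for j in range(dim)]
        cost += sum((m[j] - mean[j]) ** 2 for m in members for j in range(dim))
    return cost


def anneal_clusters(profiles, k, steps=2000, t_start=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in profiles]
    cost = within_cluster_cost(profiles, labels, k)
    best_labels, best_cost = labels[:], cost
    temperature = t_start
    for _ in range(steps):
        i = rng.randrange(len(profiles))
        old = labels[i]
        labels[i] = rng.randrange(k)  # propose moving one gene
        new_cost = within_cluster_cost(profiles, labels, k)
        # Metropolis rule: keep improvements, occasionally keep worse moves.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temperature):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old  # reject and restore
        temperature *= cooling
    return best_labels, best_cost
```

The early high-temperature phase lets the search escape poor partitions; as the temperature falls, the procedure behaves increasingly like greedy descent.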
Clustering approaches to identifying gene expression patterns from DNA microarray data.

PubMed

Do, Jin Hwan; Choi, Dong-Kug

2008-04-30

The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
An effective fuzzy kernel clustering analysis approach for gene expression data.

PubMed

Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao

2015-01-01

Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering method to microarray gene expression data is the choice of parameters with cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies desired cluster number and obtains more steady results for gene expression data. First of all, to optimize characteristic differences and estimate optimal cluster number, Gaussian kernel function is introduced to improve spectrum analysis method (SAM). By combining subtractive clustering with max-min distance mean, maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of improved SAM (ISAM) and MDM are given respectively, whose superiority and stability are illustrated through performing experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and UCI database show that the proposed algorithms are feasible for cluster analysis, and the clustering accuracy is higher than the other related clustering algorithms.
Large clusters of co-expressed genes in the Drosophila genome.

PubMed

Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I

2002-12-12

Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.
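The comparison of observed clusters with chance expectation can be illustrated with a small sketch. It assumes a cluster is a run of strictly adjacent co-expressed genes along one chromosome and that the null distribution comes from shuffling gene labels; the abstract does not give the authors' exact cluster definition or statistical procedure, so both are assumptions and the names are hypothetical.

```python
import random


def adjacent_runs(flags, min_size=3):
    """(start, end) index pairs for runs of consecutive True flags of at least min_size."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_size:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(flags) - start >= min_size:
        runs.append((start, len(flags) - 1))
    return runs


def expected_runs_by_shuffling(flags, min_size=3, trials=1000, seed=0):
    """Mean number of qualifying runs when gene positions are shuffled."""
    rng = random.Random(seed)
    shuffled = list(flags)
    total = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        total += len(adjacent_runs(shuffled, min_size))
    return total / trials


# Genes in chromosomal order; True marks a tissue-specific gene.
chromosome = [True, True, True, False, True, True, False, True, True, True, True]
observed = adjacent_runs(chromosome)  # [(0, 2), (7, 10)]
```

Comparing `len(observed)` with the shuffled expectation gives a rough sense of whether clustering exceeds what random placement would produce.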
Genome-wide identification of physically clustered genes suggests chromatin-level co-regulation in male reproductive development in Arabidopsis thaliana

PubMed Central

Reimegård, Johan; Kundu, Snehangshu; Pendle, Ali; Irish, Vivian F.; Shaw, Peter

2017-01-01

Co-expression of physically linked genes occurs surprisingly frequently in eukaryotes. Such chromosomal clustering may confer a selective advantage as it enables coordinated gene regulation at the chromatin level. We studied the chromosomal organization of genes involved in male reproductive development in Arabidopsis thaliana. We developed an in-silico tool to identify physical clusters of co-regulated genes from gene expression data. We identified 17 clusters (96 genes) involved in stamen development and acting downstream of the transcriptional activator MS1 (MALE STERILITY 1), which contains a PHD domain associated with chromatin re-organization. The clusters exhibited little gene homology or promoter element similarity, and largely overlapped with reported repressive histone marks. Experiments on a subset of the clusters suggested a link between expression activation and chromatin conformation: qRT-PCR and mRNA in situ hybridization showed that the clustered genes were up-regulated within 48 h after MS1 induction; out of 14 chromatin-remodeling mutants studied, expression of clustered genes was consistently down-regulated only in hta9/hta11, previously associated with metabolic cluster activation; DNA fluorescence in situ hybridization confirmed that transcriptional activation of the clustered genes was correlated with open chromatin conformation. Stamen development thus appears to involve transcriptional activation of physically clustered genes through chromatin de-condensation. PMID:28175342
Clustering Algorithms: Their Application to Gene Expression Data

PubMed Central

Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel

2016-01-01

Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data proffers a prodigious preference to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to the gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and high degree of accuracy in its analysis procedure. PMID:27932867
TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

PubMed

Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

2017-12-01

Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two-dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/.
Supplementary data are available at Bioinformatics online.
Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

PubMed

Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

2012-07-15

Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. 
Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system.
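The core comparison in this record, Pearson correlation between neighboring genes versus randomly chosen gene pairs, can be sketched as follows. The sketch assumes expression profiles are supplied in chromosomal order and omits the FDR-based significance step, which the abstract mentions but does not detail. Function names are hypothetical.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def neighbor_correlations(expression):
    """Correlation of each gene with the next gene along the chromosome."""
    return [pearson(expression[i], expression[i + 1])
            for i in range(len(expression) - 1)]


def random_pair_correlations(expression, n_pairs, seed=0):
    """Background distribution from randomly selected distinct gene pairs."""
    rng = random.Random(seed)
    result = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(expression)), 2)
        result.append(pearson(expression[i], expression[j]))
    return result
```

Neighboring pairs whose correlation lies far in the upper tail of the random-pair distribution are candidates for membership in a co-expressed cluster.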
Functional genomics of commercial baker's yeasts that have different abilities for sugar utilization and high-sucrose tolerance under different sugar conditions.

PubMed

Tanaka-Tsuno, Fumiko; Mizukami-Murata, Satomi; Murata, Yoshinori; Nakamura, Toshihide; Ando, Akira; Takagi, Hiroshi; Shima, Jun

2007-10-01

In the modern baking industry, high-sucrose-tolerant (HS) and maltose-utilizing (LS) yeast were developed using breeding techniques and are now used commercially. Sugar utilization and high-sucrose tolerance differ significantly between HS and LS yeasts. We analysed the gene expression profiles of HS and LS yeasts under different sucrose conditions in order to determine their basic physiology. Two-way hierarchical clustering was performed to obtain the overall patterns of gene expression. The clustering clearly showed that the gene expression patterns of LS yeast differed from those of HS yeast. Quality threshold clustering was used to identify the gene clusters containing upregulated genes (cluster 1) and downregulated genes (cluster 2) under high-sucrose conditions. Clusters 1 and 2 contained numerous genes involved in carbon and nitrogen metabolism, respectively. The expression level of the genes involved in the metabolism of glycerol and trehalose, which are known to be osmoprotectants, in LS yeast was higher than that in HS yeast under sucrose concentrations of 5-40%. No clear correlation was found between the expression level of the genes involved in the biosynthesis of the osmoprotectants and the intracellular contents of the osmoprotectants. The present gene expression data were compared with data previously reported in a comprehensive analysis of a gene deletion strain collection. Welch's t-test for this comparison showed that the relative growth rates of the deletion strains whose deletion occurred in genes belonging to cluster 1 were significantly higher than the average growth rates of all deletion strains.
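For reference, Welch's t-test mentioned above compares two group means without assuming equal variances. A minimal sketch of the statistic and the Welch-Satterthwaite degrees of freedom follows; the example values are made up and are not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)  # sample variance over group size
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical relative growth rates for two groups of strains.
group_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group_2 = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_1, group_2)
```

The p-value is then read from a Student t distribution with `dof` degrees of freedom, which requires a statistics library or table not shown here.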
Transcriptome Analysis of Aspergillus flavus Reveals veA-Dependent Regulation of Secondary Metabolite Gene Clusters, Including the Novel Aflavarin Cluster

PubMed Central

Cary, J. W.; Han, Z.; Yin, Y.; Lohmar, J. M.; Shantappa, S.; Harris-Coward, P. Y.; Mack, B.; Ehrlich, K. C.; Wei, Q.; Arroyo-Manzanares, N.; Uka, V.; Vanhaecke, L.; Bhatnagar, D.; Yu, J.; Nierman, W. C.; Johns, M. A.; Sorensen, D.; Shen, H.; De Saeger, S.; Diana Di Mavungu, J.

2015-01-01

The global regulatory veA gene governs development and secondary metabolism in numerous fungal species, including Aspergillus flavus. This is especially relevant since A. flavus infects crops of agricultural importance worldwide, contaminating them with potent mycotoxins. The most well-known are aflatoxins, which are cytotoxic and carcinogenic polyketide compounds. The production of aflatoxins and the expression of genes implicated in the production of these mycotoxins are veA dependent. The genes responsible for the synthesis of aflatoxins are clustered, a signature common for genes involved in fungal secondary metabolism. Studies of the A. flavus genome revealed many gene clusters possibly connected to the synthesis of secondary metabolites. Many of these metabolites are still unknown, or the association between a known metabolite and a particular gene cluster has not yet been established. In the present transcriptome study, we show that veA is necessary for the expression of a large number of genes. Twenty-eight out of the predicted 56 secondary metabolite gene clusters include at least one gene that is differentially expressed depending on presence or absence of veA. One of the clusters under the influence of veA is cluster 39. The absence of veA results in a downregulation of the five genes found within this cluster. Interestingly, our results indicate that the cluster is expressed mainly in sclerotia. Chemical analysis of sclerotial extracts revealed that cluster 39 is responsible for the production of aflavarin. PMID:26209694
From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

NASA Technical Reports Server (NTRS)

Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

2000-01-01

We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.
Finding gene clusters for a replicated time course study

PubMed Central

2014-01-01

Background: Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies from complicated study designs such as replicated time course studies. Findings: In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. Conclusions: The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism. PMID:24460656
Regulatory Feedback Loop of Two phz Gene Clusters through 5′-Untranslated Regions in Pseudomonas sp. M18

PubMed Central

Li, Yaqian; Du, Xilin; Lu, Zhi John; Wu, Daqiang; Zhao, Yilei; Ren, Bin; Huang, Jiaofang; Huang, Xianqing; Xu, Yuhong; Xu, Yuquan

2011-01-01

Background Phenazines are important compounds produced by pseudomonads and other bacteria. Two phz gene clusters, called phzA1-G1 and phzA2-G2, were found in the genome of Pseudomonas sp. M18, an effective biocontrol agent that is highly homologous to the opportunistic human pathogen P. aeruginosa PAO1. However, little is known about the correlation between the expression of the two phz gene clusters. Methodology/Principal Findings A chromosomal insertion-inactivated mutant was constructed for each of the two gene clusters, and the correlation between the expression of the two phz gene clusters was investigated in strain M18. Phenazine-1-carboxylic acid (PCA) molecules produced from the phzA2-G2 gene cluster are able to auto-regulate its own expression and to activate the expression of the phzA1-G1 gene cluster in a circulated amplification pattern. However, the post-transcriptional expression of the phzA1-G1 transcript was blocked principally through the 5′-untranslated region (UTR). In contrast, the phzA2-G2 gene cluster was transcribed to a lesser extent and translated efficiently and was negatively regulated by the GacA signal transduction pathway, mainly at a post-transcriptional level. Conclusions/Significance A single molecule, PCA, produced in different quantities by the two phz gene clusters acted as the functional mediator, and the two phz gene clusters developed a specific regulatory mechanism which acts through the 5′-UTR to transfer a single, but complex bacterial signaling event in Pseudomonas sp. strain M18. PMID:21559370

Identifying a gene expression signature of cluster headache in blood

PubMed Central

Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; ‘t Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.

2017-01-01

Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859
Structure-related clustering of gene expression fingerprints of thp-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

PubMed

Wan, B; Yarbrough, J W; Schultz, T W

2008-01-01

This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, a significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes, such as cell cycle, metabolism, and protein binding, and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
Supervised group Lasso with applications to microarray data analysis

PubMed Central

Ma, Shuangge; Song, Xiao; Huang, Jian

2007-01-01

Background A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure. Results We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
Clustering change patterns using Fourier transformation with time-course gene expression data.

PubMed

Kim, Jaehee

2011-01-01

To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a period of time because biologically related gene groups can share the same change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. This work is aimed at discovering gene groups with similar change patterns which share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. We applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns.
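The idea of clustering on Fourier coefficients of derivatives can be sketched roughly as follows. This is not the authors' statistical model: it assumes first differences as a crude derivative estimate and a plain discrete Fourier transform, and the function names and toy profiles are hypothetical. It only shows why such features capture the change pattern rather than the expression level.

```python
import cmath
import math


def derivative_fourier_features(series, n_coeffs=2):
    """Return real and imaginary parts of the first n_coeffs non-constant
    DFT coefficients of the first differences of a time course."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    n = len(diffs)
    feats = []
    for k in range(1, n_coeffs + 1):
        coeff = sum(d * cmath.exp(-2j * math.pi * k * t / n)
                    for t, d in enumerate(diffs)) / n
        feats.extend([coeff.real, coeff.imag])
    return feats


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy profiles sampled at 9 time points.
gene_a = [math.sin(2 * math.pi * t / 8) for t in range(9)]
gene_b = [5.0 + x for x in gene_a]      # same change pattern, higher baseline
gene_c = [0.1 * t for t in range(9)]    # steady linear increase

fa, fb, fc = (derivative_fourier_features(g) for g in (gene_a, gene_b, gene_c))
print(euclidean(fa, fb))  # close to zero: baseline shift does not matter
print(euclidean(fa, fc))  # clearly larger: different change pattern
```

Feature vectors of this kind could then be passed to any clustering routine; the abstract describes a model-based method for that step, which is not reproduced here.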
Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Data Analysis and Visualization; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA

2008-05-12

The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.
Iterative local Gaussian clustering for expressed genes identification linked to malignancy of human colorectal carcinoma

PubMed Central

Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri

2007-01-01

Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC) was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, the ILGC finds three gene clusters, two large and one small, similar to their results which used Gaussian mixture clustering. The correlation between each cluster of genes and clinical properties of malignancy of human colorectal cancer was analysed for tumor versus normal status, the existence of distant metastasis and the existence of lymph node metastasis. PMID:18305825
Iterative local Gaussian clustering for expressed genes identification linked to malignancy of human colorectal carcinoma.

PubMed

Wasito, Ito; Hashim, Siti Zaiton M; Sukmaningrum, Sri

2007-12-30

Gene expression profiling plays an important role in the identification of biological and clinical properties of human solid tumors such as colorectal carcinoma. Profiling is required to reveal underlying molecular features for diagnostic and therapeutic purposes. A non-parametric density-estimation-based approach called iterative local Gaussian clustering (ILGC) was used to identify clusters of expressed genes. We used experimental data from a previous study by Muro and others consisting of 1,536 genes in 100 colorectal cancer and 11 normal tissues. In this dataset, the ILGC finds three gene clusters, two large and one small, similar to their results which used Gaussian mixture clustering. The correlation between each cluster of genes and clinical properties of malignancy of human colorectal cancer was analysed for tumor versus normal status, the existence of distant metastasis and the existence of lymph node metastasis.
Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

PubMed Central

2012-01-01

Background Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. Results We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. Conclusions Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enables improved modelling and consequent clustering of time-course data. PMID:23151154
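The first-order autoregressive dependence mentioned above can be made concrete with a small sketch. It assumes the standard AR(1) covariance form for equally spaced time points, where the covariance between times i and j is sigma2 * rho^|i - j|; the abstract does not state the exact parameterization used in the EMMIX-WIRE extension, so the function name and the values below are illustrative assumptions.

```python
def ar1_covariance(n_times, sigma2, rho):
    """Build an n_times x n_times AR(1) covariance matrix.

    Assumes equally spaced time points; entry (i, j) is
    sigma2 * rho ** |i - j|, so dependence decays with time separation.
    """
    return [[sigma2 * rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


# Hypothetical values: variance 2.0, lag-one correlation 0.5.
cov = ar1_covariance(4, sigma2=2.0, rho=0.5)
for row in cov:
    print(row)
```

Measurements that are close in time are modelled as strongly correlated, and the correlation shrinks geometrically as the gap between time points grows, which is the kind of dependence a purely Fourier-based mean model ignores.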
pySAPC, a python package for sparse affinity propagation clustering: Application to odontogenesis whole genome time series gene-expression data.

PubMed

Cao, Huojun; Amendt, Brad A

2016-11-01

Developmental dental anomalies are common forms of congenital defects. The molecular mechanisms of dental anomalies are poorly understood. Systematic approaches such as clustering genes based on similar expression patterns could identify novel genes involved in dental anomalies and provide a framework for understanding molecular regulatory mechanisms of these genes during tooth development (odontogenesis). A python package (pySAPC) implementing a sparse affinity propagation clustering algorithm for large datasets was developed. Whole-genome pairwise similarity was calculated from expression patterns across 45 microarrays covering several stages of odontogenesis. pySAPC identified 743 gene clusters based on expression pattern similarity during mouse tooth development. Three clusters are significantly enriched for genes associated with dental anomalies (with FDR <0.1). The three clusters of genes have distinct expression patterns during odontogenesis. Clustering genes based on similar expression profiles recovered several known regulatory relationships for genes involved in odontogenesis, as well as many novel genes that may be involved with the same genetic pathways as genes that have already been shown to contribute to dental defects. By using a sparse similarity matrix, pySAPC uses much less memory and CPU time compared with the original affinity propagation program that uses a full similarity matrix. This python package will be useful for many applications where dataset(s) are too large to use a full similarity matrix. This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang. Copyright © 2016. Published by Elsevier B.V.
Analysis of lamprey clustered Fox genes: insight into Fox gene evolution and expression in vertebrates.

PubMed

Wotton, Karl R; Shimeld, Sebastian M

2011-12-01

In the human genome, members of the FoxC, FoxF, FoxL1, and FoxQ1 gene families are found in two paralogous clusters. One cluster contains the genes FOXQ1, FOXF2, FOXC1 and the second consists of FOXF1, FOXC2, and FOXL1. In jawed vertebrates these genes are known to be expressed in different pharyngeal tissues and all, except FoxQ1, are involved in patterning the early embryonic mesoderm. We have previously traced the evolution of this cluster in the bony vertebrates, and the gene content is identical in the dogfish, a member of the most basally branching lineage of the jawed vertebrates. Here we extend these analyses to jawless vertebrates. Using genomic searches and molecular approaches we have identified homologues of these genes from lampreys. We identify two FoxC genes, two FoxF genes, two FoxQ1 genes and a single FoxL1 gene. We examine the embryonic expression of one predominantly mesodermally expressed gene family, FoxC, and the endodermally expressed member of the cluster, FoxQ1. We identified FoxQ1 transcripts in the pharyngeal endoderm, while the two FoxC genes are differentially expressed in the pharyngeal mesenchyme and ectoderm. Furthermore we identify conserved expression of lamprey FoxC genes in the paraxial and intermediate mesoderms. We interpret our results through a chordate-wide comparison of expression patterns and discuss gene content in the context of theories on the evolution of the vertebrate genome. 2011 Elsevier B.V. All rights reserved.
Multiscale Embedded Gene Co-expression Network Analysis

PubMed Central

Song, Won-Min; Zhang, Bin

2015-01-01

Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778
Multiscale Embedded Gene Co-expression Network Analysis.

PubMed

Song, Won-Min; Zhang, Bin

2015-11-01

Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
Analysis of genetic association using hierarchical clustering and cluster validation indices.

PubMed

Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

2017-10-01

It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
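As a rough illustration of how hierarchical clustering can generate candidate groups of co-expressed genes, the sketch below performs average-linkage agglomeration with a 1 - Pearson correlation distance. It is only one of many possible variants and does not implement the cluster validation indices the abstract refers to; the function names, the linkage choice and the toy profiles are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    profiles maps a gene name to its expression values; the distance
    between clusters is the mean of 1 - Pearson r over all cross pairs.
    """
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1.0 - pearson(profiles[a], profiles[b])
                   for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical expression values for four genes across five experiments.
toy = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 11],
    "g3": [5, 4, 3, 2, 1],
    "g4": [9, 7, 6, 3, 1],
}
groups = average_linkage(toy, n_clusters=2)
print(groups)
```

Each intermediate merge in such a procedure yields a candidate gene group, which is where a validation index could be applied to rank the groups.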
Expression patterns of WRKY genes in di-haploid Populus simonii × P. nigra in response to salinity stress revealed by quantitative real-time PCR and RNA sequencing.

PubMed

Wang, Shengji; Wang, Jiying; Yao, Wenjing; Zhou, Boru; Li, Renhua; Jiang, Tingbo

2014-10-01

Spatio-temporal expression patterns of 13 out of 119 poplar WRKY genes indicated dynamic and tissue-specific roles of WRKY family proteins in salinity stress tolerance. To understand the expression patterns of poplar WRKY genes under salinity stress, 51 of the 119 WRKY genes were selected from di-haploid Populus simonii × P. nigra by quantitative real-time PCR (qRT-PCR). We used qRT-PCR to profile the expression of the top 13 genes under salinity stress across seven time points, and employed RNA-Seq platforms to cross-validate it. Results demonstrated that all the 13 WRKY genes were expressed in root, stem, and leaf tissues, but their expression levels and overall patterns varied notably in these tissues. Regarding overall gene expression in roots, the 13 genes were significantly highly expressed at all six time points after the treatment, reaching the plateau of expression at hour 9. In leaves, the 13 genes were similarly up-regulated from 3 to 12 h in response to NaCl treatment. In stems, however, expression levels of the 13 genes did not show significant changes after the NaCl treatment. Regarding individual gene expression across the time points and the three tissues, the 13 genes can be classified into three clusters: the lowly expressed Cluster 1 containing PthWRKY28, 45 and 105; the intermediately expressed Cluster 2 including PthWRKY56, 88 and 116; and the highly expressed Cluster 3 consisting of PthWRKY41, 44, 51, 61, 62, 75 and 106. In general, genes in Clusters 2 and 3 displayed a dynamic pattern of "induced amplification-recovering", suggesting that these WRKY genes and corresponding pathways may play a critical role in mediating salt response and tolerance in a dynamic and tissue-specific manner.
Horizontal transfer of a large and highly toxic secondary metabolic gene cluster between fungi.

PubMed

Slot, Jason C; Rokas, Antonis

2011-01-25

Genes involved in intermediary and secondary metabolism in fungi are frequently physically linked or clustered. For example, in Aspergillus nidulans the entire pathway for the production of sterigmatocystin (ST), a highly toxic secondary metabolite and a precursor to the aflatoxins (AF), is located in a ∼54 kb, 23 gene cluster. We discovered that a complete ST gene cluster in Podospora anserina was horizontally transferred from Aspergillus. Phylogenetic analysis shows that most Podospora cluster genes are adjacent to or nested within Aspergillus cluster genes, although the two genera belong to different taxonomic classes. Furthermore, the Podospora cluster is highly conserved in content, sequence, and microsynteny with the Aspergillus ST/AF clusters and its intergenic regions contain 14 putative binding sites for AflR, the transcription factor required for activation of the ST/AF biosynthetic genes. Examination of ∼52,000 Podospora expressed sequence tags identified transcripts for 14 genes in the cluster, with several expressed at multiple life cycle stages. The presence of putative AflR-binding sites and the expression evidence for several cluster genes, coupled with the recent independent discovery of ST production in Podospora [1], suggest that this HGT event probably resulted in a functional cluster. Given the abundance of metabolic gene clusters in fungi, our finding that one of the largest known metabolic gene clusters moved intact between species suggests that such transfers might have significantly contributed to fungal metabolic diversity. Copyright © 2011 Elsevier Ltd. All rights reserved.
A Stationary Wavelet Entropy-Based Clustering Approach Accurately Predicts Gene Expression

PubMed Central

Nguyen, Nha; Vo, An; Choi, Inchan

2015-01-01

Studying epigenetic landscapes is important to understand the condition for gene regulation. Clustering is a useful approach to study epigenetic landscapes by grouping genes based on their epigenetic conditions. However, classical clustering approaches that often use a representative value of the signals in a fixed-sized window do not fully use the information written in the epigenetic landscapes. Clustering approaches to maximize the information of the epigenetic signals are necessary for better understanding gene regulatory environments. For effective clustering of multidimensional epigenetic signals, we developed a method called Dewer, which uses the entropy of stationary wavelet of epigenetic signals inside enriched regions for gene clustering. Interestingly, the gene expression levels were highly correlated with the entropy levels of epigenetic signals. Dewer separates genes better than a window-based approach in the assessment using gene expression and achieved a correlation coefficient above 0.9 without using any training procedure. Our results show that the changes of the epigenetic signals are useful to study gene regulation. PMID:25383910
Cancer Detection in Microarray Data Using a Modified Cat Swarm Optimization Clustering Approach

PubMed

M, Pandi; R, Balamurugan; N, Sadhasivam

2017-12-29

Objective: A better understanding of functional genomics can be obtained by extracting patterns hidden in gene expression data. This could have paramount implications for cancer diagnosis, gene treatments and other domains. Clustering may reveal natural structures and identify interesting patterns in underlying data. The main objective of this research was to derive a heuristic approach to the detection of highly co-expressed genes related to cancer from gene expression data with minimum Mean Squared Error (MSE). Methods: A modified Cat Swarm Optimization (CSO) algorithm using Harmony Search (MCSO-HS) was applied for clustering cancer gene expression data. Experimental results were analyzed using two cancer gene expression benchmark datasets, namely for leukaemia and for breast cancer. Result: In terms of MSE, MCSO-HS performed better than HS and CSO by 13% and 9%, respectively, on the leukaemia dataset, and by 22% and 17%, respectively, on the breast cancer dataset. Conclusion: The results showed MCSO-HS to outperform HS and CSO with both benchmark datasets. To validate the clustering results, this work was tested with internal and external cluster validation indices. This work also points to biological validation of clusters with gene ontology in terms of function, process and component. Creative Commons Attribution License
Heterologous expression of pikromycin biosynthetic gene cluster using Streptomyces artificial chromosome system.

PubMed

Pyeon, Hye-Rim; Nah, Hee-Ju; Kang, Seung-Hoon; Choi, Si-Sun; Kim, Eung-Soo

2017-05-31

Heterologous expression of biosynthetic gene clusters of natural microbial products has become an essential strategy for titer improvement and pathway engineering of various potentially-valuable natural products. A Streptomyces artificial chromosomal conjugation vector, pSBAC, was previously successfully applied for precise cloning and tandem integration of a large polyketide tautomycetin (TMC) biosynthetic gene cluster (Nah et al. in Microb Cell Fact 14(1):1, 2015), implying that this strategy could be employed to develop a custom overexpression scheme of natural product pathway clusters present in actinomycetes. To validate the pSBAC system as a generally-applicable heterologous overexpression system for a large-sized polyketide biosynthetic gene cluster in Streptomyces, another model polyketide compound, the pikromycin biosynthetic gene cluster, was precisely cloned and heterologously expressed using the pSBAC system. A unique HindIII restriction site was precisely inserted at one of the border regions of the pikromycin biosynthetic gene cluster within the chromosome of Streptomyces venezuelae, followed by site-specific recombination of pSBAC into the flanking region of the pikromycin gene cluster. Unlike the previous cloning process, one HindIII site integration step was skipped through pSBAC modification. pPik001, a pSBAC containing the pikromycin biosynthetic gene cluster, was directly introduced into two heterologous hosts, Streptomyces lividans and Streptomyces coelicolor, resulting in the production of 10-deoxymethynolide, a major pikromycin derivative. When two entire pikromycin biosynthetic gene clusters were tandemly introduced into the S. lividans chromosome, overproduction of 10-deoxymethynolide and the presence of pikromycin, which was previously not detected, were both confirmed. Moreover, comparative qRT-PCR results confirmed that the transcription of pikromycin biosynthetic genes was significantly upregulated in S. lividans containing tandem copies of the pikromycin biosynthetic gene cluster. The 60 kb pikromycin biosynthetic gene cluster was isolated in a single integration pSBAC vector. Introduction of the pikromycin biosynthetic gene cluster into the pikromycin non-producing strains resulted in higher pikromycin production. The utility of the pSBAC system as a precise cloning tool for large-sized biosynthetic gene clusters was verified through heterologous expression of the pikromycin biosynthetic gene cluster. Moreover, this pSBAC-driven heterologous expression strategy was confirmed to be an ideal approach for production of low and inconsistent natural products such as pikromycin in S. venezuelae, implying that this strategy could be employed for development of a custom overexpression scheme of natural product biosynthetic gene clusters in actinomycetes.
Function Clustering Self-Organization Maps (FCSOMs) for mining differentially expressed genes in Drosophila and its correlation with the growth medium.

PubMed

Liu, L L; Liu, M J; Ma, M

2015-09-28

The central task of this study was to mine the gene-to-medium relationship. Adequate knowledge of this relationship could potentially improve the accuracy of differentially expressed gene mining. One of the approaches to differentially expressed gene mining uses conventional clustering algorithms to identify the gene-to-medium relationship. Compared to conventional clustering algorithms, self-organization maps (SOMs) identify the nonlinear aspects of the gene-to-medium relationships by mapping the input space into another higher dimensional feature space. However, SOMs are not suitable for huge datasets consisting of millions of samples. Therefore, a new computational model, the Function Clustering Self-Organization Maps (FCSOMs), was developed. FCSOMs take advantage of the theory of granular computing as well as advanced statistical learning methodologies, and are built specifically for each information granule (a function cluster of genes), which are intelligently partitioned by the clustering algorithm provided by the DAVID_6.7 software platform. However, only the gene functions, and not their expression values, are considered in the fuzzy clustering algorithm of DAVID. Compared to the clustering algorithm of DAVID, these experimental results show a marked improvement in the accuracy of classification with the application of FCSOMs. FCSOMs can handle huge datasets and their complex classification problems, as each FCSOM (modeled for each function cluster) can be easily parallelized.
GEsture: an online hand-drawing tool for gene expression pattern search.

PubMed

Wang, Chunyan; Xu, Yiqing; Wang, Xuelin; Zhang, Li; Wei, Suyun; Ye, Qiaolin; Zhu, Youxiang; Yin, Hengfu; Nainwal, Manoj; Tanon-Reyes, Luis; Cheng, Feng; Yin, Tongming; Ye, Ning

2018-01-01

Gene expression profiling data provide useful information for the investigation of biological function and process. However, identifying a specific expression pattern from extensive time series gene expression data is not an easy task. Clustering is a popular method for grouping genes with similar expression; however, genes with a 'desirable' or 'user-defined' pattern cannot be efficiently detected by clustering methods. To address these limitations, we developed an online tool called GEsture. Users can draw or graph a curve with a mouse instead of inputting abstract parameters of clustering methods. Given a gene expression curve as input, GEsture searches time series datasets for genes showing similar, opposite, and time-delayed expression patterns. We present three examples that illustrate the capacity of GEsture in gene hunting while following users' requirements. GEsture also provides visualization tools (such as an expression pattern figure, heat map, and correlation network) to display the search results. The output may provide useful information for researchers to understand the targets, function, and biological processes of the involved genes.
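
The abstract does not say how GEsture scores a match, so the following is only a minimal sketch of the general idea, assuming Pearson correlation as the similarity measure. The function names, the threshold of 0.9, the lag limit, and the toy profiles are all hypothetical choices for illustration, not part of the tool.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def classify_match(query, profile, max_lag=2, threshold=0.9):
    """Label a gene profile relative to a drawn query curve (hypothetical rule)."""
    r = pearson(query, profile)
    if r >= threshold:
        return "similar"
    if r <= -threshold:
        return "opposite"
    # Time delay: compare the query with the profile shifted back by `lag` points.
    for lag in range(1, max_lag + 1):
        if pearson(query[:-lag], profile[lag:]) >= threshold:
            return f"delayed by {lag}"
    return "no match"

# Toy data: a drawn curve and four made-up expression profiles.
query = [0, 1, 3, 6, 3, 1, 0, 0]
genes = {
    "geneA": [0, 2, 6, 12, 6, 2, 0, 0],   # same shape, larger amplitude
    "geneB": [6, 5, 3, 0, 3, 5, 6, 6],    # mirrored shape
    "geneC": [0, 0, 1, 3, 6, 3, 1, 0],    # same shape, one time point later
    "geneD": [1, 5, 2, 4, 1, 6, 2, 3],    # unrelated
}
results = {name: classify_match(query, prof) for name, prof in genes.items()}
print(results)
```

Correlation ignores amplitude, which is why geneA counts as similar despite its doubled scale; a distance-based rule would behave differently.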

Utility and Limitations of Using Gene Expression Data to Identify Functional Associations

PubMed Central

Peng, Cheng; Shiu, Shin-Han

2016-01-01

Gene co-expression has been widely used to hypothesize gene function through guilt-by-association. However, it is not clear to what degree co-expression is informative, whether it can be applied to genes involved in different biological processes, and how the type of dataset impacts inferences about gene functions. Here our goal is to assess the utility and limitations of using co-expression as a criterion to recover functional associations between genes. Using Arabidopsis thaliana as an example and determining the percentage of gene pairs in a metabolic pathway with significant expression correlation, we found that many genes in the same pathway do not have similar transcript profiles and that the choice of dataset, annotation quality, gene function, expression similarity measure, and clustering approach significantly impact the ability to recover functional associations between genes. Some datasets are more informative in capturing coordinated expression profiles and larger data sets are not always better. In addition, to recover the maximum number of known pathways and identify candidate genes with similar functions, it is important to explore rather exhaustively multiple dataset combinations, similarity measures, clustering algorithms and parameters. Finally, we validated the biological relevance of co-expression cluster memberships with an independent phenomics dataset and found that genes that consistently cluster with leucine degradation genes tend to have similar leucine levels in mutants. This study provides a framework for obtaining gene functional associations by maximizing the information that can be obtained from gene expression datasets. PMID:27935950
Scoring clustering solutions by their biological relevance.

PubMed

Gat-Viks, I; Sharan, R; Shamir, R

2003-12-12

A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.
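
To make the between-group versus within-group variance ratio concrete, here is a minimal sketch for a single attribute that is already one-dimensional. It uses the ordinary one-way ANOVA F ratio as a stand-in. The authors' method additionally chooses the projection and applies a non-parametric test, neither of which is reproduced here. The function name and the toy values are hypothetical.

```python
def variance_ratio(values, labels):
    """Between-group over within-group variance (one-way ANOVA F ratio)."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Spread of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in groups.values()
    )
    # Spread of each value around its own group mean.
    ss_within = sum(
        (v - sum(vs) / len(vs)) ** 2 for vs in groups.values() for v in vs
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy data: one biological attribute for six genes, scored under two clusterings.
attribute = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
agreeing_solution = [0, 0, 0, 1, 1, 1]     # clusters follow the attribute
conflicting_solution = [0, 1, 0, 1, 0, 1]  # clusters cut across the attribute

score_agreeing = variance_ratio(attribute, agreeing_solution)
score_conflicting = variance_ratio(attribute, conflicting_solution)
print(score_agreeing, score_conflicting)
```

A clustering whose groups line up with the attribute gets a much larger ratio than one that mixes attribute values across groups, which is the property a biological-relevance score relies on.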
Concerted Changes in Gene Expression and Cell Physiology of the Cyanobacterium Synechocystis sp. Strain PCC 6803 during Transitions between Nitrogen and Light-Limited Growth

PubMed Central

Aguirre von Wobeser, Eneas; Ibelings, Bas W.; Bok, Jasper; Krasikov, Vladimir; Huisman, Jef; Matthijs, Hans C.P.

2011-01-01

Physiological adaptation and genome-wide expression profiles of the cyanobacterium Synechocystis sp. strain PCC 6803 in response to gradual transitions between nitrogen-limited and light-limited growth conditions were measured in continuous cultures. Transitions induced changes in pigment composition, light absorption coefficient, photosynthetic electron transport, and specific growth rate. Physiological changes were accompanied by reproducible changes in the expression of several hundred open reading frames, genes with functions in photosynthesis and respiration, carbon and nitrogen assimilation, protein synthesis, phosphorus metabolism, and overall regulation of cell function and proliferation. Cluster analysis of the nearly 1,600 regulated open reading frames identified eight clusters, each showing a different temporal response during the transitions. Two large clusters mirrored each other. One cluster included genes involved in photosynthesis, which were up-regulated during light-limited growth but down-regulated during nitrogen-limited growth. Conversely, genes in the other cluster were down-regulated during light-limited growth but up-regulated during nitrogen-limited growth; this cluster included several genes involved in nitrogen uptake and assimilation. These results demonstrate complementary regulation of gene expression for two major metabolic activities of cyanobacteria. Comparison with batch-culture experiments revealed interesting differences in gene expression between batch and continuous culture and illustrates that continuous-culture experiments can pick up subtle changes in cell physiology and gene expression. PMID:21205618
The human TREM gene cluster at 6p21.1 encodes both activating and inhibitory single IgV domain receptors and includes NKp44.

PubMed

Allcock, Richard J N; Barrow, Alexander D; Forbes, Simon; Beck, Stephan; Trowsdale, John

2003-02-01

We have characterized a cluster of single immunoglobulin variable (IgV) domain receptors centromeric of the major histocompatibility complex (MHC) on human chromosome 6. In addition to triggering receptor expressed on myeloid cells (TREM)-1 and TREM2, the cluster contains NKp44, a triggering receptor whose expression is limited to NK cells. We identified three new related genes and two gene fragments within a cluster of approximately 200 kb. Two of the three new genes lack charged residues in their transmembrane domain tails. Further, one of the genes contains two potential immunotyrosine inhibitory motifs in its cytoplasmic tail, suggesting that it delivers inhibitory signals. The human and mouse TREM clusters appear to have diverged such that there are unique sequences in each species. Finally, each gene in the TREM cluster was expressed in a different range of cell types.
Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features.

PubMed

Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug

2011-11-01

Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis, and by comparison with previously published gene lists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of the clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts, and it identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.
A multi-Poisson dynamic mixture model to cluster developmental patterns of gene expression by RNA-seq.

PubMed

Ye, Meixia; Wang, Zhong; Wang, Yaqun; Wu, Rongling

2015-03-01

Dynamic changes of gene expression reflect an intrinsic mechanism of how an organism responds to developmental and environmental signals. With the increasing availability of expression data across a time-space scale by RNA-seq, the classification of genes as per their biological function using RNA-seq data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of the multivariate Poisson distribution, the model accommodates the discrete property of read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by a first-order autoregressive process. The model is implemented with the Expectation-Maximization algorithm and model selection to determine the optimal number of gene clusters and obtain the estimates of Poisson parameters that describe the pattern of time-dependent expression of genes from each cluster. The model has been demonstrated by analyzing real data from an experiment aimed at linking the pattern of gene expression to catkin development in white poplar. The usefulness of the model has been validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating our global view of expression dynamics and understanding of gene regulation mechanisms. © The Author 2014. Published by Oxford University Press.
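
The Expectation-Maximization step for a Poisson mixture can be illustrated with a much simpler model than the one in the abstract. The sketch below assumes independent Poisson counts at each time point and a fixed number of clusters. It omits the multivariate Poisson density, the first-order autoregressive dependence, and the model selection step. The function names, starting values, and toy counts are hypothetical.

```python
from math import exp, lgamma, log

def poisson_logpmf(count, rate):
    """Log probability of an integer count under a Poisson with the given rate."""
    return count * log(rate) - rate - lgamma(count + 1)

def em_poisson_mixture(counts, rates, weights, n_iter=50):
    """EM for a mixture of independent Poissons (simplifying assumption).

    counts:  per-gene read-count vectors, one count per time point
    rates:   initial per-cluster rate vectors, one rate per time point
    weights: initial mixing proportions, one per cluster
    """
    rates = [list(r) for r in rates]
    weights = list(weights)
    k, n_time = len(rates), len(rates[0])
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities per gene, computed in log space.
        resp = []
        for y in counts:
            logp = [
                log(weights[c])
                + sum(poisson_logpmf(v, r) for v, r in zip(y, rates[c]))
                for c in range(k)
            ]
            top = max(logp)
            p = [exp(lp - top) for lp in logp]
            total = sum(p)
            resp.append([pi / total for pi in p])
        # M-step: responsibility-weighted proportions and mean counts.
        for c in range(k):
            mass = max(sum(r[c] for r in resp), 1e-12)
            weights[c] = max(mass / len(counts), 1e-12)
            rates[c] = [
                max(sum(r[c] * y[t] for r, y in zip(resp, counts)) / mass, 1e-9)
                for t in range(n_time)
            ]
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return rates, weights, labels

# Toy data: three genes rising over time and three genes falling over time.
counts = [[2, 10, 30], [3, 12, 28], [1, 9, 33],
          [30, 11, 2], [28, 9, 3], [31, 12, 1]]
rates, weights, labels = em_poisson_mixture(
    counts, rates=[[5, 10, 20], [20, 10, 5]], weights=[0.5, 0.5]
)
print(labels)
```

Working with raw counts and a Poisson likelihood avoids the log transformation that Gaussian mixture clustering of RNA-seq data usually requires.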
Statistical indicators of collective behavior and functional clusters in gene networks of yeast

NASA Astrophysics Data System (ADS)

Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

2006-03-01

We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
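
The formation of a giant component in a correlation network can be sketched by linking genes whose correlation exceeds a threshold and tracking the largest connected component as the threshold is lowered. The example below is an illustration only: the use of absolute correlation, the union-find implementation, and the toy matrix are assumptions, not details taken from the paper.

```python
def giant_component_size(corr, threshold):
    """Largest connected component when genes i and j are linked
    if |corr[i][j]| >= threshold (assumed linking rule)."""
    n = len(corr)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i][j]) >= threshold:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())

# Toy symmetric correlation matrix for five genes.
corr = [
    [1.0, 0.9, 0.8, 0.3, 0.1],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.5, 0.2],
    [0.3, 0.2, 0.5, 1.0, 0.6],
    [0.1, 0.1, 0.2, 0.6, 1.0],
]
thresholds = (0.95, 0.85, 0.75, 0.55, 0.45)
sizes = [giant_component_size(corr, t) for t in thresholds]
print(sizes)
```

On real data, the threshold at which the largest component suddenly spans most of the network plays the role of the percolation point.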
Identification of an intact ParaHox cluster with temporal colinearity but altered spatial colinearity in the hemichordate Ptychodera flava

PubMed Central

2013-01-01

Background ParaHox and Hox genes are thought to have evolved from a common ancestral ProtoHox cluster or from tandem duplication prior to the divergence of cnidarians and bilaterians. Similar to Hox clusters, chordate ParaHox genes including Gsx, Xlox, and Cdx, are clustered and their expression exhibits temporal and spatial colinearity. In non-chordate animals, however, studies on the genomic organization of ParaHox genes are limited to only a few animal taxa. Hemichordates, such as the Enteropneust acorn worms, have been used to gain insights into the origins of chordate characters. In this study, we investigated the genomic organization and expression of ParaHox genes in the indirect developing hemichordate acorn worm Ptychodera flava. Results We found that P. flava contains an intact ParaHox cluster with a similar arrangement to that of chordates. The temporal expression order of the P. flava ParaHox genes is the same as that of the chordate ParaHox genes. During embryogenesis, the spatial expression pattern of PfCdx in the posterior endoderm represents a conserved feature similar to the expression of its orthologs in other animals. On the other hand, PfXlox and PfGsx show a novel expression pattern in the blastopore. Nevertheless, during metamorphosis, PfXlox and PfCdx are expressed in the endoderm in a spatially staggered pattern similar to the situation in chordates. Conclusions Our study shows that P. flava ParaHox genes, despite forming an intact cluster, exhibit temporal colinearity but lose spatial colinearity during embryogenesis. During metamorphosis, partial spatial colinearity is retained in the transforming larva. These results strongly suggest that intact ParaHox gene clustering was retained in the deuterostome ancestor and is correlated with temporal colinearity. PMID:23802544
A tripartite clustering analysis on microRNA, gene and disease model.

PubMed

Shen, Chengcheng; Liu, Ying

2012-02-01

Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expressions of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. Tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members and pass the hidden relational information along the relation chain by considering multi-type members. Here we report a spectral co-clustering algorithm for k-partite graph to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings.
Gene structure and expression characteristic of a novel odorant receptor gene cluster in the parasitoid wasp Microplitis mediator (Hymenoptera: Braconidae).

PubMed

Wang, S-N; Shan, S; Zheng, Y; Peng, Y; Lu, Z-Y; Yang, Y-Q; Li, R-J; Zhang, Y-J; Guo, Y-Y

2017-08-01

Odorant receptors (ORs) expressed in the antennae of parasitoid wasps are responsible for detection of various lipophilic airborne molecules. In the present study, 107 novel OR genes were identified from Microplitis mediator antennal transcriptome data. Phylogenetic analysis of the set of OR genes from M. mediator and Microplitis demolitor revealed that M. mediator OR (MmedOR) genes can be classified into different subfamilies, and the majority of MmedORs in each subfamily shared high sequence identities and clear orthologous relationships to M. demolitor ORs. Within a subfamily, six MmedOR genes, MmedOR98, 124, 125, 126, 131 and 155, shared a similar gene structure and were tightly linked in the genome. To evaluate whether the clustered MmedOR genes share common regulatory features, the transcription profile and expression characteristics of the six closely related OR genes were investigated in M. mediator. Rapid amplification of cDNA ends-PCR experiments revealed that the OR genes within the cluster were transcribed as single mRNAs, and a bicistronic mRNA for two adjacent genes (MmedOR124 and MmedOR98) was also detected in female antennae by reverse transcription PCR. In situ hybridization experiments indicated that each OR gene within the cluster was expressed in a different number of cells. Moreover, there was no co-expression of the two highly related OR genes, MmedOR124 and MmedOR98, which appeared to be individually expressed in a distinct population of neurons. Overall, there were distinct expression profiles of closely related MmedOR genes from the same cluster in M. mediator. These data provide a basic understanding of the olfactory coding in parasitoid wasps. © 2017 The Royal Entomological Society.
Analyzing gene expression time-courses based on multi-resolution shape mixture model.

PubMed

Li, Ying; He, Ye; Zhang, Yu

2016-11-01

Biological processes are dynamic molecular processes that unfold over time. Time-course gene expression experiments provide opportunities to explore patterns of gene expression change over time and to understand the dynamic behavior of gene expression, which is crucial for studying the development and progression of biology and disease. Analysis of gene expression time-course profiles has not been fully exploited so far and remains a challenging problem. We propose a novel shape-based mixture model clustering method for gene expression time-course profiles to explore the significant gene groups. Based on multi-resolution fractal features and a mixture clustering model, we propose a multi-resolution shape mixture model algorithm. The multi-resolution fractal features are computed by wavelet decomposition, which explores patterns of change in gene expression over time at different resolutions. The proposed algorithm is a probabilistic framework that offers a more natural and robust way of clustering time-course gene expression. We assessed its performance using yeast time-course gene expression profiles and compared it with several popular clustering methods for gene expression profiles. The gene groups identified by the different methods were evaluated by enrichment analysis of biological pathways and of known protein-protein interactions supported by experimental evidence. The gene groups identified by the proposed algorithm have stronger biological significance. In summary, a novel multi-resolution shape mixture model algorithm based on multi-resolution fractal features is proposed; it provides a new perspective and an alternative tool for visualization and analysis of time-course gene expression profiles. The R and Matlab programs are available upon request. Copyright © 2016 Elsevier Inc. All rights reserved.
A cross-species bi-clustering approach to identifying conserved co-regulated genes.

PubMed

Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

2016-06-15

A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective in dealing with the noise in the expression data introduced by the complicated procedures used to quantify gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes, which are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector, where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages in which the clustered genes are co-regulated. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on synthetic data and compared to the two-step method and several recent joint clustering methods. We then applied this approach to two real world datasets of gene expression during the pre-implantation embryonic development of the human and mouse. Co-regulated genes consistent between the human and mouse were identified, offering insights into conserved functions, as well as similarities and differences in genome activation timing between the human and mouse embryos. The R package containing the implementation of the proposed method in C++ is available at: https://github.com/JavonSun/mvbc.git and also at the R platform https://www.r-project.org/. Contact: jinbo@engr.uconn.edu. © The Author 2016. Published by Oxford University Press.
Analysis of genetic association in Listeria and Diabetes using Hierarchical Clustering and Silhouette Index

NASA Astrophysics Data System (ADS)

Pagnuco, Inti A.; Pastore, Juan I.; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L.

2016-04-01

It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, in which significant groups of genes are defined based on some criteria. This task is usually performed by clustering algorithms, where the whole family of genes, or a subset of them, is clustered into meaningful groups based on their expression values in a set of experiments. In this work we used a methodology based on the Silhouette index as a measure of cluster quality for individual gene groups, and a combination of several variants of hierarchical clustering to generate the candidate groups, to obtain sets of co-expressed genes for two real data examples. We analyzed the quality of the best-ranked groups obtained by the algorithm using an online bioinformatics tool that provides network information for the selected genes. Moreover, to verify the performance of the algorithm, considering the fact that it doesn’t find all possible subsets, we compared its results against a full search to determine the number of good co-regulated sets not detected.
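
Using the Silhouette index to rank individual gene groups, rather than a whole clustering, can be sketched as follows. The example computes the mean silhouette width per group with Euclidean distance. The distance measure, the function names, and the toy profiles are assumptions for illustration, and the hierarchical clustering step that would generate the candidate groups is not shown.

```python
from math import sqrt

def euclidean(p, q):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def group_silhouettes(points, labels):
    """Mean silhouette width for each group (needs at least two groups)."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    scores = {}
    for lab, members in groups.items():
        widths = []
        for i in members:
            if len(members) == 1:
                widths.append(0.0)  # convention for singleton groups
                continue
            # a: mean distance to the other members of the same group.
            a = sum(euclidean(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            # b: smallest mean distance to the members of any other group.
            b = min(
                sum(euclidean(points[i], points[j]) for j in other) / len(other)
                for other_lab, other in groups.items() if other_lab != lab
            )
            widths.append((b - a) / max(a, b))
        scores[lab] = sum(widths) / len(widths)
    return scores

# Toy profiles: one compact group and one dispersed group.
points = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
          (5.0, 5.0), (5.0, 8.0), (8.0, 5.0)]
labels = ["tight", "tight", "tight", "loose", "loose", "loose"]
scores = group_silhouettes(points, labels)
print(scores)
```

Scoring each group separately lets compact, well-separated groups be ranked highly even when the rest of the partition is poor.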
Gene Expression Profiles of Chlamydophila pneumoniae during the Developmental Cycle and Iron Depletion–Mediated Persistence

PubMed Central

Mäurer, André P; Mehlitz, Adrian; Mollenkopf, Hans J; Meyer, Thomas F

2007-01-01

The obligate intracellular, gram-negative bacterium Chlamydophila pneumoniae (Cpn) has impact as a human pathogen. Little is known about changes in the Cpn transcriptome during its biphasic developmental cycle (the acute infection) and persistence. The latter stage has been linked to chronic diseases. To analyze Cpn CWL029 gene expression, we designed a pathogen-specific oligo microarray and optimized the extraction method for pathogen RNA. Throughout the acute infection, ratio expression profiles for each gene were generated using 48 h post infection as a reference. Based on these profiles, significantly expressed genes were separated into 12 expression clusters using self-organizing map clustering and manual sorting into the “early”, “mid”, “late”, and “tardy” cluster classes. The latter two were differentiated because the “tardy” class showed steadily increasing expression at the end of the cycle. The transcriptome of the Cpn elementary body (EB) and published EB proteomics data were compared to the cluster profile of the acute infection. We found an intriguing association between “late” genes and genes coding for EB proteins, whereas “tardy” genes were mainly associated with genes coding for EB mRNA. It has been published that iron depletion leads to Cpn persistence. We compared the gene expression profiles during iron depletion–mediated persistence with the expression clusters of the acute infection. This led to the finding that establishment of iron depletion–mediated persistence is more likely a mid-cycle arrest in development rather than a completely distinct gene expression pattern. Here, we describe the Cpn transcriptome during the acute infection, differentiating “late” genes, which correlate to EB proteins, and “tardy” genes, which lead to EB mRNA. Expression profiles during iron depletion–mediated persistence led us to propose the hypothesis that the transcriptomic “clock” is arrested during acute mid-cycle. PMID:17590080
Spatial expression of Hox cluster genes in the ontogeny of a sea urchin

NASA Technical Reports Server (NTRS)

Arenas-Mena, C.; Cameron, A. R.; Davidson, E. H.

2000-01-01

The Hox cluster of the sea urchin Strongylocentrotus purpuratus contains ten genes in a 500 kb span of the genome. Only two of these genes are expressed during embryogenesis, while all eight genes tested are expressed during development of the adult body plan in the larval stage. We report the spatial expression during larval development of the five 'posterior' genes of the cluster: SpHox7, SpHox8, SpHox9/10, SpHox11/13a and SpHox11/13b. The five genes exhibit a dynamic, largely mesodermal program of expression. Only SpHox7 displays extensive expression within the pentameral rudiment itself. A spatially sequential and colinear arrangement of expression domains is found in the somatocoels, the paired posterior mesodermal structures that will become the adult perivisceral coeloms. No such sequential expression pattern is observed in endodermal, epidermal or neural tissues of either the larva or the presumptive juvenile sea urchin. The spatial expression patterns of the Hox genes illuminate the evolutionary process by which the pentameral echinoderm body plan emerged from a bilateral ancestor.
Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values.

PubMed

Bhattacharya, Anindya; De, Rajat K

2010-08-01

Distance-based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have a similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. But this algorithm may also fail for certain cases. In order to overcome these situations, we propose a new clustering algorithm, called average correlation clustering algorithm (ACCA), which is able to produce a better clustering solution than that produced by some others. ACCA is able to find groups of genes having more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to the time of execution. Like DCCA, ACCA uses the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that all genes in a cluster have the highest average correlation values with the genes in that cluster. We have applied ACCA and some well-known conventional methods including DCCA to two artificial and nine gene expression datasets, and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some others in determining a group of genes having more common transcription factors and a similar pattern of variation in their expression profiles. Availability of the software: The software has been developed using C and Visual Basic languages, and can be executed on Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/~rajat. Then it needs to be installed. Two Word files (included in the zip file) need to be consulted before installation and execution of the software. Copyright 2010 Elsevier Inc. All rights reserved.
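The central idea in the abstract above, that each gene should sit in the cluster with which it has the highest average correlation, can be illustrated with a small sketch. This is not the authors' ACCA implementation: the function names, the iterative reassignment loop, the tie-breaking rule, and the toy profiles are all assumptions made for illustration.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    den = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    return sum(a * b for a, b in zip(dx, dy)) / den if den else 0.0

def reassign_by_average_correlation(profiles, labels, max_iter=20):
    """Hypothetical helper: move each gene to the cluster whose other members
    it correlates with most strongly on average; stop when nothing changes."""
    n = len(profiles)
    corr = [[pearson(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    labels = list(labels)
    for _ in range(max_iter):
        changed = False
        for i in range(n):
            best_label, best_score = labels[i], float("-inf")
            for c in sorted(set(labels)):
                members = [j for j in range(n) if labels[j] == c and j != i]
                if not members:
                    continue  # a cluster holding only gene i gives no evidence
                score = mean([corr[i][j] for j in members])
                if score > best_score:
                    best_label, best_score = c, score
            if best_label != labels[i]:
                labels[i] = best_label
                changed = True
        if not changed:
            break
    return labels

# Three rising and three falling profiles at very different magnitudes.
profiles = [
    [1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400],
    [4, 3, 2, 1], [40, 30, 20, 10], [400, 300, 200, 100],
]
initial = [0, 0, 1, 1, 1, 0]  # two genes start in the wrong cluster
print(reassign_by_average_correlation(profiles, initial))  # [0, 0, 0, 1, 1, 1]
```

The toy data differ by orders of magnitude within each group, so a distance-based method would tend to split them by scale, whereas the correlation-based reassignment groups them by pattern of variation.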
Noninvasive analysis of the sputum transcriptome discriminates clinical phenotypes of asthma.

PubMed

Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D; Ober, Carole; Nicolae, Dan L; Barnes, Kathleen C; London, Stephanie J; Gilliland, Frank; Weiss, Scott T; Raby, Benjamin A; Cohn, Lauren; Chupp, Geoffrey L

2015-05-15

The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10^-6) and hospitalization (P = 0.01), respectively. There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma.
Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma

PubMed Central

Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F.; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D.; Ober, Carole; Nicolae, Dan L.; Barnes, Kathleen C.; London, Stephanie J.; Gilliland, Frank; Weiss, Scott T.; Raby, Benjamin A.; Cohn, Lauren

2015-01-01

Rationale: The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. Objectives: We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Methods: Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Measurements and Main Results: Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10^-6) and hospitalization (P = 0.01), respectively. Conclusions: There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma. PMID:25763605
Genome-wide identification of novel expression signatures reveal distinct patterns and prevalence of binding motifs for p53, nuclear factor-κB and other signal transcription factors in head and neck squamous cell carcinoma

PubMed Central

Yan, Bin; Yang, Xinping; Lee, Tin-Lap; Friedman, Jay; Tang, Jun; Van Waes, Carter; Chen, Zhong

2007-01-01

Background Differentially expressed gene profiles have previously been observed among pathologically defined cancers by microarray technologies, including head and neck squamous cell carcinomas (HNSCCs). However, the molecular expression signatures and transcriptional regulatory controls that underlie the heterogeneity in HNSCCs are not well defined. Results Genome-wide cDNA microarray profiling of ten HNSCC cell lines revealed novel gene expression signatures that distinguished cancer cell subsets associated with p53 status. Three major clusters of over-expressed genes (A to C) were defined through hierarchical clustering, Gene Ontology, and statistical modeling. The promoters of genes in these clusters exhibited different patterns and prevalence of transcription factor binding sites for p53, nuclear factor-κB (NF-κB), activator protein (AP)-1, signal transducer and activator of transcription (STAT)3 and early growth response (EGR)1, as compared with the frequency in vertebrate promoters. Cluster A genes involved in chromatin structure and function exhibited enrichment for p53 and decreased AP-1 binding sites, whereas clusters B and C, containing cytokine and antiapoptotic genes, exhibited a significant increase in prevalence of NF-κB binding sites. An increase in STAT3 and EGR1 binding sites was distributed among the over-expressed clusters. Novel regulatory modules containing p53 or NF-κB concomitant with other transcription factor binding motifs were identified, and experimental data supported the predicted transcriptional regulation and binding activity. Conclusion The transcription factors p53, NF-κB, and AP-1 may be important determinants of the heterogeneous pattern of gene expression, whereas STAT3 and EGR1 may broadly enhance gene expression in HNSCCs. 
Defining these novel gene signatures and regulatory mechanisms will be important for establishing new molecular classifications and subtyping, which in turn will promote development of targeted therapeutics for HNSCC. PMID:17498291
Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

PubMed Central

Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S.

2016-01-01

Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: Hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. 
Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain. PMID:27933038

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liebhaber, S.A.; Weiss, I.; Cash, F.E.

Synthesis of normal human hemoglobin A, α2β2, is based upon balanced expression of genes in the α-globin gene cluster on chromosome 16 and the β-globin gene cluster on chromosome 11. Full levels of erythroid-specific activation of the β-globin cluster depend on sequences located at a considerable distance 5′ to the β-globin gene, referred to as the locus-activating or dominant control region. The existence of an analogous element(s) upstream of the α-globin cluster has been suggested from observations on naturally occurring deletions and experimental studies. The authors have identified an individual with α-thalassemia in whom structurally normal α-globin genes have been inactivated in cis by a discrete de novo 35-kilobase deletion located ~30 kilobases 5′ from the α-globin gene cluster. They conclude that this deletion inactivates expression of the α-globin genes by removing one or more of the previously identified upstream regulatory sequences that are critical to expression of the α-globin genes.
Functional grouping of similar genes using eigenanalysis on minimum spanning tree based neighborhood graph.

PubMed

Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita

2016-04-01

Gene expression data clustering is an important biological process in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
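One plausible reading of the "rounds of MST" neighborhood graph mentioned above is sketched here: compute an MST, remove its edges, compute an MST of what remains, and take the union after a fixed number of rounds. The abstract does not spell out the construction, so this interpretation, the function names, and the use of Euclidean distance are assumptions. The eigenanalysis step is omitted.

```python
import math

def mst_edges(points, forbidden):
    """Prim's algorithm on the complete Euclidean graph, skipping forbidden edges."""
    n = len(points)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None  # (distance, edge, newly reached vertex)
        for u in in_tree:
            for v in range(n):
                if v in in_tree:
                    continue
                edge = (min(u, v), max(u, v))
                if edge in forbidden:
                    continue
                d = math.dist(points[u], points[v])
                if best is None or d < best[0]:
                    best = (d, edge, v)
        if best is None:  # the remaining graph is disconnected
            break
        edges.append(best[1])
        in_tree.add(best[2])
    return edges

def k_round_mst_graph(points, rounds):
    """Union of the edges from successive, edge-disjoint MST rounds."""
    used = set()
    for _ in range(rounds):
        used.update(mst_edges(points, used))
    return used

# Two well-separated groups of three points each.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
one_round = k_round_mst_graph(points, 1)
two_rounds = k_round_mst_graph(points, 2)
print(len(one_round), len(two_rounds))  # 5 10
```

Each additional round adds the next-best connections, so the graph becomes denser within tight groups while staying sparse overall; a similarity matrix built from such a graph could then be passed to a spectral method.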
Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering.

PubMed

Deveci, Mehmet; Küçüktunç, Onur; Eren, Kemal; Bozdağ, Doruk; Kaya, Kamer; Çatalyürek, Ümit V

2016-01-01

Rapid development and increasing popularity of gene expression microarrays have resulted in a number of studies on the discovery of co-regulated genes. One important way of discovering such co-regulations is the query-based search since gene co-expressions may indicate a shared role in a biological process. Although there exist promising query-driven search methods adapting clustering, they fail to capture many genes that function in the same biological pathway because microarray datasets are fraught with spurious samples or samples of diverse origin, or the pathways might be regulated under only a subset of samples. On the other hand, a class of clustering algorithms known as biclustering algorithms which simultaneously cluster both the items and their features are useful while analyzing gene expression data, or any data in which items are related in only a subset of their samples. This means that genes need not be related in all samples to be clustered together. Because many genes only interact under specific circumstances, biclustering may recover the relationships that traditional clustering algorithms can easily miss. In this chapter, we briefly summarize the literature using biclustering for querying co-regulated genes. Then we present a novel biclustering approach and evaluate its performance by a thorough experimental analysis.
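The point that genes need not be related in all samples to be clustered together can be shown with a toy calculation. The two profiles and the chosen sample subset below are made-up values for illustration; this is not the biclustering approach the chapter presents.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    den = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    return sum(a * b for a, b in zip(dx, dy)) / den if den else 0.0

# Hypothetical genes that co-vary only in the first four samples.
gene_a = [1, 2, 3, 4, 5, 1, 4, 2]
gene_b = [2, 4, 6, 8, 1, 5, 2, 4]
subset = [0, 1, 2, 3]

full_corr = pearson(gene_a, gene_b)
subset_corr = pearson([gene_a[i] for i in subset], [gene_b[i] for i in subset])
print(round(full_corr, 3), round(subset_corr, 3))
```

Across all eight samples the correlation is close to zero, so a conventional clustering method would likely separate the two genes; restricted to the subset, they are perfectly correlated, which is the kind of relationship a biclustering method aims to recover.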
Hierarchical Dirichlet process model for gene expression clustering

PubMed Central

2013-01-01

Clustering is an important data processing tool for interpreting microarray data and genomic network inference. In this article, we propose a clustering algorithm based on the hierarchical Dirichlet processes (HDP). The HDP clustering introduces a hierarchical structure in the statistical model which captures the hierarchical features prevalent in biological data such as the gene expression data. We develop a Gibbs sampling algorithm based on the Chinese restaurant metaphor for the HDP clustering. We apply the proposed HDP algorithm to both regulatory network segmentation and gene expression clustering. The HDP algorithm is shown to outperform several popular clustering algorithms by revealing the underlying hierarchical structure of the data. For the yeast cell cycle data, we compare the HDP result to the standard result and show that the HDP algorithm provides more information and reduces the unnecessary clustering fragments. PMID:23587447
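The Chinese restaurant metaphor referred to above can be sketched with the basic single-level seating rule, in which the number of clusters is not fixed in advance. This is only the prior over partitions, not the paper's HDP Gibbs sampler, which would also weigh the data likelihood and share clusters across groups. The function name, the concentration parameter `alpha`, and the seed are assumptions for illustration.

```python
import random

def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Seat customers one by one; return per-customer table labels and table sizes."""
    rng = random.Random(seed)
    table_sizes = []
    assignments = []
    for i in range(n_customers):
        # Customer i joins existing table t with probability size_t / (i + alpha)
        # and opens a new table with probability alpha / (i + alpha).
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes

assignments, table_sizes = chinese_restaurant_process(50, alpha=1.0)
print(len(table_sizes), table_sizes)
```

Larger values of `alpha` tend to open more tables, that is, more clusters; in a clustering setting each "customer" would be a gene and each "table" a cluster.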
Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features

PubMed Central

2011-01-01

Background Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Results Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. 
Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer. PMID:22044755
Alternative Sigma Factor Over-Expression Enables Heterologous Expression of a Type II Polyketide Biosynthetic Pathway in Escherichia coli

PubMed Central

Stevens, David Cole; Conway, Kyle R.; Pearce, Nelson; Villegas-Peñaranda, Luis Roberto; Garza, Anthony G.; Boddy, Christopher N.

2013-01-01

Background Heterologous expression of bacterial biosynthetic gene clusters is currently an indispensable tool for characterizing biosynthetic pathways. Development of an effective, general heterologous expression system that can be applied to bioprospecting from metagenomic DNA will enable the discovery of a wealth of new natural products. Methodology We have developed a new Escherichia coli-based heterologous expression system for polyketide biosynthetic gene clusters. We have demonstrated the over-expression of the alternative sigma factor σ54 directly and positively regulates heterologous expression of the oxytetracycline biosynthetic gene cluster in E. coli. Bioinformatics analysis indicates that σ54 promoters are present in nearly 70% of polyketide and non-ribosomal peptide biosynthetic pathways. Conclusions We have demonstrated a new mechanism for heterologous expression of the oxytetracycline polyketide biosynthetic pathway, where high-level pleiotropic sigma factors from the heterologous host directly and positively regulate transcription of the non-native biosynthetic gene cluster. Our bioinformatics analysis is consistent with the hypothesis that heterologous expression mediated by the alternative sigma factor σ54 may be a viable method for the production of additional polyketide products. PMID:23724102
Polycistronic gene expression in Aspergillus niger.

PubMed

Schuetze, Tabea; Meyer, Vera

2017-09-25

Genome mining approaches predict dozens of biosynthetic gene clusters in each of the filamentous fungal genomes sequenced so far. However, the majority of these gene clusters still remain cryptic because they are not expressed in their natural host. Simultaneous expression of all genes belonging to a biosynthetic pathway in a heterologous host is one approach to activate biosynthetic gene clusters and to screen the metabolites produced for bioactivities. Polycistronic expression of all pathway genes under control of a single and tunable promoter would be the method of choice, as this not only simplifies cloning procedures, but also offers control over the timing and strength of expression. However, polycistronic gene expression is a feature not commonly found in eukaryotic host systems, such as Aspergillus niger. In this study, we tested the suitability of the viral P2A peptide for co-expression of three genes in A. niger. Two genes descend from Fusarium oxysporum and are essential to produce the secondary metabolite enniatin (esyn1, ekivR). The third gene (luc) encodes the reporter luciferase which was included to study position effects. Expression of the polycistronic gene cassette was put under control of the Tet-On system to ensure tunable gene expression in A. niger. In total, three polycistronic expression cassettes which differed in the position of luc were constructed and targeted to the pyrG locus in A. niger. This allowed direct comparison of the luciferase activity based on the position of the luciferase gene. Doxycycline-mediated induction of the Tet-On expression cassettes resulted in the production of one long polycistronic mRNA as proven by Northern analyses, and ensured comparable production of enniatin in all three strains. Notably, gene position within the polycistronic expression cassette matters, as luciferase activity was lowest at position one and comparable at positions two and three. 
The P2A peptide can be used to express at least three genes polycistronically in A. niger. This approach can now be applied to heterologously express entire secondary metabolite gene clusters polycistronically or to co-express any genes of interest in equimolar amounts.
Lampreys, the jawless vertebrates, contain only two ParaHox gene clusters.

PubMed

Zhang, Huixian; Ravi, Vydianathan; Tay, Boon-Hui; Tohari, Sumanty; Pillai, Nisha E; Prasad, Aravind; Lin, Qiang; Brenner, Sydney; Venkatesh, Byrappa

2017-08-22

ParaHox genes (Gsx, Pdx, and Cdx) are an ancient family of developmental genes closely related to the Hox genes. They play critical roles in the patterning of brain and gut. The basal chordate, amphioxus, contains a single ParaHox cluster comprising one member of each family, whereas nonteleost jawed vertebrates contain four ParaHox genomic loci with six or seven ParaHox genes. Teleosts, which have experienced an additional whole-genome duplication, contain six ParaHox genomic loci with six ParaHox genes. Jawless vertebrates, represented by lampreys and hagfish, are the most ancient group of vertebrates and are crucial for understanding the origin and evolution of vertebrate gene families. We have previously shown that lampreys contain six Hox gene loci. Here we report that lampreys contain only two ParaHox gene clusters (designated as α- and β-clusters) bearing five ParaHox genes (Gsxα, Pdxα, Cdxα, Gsxβ, and Cdxβ). The order and orientation of the three genes in the α-cluster are identical to that of the single cluster in amphioxus. However, the orientation of Gsxβ in the β-cluster is inverted. Interestingly, Gsxβ is expressed in the eye, unlike its homologs in jawed vertebrates, which are expressed mainly in the brain. The lamprey Pdxα is expressed in the pancreas similar to jawed vertebrate Pdx genes, indicating that the pancreatic expression of Pdx was acquired before the divergence of jawless and jawed vertebrate lineages. It is likely that the lamprey Pdxα plays a crucial role in pancreas specification and insulin production similar to the Pdx of jawed vertebrates.
CHARACTERIZATION OF INFLAMMATORY GENE EXPRESSION AND GALECTIN-3 FUNCTION AFTER SPINAL CORD INJURY IN MICE

PubMed Central

Pajoohesh-Ganji, Ahdeah; Knoblach, Susan M.; Faden, Alan I.; Byrnes, Kimberly R.

2012-01-01

Inflammation has long been implicated in secondary tissue damage after spinal cord injury (SCI). Our previous studies of inflammatory gene expression in rats after SCI revealed two temporally correlated clusters: the first was expressed early after injury and the second was up-regulated later, with peak expression at 1–2 weeks and persistent up-regulation through 6 months. To further address the role of inflammation after SCI, we examined inflammatory genes in a second species, mice, through 28 days after SCI. Using anchor gene clustering analysis, we found similar expression patterns for both the acute and chronic gene clusters previously identified after rat SCI. The acute group returned to normal expression levels by 7 days post-injury. The chronic group, which included C1qB, p22phox and galectin-3, showed peak expression at 7 days and remained up-regulated through 28 days. Immunohistochemistry and western blot analysis showed that the protein expression of these genes was consistent with the mRNA expression. Further exploration of the role of one of these genes, galectin-3, suggests that galectin-3 may contribute to secondary injury. In summary, our findings extend our prior gene profiling data by demonstrating the chronic expression of a cluster of microglial associated inflammatory genes after SCI in mice. Moreover, by demonstrating that inhibition of one such factor improves recovery, the findings suggest that such chronic up-regulation of inflammatory processes may contribute to secondary tissue damage after SCI, and that there may be a broader therapeutic window for neuroprotection than generally accepted. PMID:22884909
Integrative analyses of conserved WNT clusters and their co-operative behaviour in human breast cancer

PubMed Central

Qurrat-ul-Ain; Seemab, Umair; Nawaz, Sulaman; Rashid, Sajid

2011-01-01

In humans, WNT gene clusters are highly conserved at the species level and are associated with carcinogenesis. Among them, the WNT-10A and WNT-6 genes, clustered on chromosome 2q35, are homologous to WNT-10B and WNT-1, located on chromosome 12q13, respectively. In an attempt to study co-regulation, the coordinated expression of these genes was monitored in human breast cancer tissues. As compared to normal tissue, both WNT-10A and WNT-10B genes exhibited lower expression while WNT-6 and WNT-1 showed increased expression in breast cancer tissues. The co-expression pattern was elaborated by detailed phylogenetic and syntenic analyses. Moreover, the intergenic and intragenic regions of these gene clusters were analyzed to study transcriptional regulation. In this context, adequate conserved binding sites for the SOX and TCF families of transcription factors were observed. We propose that SOX9 and TCF4 may compete for binding at the promoters of WNT family genes, thus regulating the disease phenotype. PMID:22355234
Nine co-localized cytochrome P450 genes of the CYP2N, CYP2AD, and CYP2P gene families in the mangrove killifish Kryptolebias marmoratus genome: Identification and expression in response to B[α]P, BPA, OP, and NP.

PubMed

Puthumana, Jayesh; Kim, Bo-Mi; Jeong, Chang-Bum; Kim, Duck-Hyun; Kang, Hye-Min; Jung, Jee-Hyun; Kim, Il-Chan; Hwang, Un-Ki; Lee, Jae-Seong

2017-06-01

The CYP2 genes are the largest and most diverse cytochrome P450 (CYP) subfamily in vertebrates. We have identified nine co-localized CYP2 genes (∼55kb) in a new cluster in the genome of the highly resilient ecotoxicological fish model Kryptolebias marmoratus. Molecular characterization, temporal and tissue-specific expression pattern, and response to xenobiotics of these genes were examined. The CYP2 gene clusters were characterized and designated CYP2N22-23, CYP2AD12, and CYP2P16-20. Gene synteny analysis confirmed that the cluster in K. marmoratus is similar to that found in other teleost fishes, including zebrafish. A gene duplication event with diverged catalytic function was observed in CYP2AD12. Moreover, a high level of divergence in expression was observed among the co-localized genes. Phylogeny of the cluster suggested an orthologous relationship with similar genes in zebrafish and Japanese medaka. Gene expression analysis showed that CYP2P19 and CYP2N20 were consecutively expressed throughout embryonic development, whereas CYP2P18 was expressed in all adult tissues, suggesting that members of each CYP2 gene family have different physiological roles even though they are located in the same cluster. Among endocrine-disrupting chemicals (EDCs), benzo[α]pyrene (B[α]P) induced expression of CYP2N23, bisphenol A (BPA) induced CYP2P18 and CYP2P19, and 4-octylphenol (OP) induced CYP2AD12, but there was no significant response to 4-nonylphenol (NP), implying differential catalytic roles of the enzyme. In this paper, we identify and characterize a CYP2 gene cluster in the mangrove killifish K. marmoratus with differing catalytic roles toward EDCs. Our findings provide insights on the roles of nine co-localized CYP2 genes and their catalytic functions for better understanding of chemical-biological interactions in fish. Copyright © 2017 Elsevier B.V. All rights reserved.
Patterning in time and space: HoxB cluster gene expression in the developing chick embryo.

PubMed

Gouveia, Analuce; Marcelino, Hugo M; Gonçalves, Lisa; Palmeirim, Isabel; Andrade, Raquel P

2015-01-01

The developing embryo is a paradigmatic model to study molecular mechanisms of time control in Biology. Hox genes are key players in the specification of tissue identity during embryo development and their expression is under strict temporal regulation. However, the molecular mechanisms underlying timely Hox activation in the early embryo remain unknown. This is hindered by the lack of a rigorous temporal framework of sequential Hox expression within a single cluster. Herein, a thorough characterization of HoxB cluster gene expression was performed over time and space in the early chick embryo. Clear temporal collinearity of HoxB cluster gene expression activation was observed. Spatial collinearity of HoxB expression was evidenced in different stages of development and in multiple tissues. Using embryo explant cultures we showed that HoxB2 is cyclically expressed in the rostral presomitic mesoderm with the same periodicity as somite formation, suggesting a link between timely tissue specification and somite formation. We foresee that the molecular framework herein provided will facilitate experimental approaches aimed at identifying the regulatory mechanisms underlying Hox expression in Time and Space.
Patterning in time and space: HoxB cluster gene expression in the developing chick embryo

PubMed Central

Gouveia, Analuce; Marcelino, Hugo M; Gonçalves, Lisa; Palmeirim, Isabel; Andrade, Raquel P

2015-01-01

The developing embryo is a paradigmatic model to study molecular mechanisms of time control in Biology. Hox genes are key players in the specification of tissue identity during embryo development and their expression is under strict temporal regulation. However, the molecular mechanisms underlying timely Hox activation in the early embryo remain unknown. This is hindered by the lack of a rigorous temporal framework of sequential Hox expression within a single cluster. Herein, a thorough characterization of HoxB cluster gene expression was performed over time and space in the early chick embryo. Clear temporal collinearity of HoxB cluster gene expression activation was observed. Spatial collinearity of HoxB expression was evidenced in different stages of development and in multiple tissues. Using embryo explant cultures we showed that HoxB2 is cyclically expressed in the rostral presomitic mesoderm with the same periodicity as somite formation, suggesting a link between timely tissue specification and somite formation. We foresee that the molecular framework herein provided will facilitate experimental approaches aimed at identifying the regulatory mechanisms underlying Hox expression in Time and Space. PMID:25602523
Gene expression profiles reveal key genes for early diagnosis and treatment of adamantinomatous craniopharyngioma.

PubMed

Yang, Jun; Hou, Ziming; Wang, Changjiang; Wang, Hao; Zhang, Hongbing

2018-04-23

Adamantinomatous craniopharyngioma (ACP) is an aggressive brain tumor that occurs predominantly in the pediatric population. Conventional diagnostic methods and standard therapy cannot treat ACPs effectively. In this paper, we aimed to identify key genes for ACP early diagnosis and treatment. Datasets GSE94349 and GSE68015 were obtained from Gene Expression Omnibus database. Consensus clustering was applied to discover the gene clusters in the expression data of GSE94349 and functional enrichment analysis was performed on the gene set in each cluster. The protein-protein interaction (PPI) network was built by the Search Tool for the Retrieval of Interacting Genes, and hubs were selected. A support vector machine (SVM) model was built based on the signature genes identified from enrichment analysis and the PPI network. Dataset GSE94349 was used for training and testing, and GSE68015 was used for validation. Besides, RT-qPCR analysis was performed to analyze the expression of signature genes in ACP samples compared with normal controls. Seven gene clusters were discovered in the differentially expressed genes identified from the GSE94349 dataset. Enrichment analysis of each cluster identified 25 pathways that were highly associated with ACP. The PPI network was built and 46 hubs were determined. Twenty-five pathway-related genes that overlapped with the hubs in the PPI network were used as signatures to establish the SVM diagnosis model for ACP. The prediction accuracies of the SVM model for training, testing, and validation data were 94, 85, and 74%, respectively. The expression of CDH1, CCL2, ITGA2, COL8A1, COL6A2, and COL6A3 was significantly upregulated in ACP tumor samples, while CAMK2A, RIMS1, NEFL, SYT1, and STX1A were significantly downregulated, which was consistent with the differentially expressed gene analysis. The SVM model is a promising classification tool for screening and early diagnosis of ACP. 
The ACP-related pathways and signature genes will advance our knowledge of ACP pathogenesis and benefit the therapy improvement.
Comparison of expression of secondary metabolite biosynthesis cluster genes in Aspergillus flavus, A. parasiticus, and A. oryzae.

PubMed

Ehrlich, Kenneth C; Mack, Brian M

2014-06-23

Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity. PMID:24960201
CORM: An R Package Implementing the Clustering of Regression Models Method for Gene Clustering

PubMed Central

Shi, Jiejun; Qin, Li-Xuan

2014-01-01

We report a new R package implementing the clustering of regression models (CORM) method for clustering genes using gene expression data and provide data examples illustrating each clustering function in the package. The CORM package is freely available at CRAN from http://cran.r-project.org. PMID:25452684
Conditional clustering of temporal expression profiles

PubMed Central

Wang, Ling; Montano, Monty; Rarick, Matt; Sebastiani, Paola

2008-01-01

Background Many microarray experiments produce temporal profiles in different biological conditions but common cluster techniques are not able to analyze the data conditional on the biological conditions. Results This article presents a novel technique to cluster data from time course microarray experiments performed across several experimental conditions. Our algorithm uses polynomial models to describe the gene expression patterns over time, a full Bayesian approach with proper conjugate priors to make the algorithm invariant to linear transformations, and an iterative procedure to identify genes that have a common temporal expression profile across two or more experimental conditions, and genes that have a unique temporal profile in a specific condition. Conclusion We use simulated data to evaluate the effectiveness of this new algorithm in finding the correct number of clusters and in identifying genes with common and unique profiles. We also use the algorithm to characterize the response of human T cells to stimulations of antigen-receptor signaling gene expression temporal profiles measured in six different biological conditions and we identify common and unique genes. These studies suggest that the methodology proposed here is useful in identifying and distinguishing uniquely stimulated genes from commonly stimulated genes in response to variable stimuli. Software for using this clustering method is available from the project home page. PMID:18334028
Genes encoding cuticular proteins are components of the Nimrod gene cluster in Drosophila.

PubMed

Cinege, Gyöngyi; Zsámboki, János; Vidal-Quadras, Maite; Uv, Anne; Csordás, Gábor; Honti, Viktor; Gábor, Erika; Hegedűs, Zoltán; Varga, Gergely I B; Kovács, Attila L; Juhász, Gábor; Williams, Michael J; Andó, István; Kurucz, Éva

2017-08-01

The Nimrod gene cluster, located on the second chromosome of Drosophila melanogaster, is the largest syntenic unit of the Drosophila genome. Nimrod genes show blood cell specific expression and code for phagocytosis receptors that play a major role in fruit fly innate immune functions. We previously identified three homologous genes (vajk-1, vajk-2 and vajk-3) located within the Nimrod cluster, which are unrelated to the Nimrod genes, but are homologous to a fourth gene (vajk-4) located outside the cluster. Here we show that, unlike the Nimrod candidates, the Vajk proteins are expressed in cuticular structures of the late embryo and the late pupa, indicating that they contribute to cuticular barrier functions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Metabolic Maturation during Muscle Stem Cell Differentiation Is Achieved by miR-1/133a-Mediated Inhibition of the Dlk1-Dio3 Mega Gene Cluster.

PubMed

Wüst, Stas; Dröse, Stefan; Heidler, Juliana; Wittig, Ilka; Klockner, Ina; Franko, Andras; Bonke, Erik; Günther, Stefan; Gärtner, Ulrich; Boettger, Thomas; Braun, Thomas

2018-05-01

Muscle stem cells undergo a dramatic metabolic switch to oxidative phosphorylation during differentiation, which is achieved by massively increased mitochondrial activity. Since expression of the muscle-specific miR-1/133a gene cluster correlates with increased mitochondrial activity during muscle stem cell (MuSC) differentiation, we examined the potential role of miR-1/133a in metabolic maturation of skeletal muscles in mice. We found that miR-1/133a downregulate Mef2A in differentiated myocytes, thereby suppressing the Dlk1-Dio3 gene cluster, which encodes multiple microRNAs inhibiting expression of mitochondrial genes. Loss of miR-1/133a in skeletal muscles or increased Mef2A expression causes continuous high-level expression of the Dlk1-Dio3 gene cluster, compromising mitochondrial function. Failure to terminate the stem cell-like metabolic program characterized by high-level Dlk1-Dio3 gene cluster expression initiates profound changes in muscle physiology, essentially abrogating endurance running. Our results suggest a major role of miR-1/133a in metabolic maturation of skeletal muscles but exclude major functions in muscle development and MuSC maintenance. Copyright © 2018 Elsevier Inc. All rights reserved.

Amyotrophic lateral sclerosis, gene deregulation in the anterior horn of the spinal cord and frontal cortex area 8: implications in frontotemporal lobar degeneration

PubMed Central

Andrés-Benito, Pol; Moreno, Jesús; Aso, Ester; Povedano, Mónica; Ferrer, Isidro

2017-01-01

Transcriptome arrays identify 747 genes differentially expressed in the anterior horn of the spinal cord and 2,300 genes differentially expressed in frontal cortex area 8 in a single group of typical sALS cases without frontotemporal dementia compared with age-matched controls. Main up-regulated clusters in the anterior horn are related to inflammation and apoptosis; down-regulated clusters are linked to axoneme structures and protein synthesis. In contrast, up-regulated gene clusters in frontal cortex area 8 involve neurotransmission, synaptic proteins and vesicle trafficking, whereas main down-regulated genes cluster into oligodendrocyte function and myelin-related proteins. RT-qPCR validates the expression of 58 of 66 assessed genes from different clusters. The present results: a. reveal regional differences in de-regulated gene expression between the anterior horn of the spinal cord and frontal cortex area 8 in the same individuals suffering from sALS; b. validate and extend our knowledge about the complexity of the inflammatory response in the anterior horn of the spinal cord; and c. identify for the first time extensive gene up-regulation of neurotransmission and synaptic-related genes, together with significant down-regulation of oligodendrocyte- and myelin-related genes, as important contributors to the pathogenesis of frontal cortex alterations in the sALS/frontotemporal lobar degeneration spectrum complex at stages with no apparent cognitive impairment. PMID:28283675
Microarray and comparative genomics-based identification of genes and gene regulatory regions of the mouse immune system

PubMed Central

Hutton, John J; Jegga, Anil G; Kong, Sue; Gupta, Ashima; Ebert, Catherine; Williams, Sarah; Katz, Jonathan D; Aronow, Bruce J

2004-01-01

Background In this study we have built and mined a gene expression database composed of 65 diverse mouse tissues for genes preferentially expressed in immune tissues and cell types. Using expression pattern criteria, we identified 360 genes with preferential expression in thymus, spleen, peripheral blood mononuclear cells, lymph nodes (unstimulated or stimulated), or in vitro activated T-cells. Results Gene clusters, formed based on similarity of expression-pattern across either all tissues or the immune tissues only, had highly significant associations both with immunological processes such as chemokine-mediated response, antigen processing, receptor-related signal transduction, and transcriptional regulation, and also with more general processes such as replication and cell cycle control. Within-cluster gene correlations implicated known associations of known genes, as well as immune process-related roles for poorly described genes. To characterize regulatory mechanisms and cis-elements of genes with similar patterns of expression, we used a new version of a comparative genomics-based cis-element analysis tool to identify clusters of cis-elements with compositional similarity among multiple genes. Several clusters contained genes that shared 5–6 cis-elements that included ETS and zinc-finger binding sites. cis-Elements AP2 EGRF ETSF MAZF SP1F ZF5F and AREB ETSF MZF1 PAX5 STAT were shared in a thymus-expressed set; AP4R E2FF EBOX ETSF MAZF SP1F ZF5F and CREB E2FF MAZF PCAT SP1F STAT cis-clusters occurred in activated T-cells; CEBP CREB NFKB SORY and GATA NKXH OCT1 RBIT occurred in stimulated lymph nodes. Conclusion This study demonstrates a series of analytic approaches that have allowed the implication of genes and regulatory elements that participate in the differentiation, maintenance, and function of the immune system. Polymorphism or mutation of these could adversely impact immune system functions. PMID:15504237
Distal regulatory regions restrict the expression of cis-linked genes to the tapetal cells.

PubMed

Franco, Luciana O; de O Manes, Carmem Lara; Hamdi, Said; Sachetto-Martins, Gilberto; de Oliveira, Dulce E

2002-04-24

The oleosin glycine-rich protein genes Atgrp-6, Atgrp-7, and Atgrp-8 occur in clusters in the Arabidopsis genome and are expressed specifically in the tapetum cells. The cis-regulatory regions involved in the tissue-specific gene expression were investigated by fusing different segments of the gene cluster to the uidA reporter gene. Common distal regulatory regions were identified that coordinate expression of the sequential genes. At least two of these genes were regulated spatially by proximal and distal sequences. The cis-acting elements (122 bp upstream of the transcriptional start point) drive the uidA expression to floral tissues, whereas distal 5' upstream regions restrict the gene activity to tapetal cells.
Neuronal cell fate specification in Drosophila.

PubMed

Jan, Y N; Jan, L Y

1994-02-01

Recent work indicates that the Drosophila nervous system develops in a progressive process of cell fate specification. Expression of specific proneural genes in clusters of cells (the proneural clusters) in the cellular blastoderm endows these cells with the potential to form certain types of neural precursors. Intercellular interactions that involve both proneural genes and neurogenic genes then allow the neural precursors to be singled out from the proneural clusters. Expression of neural precursor genes in all neural precursors is likely to account for the universal aspects of neuronal differentiation, such as axonal outgrowth. Selective expression of certain neuronal-type selector genes further specifies the type of neuron(s) that a neural precursor will produce.
Heterologous Production of a Novel Cyclic Peptide Compound, KK-1, in Aspergillus oryzae.

PubMed

Yoshimi, Akira; Yamaguchi, Sigenari; Fujioka, Tomonori; Kawai, Kiyoshi; Gomi, Katsuya; Machida, Masayuki; Abe, Keietsu

2018-01-01

A novel cyclic peptide compound, KK-1, was originally isolated from the plant-pathogenic fungus Curvularia clavata. It consists of 10 amino acid residues, including five N-methylated amino acid residues, and has potent antifungal activity. Recently, the genome-sequencing analysis of C. clavata was completed, and the biosynthetic genes involved in KK-1 production were predicted by using a novel gene cluster mining tool, MIDDAS-M. These genes form an approximately 75-kb cluster, which includes nine open reading frames, containing a non-ribosomal peptide synthetase (NRPS) gene. To determine whether the predicted genes were responsible for the biosynthesis of KK-1, we performed heterologous production of KK-1 in Aspergillus oryzae by introduction of the cluster genes into the genome of A. oryzae. The NRPS gene was split in two fragments and then reconstructed in the A. oryzae genome, because the gene was quite large (approximately 40 kb). The remaining seven genes in the cluster, excluding the regulatory gene kkR, were simultaneously introduced into the strain of A. oryzae in which NRPS had already been incorporated. To evaluate the heterologous production of KK-1 in A. oryzae, gene expression was analyzed by RT-PCR and KK-1 productivity was quantified by HPLC. KK-1 was produced in variable quantities by a number of transformed strains, along with expression of the cluster genes. The amount of KK-1 produced by the strain with the greatest expression of all genes was lower than that produced by the original producer, C. clavata. Therefore, expression of the cluster genes is necessary and sufficient for the heterologous production of KK-1 in A. oryzae, although there may be unknown factors limiting productivity in this species. PMID:29686660
Whole Blood Gene Expression Profiling Predicts Severe Morbidity and Mortality in Cystic Fibrosis: A 5-Year Follow-Up Study.

PubMed

Saavedra, Milene T; Quon, Bradley S; Faino, Anna; Caceres, Silvia M; Poch, Katie R; Sanders, Linda A; Malcolm, Kenneth C; Nichols, David P; Sagel, Scott D; Taylor-Cousar, Jennifer L; Leach, Sonia M; Strand, Matthew; Nick, Jerry A

2018-05-01

Cystic fibrosis pulmonary exacerbations accelerate pulmonary decline and increase mortality. Previously, we identified a 10-gene leukocyte panel measured directly from whole blood, which indicates response to exacerbation treatment. We hypothesized that molecular characteristics of exacerbations could also predict future disease severity. We tested whether a 10-gene panel measured from whole blood could identify patient cohorts at increased risk for severe morbidity and mortality, beyond standard clinical measures. Transcript abundance for the 10-gene panel was measured from whole blood at the beginning of exacerbation treatment (n = 57). A hierarchical cluster analysis of subjects based on their gene expression was performed, yielding four molecular clusters. An analysis of cluster membership and outcomes incorporating an independent cohort (n = 21) was completed to evaluate robustness of cluster partitioning of genes to predict severe morbidity and mortality. The four molecular clusters were analyzed for differences in forced expiratory volume in 1 second, C-reactive protein, return to baseline forced expiratory volume in 1 second after treatment, time to next exacerbation, and time to morbidity or mortality events (defined as lung transplant referral, lung transplant, intensive care unit admission for respiratory insufficiency, or death). Clustering based on gene expression discriminated between patient groups with significant differences in forced expiratory volume in 1 second, admission frequency, and overall morbidity and mortality. At 5 years, all subjects in cluster 1 (very low risk) were alive and well, whereas 90% of subjects in cluster 4 (high risk) had suffered a major event (P = 0.0001). In multivariable analysis, the ability of gene expression to predict clinical outcomes remained significant, despite adjustment for forced expiratory volume in 1 second, sex, and admission frequency. 
The robustness of gene clustering to categorize patients appropriately in terms of clinical characteristics, and short- and long-term clinical outcomes, remained consistent, even when adding in a secondary population with significantly different clinical outcomes. Whole blood gene expression profiling allows molecular classification of acute pulmonary exacerbations, beyond standard clinical measures, providing a predictive tool for identifying subjects at increased risk for mortality and disease progression.
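The hierarchical cluster analysis of subjects described in this record can be illustrated with a minimal sketch. It assumes Euclidean distance and average linkage, neither of which the abstract specifies, and it uses a made-up two-gene toy panel rather than the 10-gene panel; the function names `average_linkage` and `agglomerative` are illustrative only.

```python
from math import dist


def average_linkage(c1, c2, points):
    """Mean Euclidean distance between all member pairs of two clusters."""
    total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return clusters


# Toy data (hypothetical): one row per subject, one column per gene.
subjects = [
    [0.0, 0.0], [0.1, 0.2],    # subjects 0 and 1
    [5.0, 5.0], [5.2, 4.9],    # subjects 2 and 3
    [10.0, 0.0], [10.1, 0.3],  # subjects 4 and 5
    [0.0, 10.0], [0.2, 9.8],   # subjects 6 and 7
]
clusters = agglomerative(subjects, n_clusters=4)
```

Cutting the merge process at four clusters mirrors the four molecular clusters reported above; with real data the choice of linkage and distance can change cluster membership, so both would need to match the original analysis.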
Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cameron, R A; Rowen, L; Nesbitt, R

2005-10-11

The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.
Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

DOE Office of Scientific and Technical Information (OSTI.GOV)

Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew

2005-05-10

The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.
The biosynthetic gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor contains its co-expressed vacuolar MATE transporter

PubMed Central

Darbani, Behrooz; Motawia, Mohammed Saddik; Olsen, Carl Erik; Nour-Eldin, Hussam H.; Møller, Birger Lindberg; Rook, Fred

2016-01-01

Genomic gene clusters for the biosynthesis of chemical defence compounds are increasingly identified in plant genomes. We previously reported the independent evolution of biosynthetic gene clusters for cyanogenic glucoside biosynthesis in three plant lineages. Here we report that the gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor additionally contains a gene, SbMATE2, encoding a transporter of the multidrug and toxic compound extrusion (MATE) family, which is co-expressed with the biosynthetic genes. The predicted localisation of SbMATE2 to the vacuolar membrane was demonstrated experimentally by transient expression of a SbMATE2-YFP fusion protein and confocal microscopy. Transport studies in Xenopus laevis oocytes demonstrate that SbMATE2 is able to transport dhurrin. In addition, SbMATE2 was able to transport non-endogenous cyanogenic glucosides, but not the anthocyanin cyanidin 3-O-glucoside or the glucosinolate indol-3-yl-methyl glucosinolate. The genomic co-localisation of a transporter gene with the biosynthetic genes producing the transported compound is discussed in relation to the role self-toxicity of chemical defence compounds may play in the formation of gene clusters. PMID:27841372
Transcriptome profiling analysis reveals biomarkers in colon cancer samples of various differentiation

PubMed Central

Yu, Tonghu; Zhang, Huaping; Qi, Hong

2018-01-01

The aim of the present study was to investigate further colon cancer-related genes at different stages. Gene expression profile E-GEOD-62932 was extracted for differentially expressed gene (DEG) screening. Series test of cluster analysis was used to obtain significant trending models. Based on the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases, functional and pathway enrichment analyses were performed and a pathway relation network was constructed. A gene co-expression network and a gene signal network were constructed for common DEGs. The DEGs with the same trend were clustered and, in total, 16 statistically significant clusters were obtained. The screened DEGs were enriched in small molecule metabolic process and metabolic pathways. The pathway relation network was constructed with 57 nodes. A total of 328 common DEGs were obtained. The gene signal network was constructed with 71 nodes. The gene co-expression network was constructed with 161 nodes and 211 edges. ABCD3, CPT2, AGL and JAM2 are potential biomarkers for the diagnosis of colon cancer. PMID:29928385
GABRA2 Alcohol Dependence Risk Allele is Associated with Reduced Expression of Chromosome 4p12 GABAA Subunit Genes in Human Neural Cultures.

PubMed

Lieberman, Richard; Kranzler, Henry R; Joshi, Pujan; Shin, Dong-Guk; Covault, Jonathan

2015-09-01

Genetic variation in a region of chromosome 4p12 that includes the GABAA subunit gene GABRA2 has been reproducibly associated with alcohol dependence (AD). However, the molecular mechanisms underlying the association are unknown. This study examined correlates of in vitro gene expression of the AD-associated GABRA2 rs279858*C-allele in human neural cells using an induced pluripotent stem cell (iPSC) model system. We examined mRNA expression of chromosome 4p12 GABAA subunit genes (GABRG1, GABRA2, GABRA4, and GABRB1) in 36 human neural cell lines differentiated from iPSCs using quantitative polymerase chain reaction and next-generation RNA sequencing. mRNA expression in adult human brain was examined using the BrainCloud and BRAINEAC data sets. We found significantly lower levels of GABRA2 mRNA in neural cell cultures derived from rs279858*C-allele carriers. Levels of GABRA2 RNA were correlated with those of the other 3 chromosome 4p12 GABAA genes, but not other neural genes. Cluster analysis based on the relative RNA levels of the 4 chromosome 4p12 GABAA genes identified 2 distinct clusters of cell lines, a low-expression cluster associated with rs279858*C-allele carriers and a high-expression cluster enriched for the rs279858*T/T genotype. In contrast, there was no association of genotype with chromosome 4p12 GABAA gene expression in postmortem adult cortex in either the BrainCloud or BRAINEAC data sets. AD-associated variation in GABRA2 is associated with differential expression of the entire cluster of GABAA subunit genes on chromosome 4p12 in human iPSC-derived neural cell cultures. The absence of a parallel effect in postmortem human adult brain samples suggests that AD-associated genotype effects on GABAA expression, although not present in mature cortex, could have effects on regulation of the chromosome 4p12 GABAA cluster during neural development. Copyright © 2015 by the Research Society on Alcoholism.
Annotation of gene function in citrus using gene expression information and co-expression networks

PubMed Central

2014-01-01

Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
Conclusions Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provides opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for gene co-expression analysis in citrus. PMID:25023870
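The "guilt-by-association" idea behind a gene co-expression network can be sketched in a few lines. The example below is only an illustration, assuming Pearson correlation as the similarity measure and an arbitrary cutoff of 0.9; the abstract does not state which measure or threshold was used, and the gene names and expression values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_network(profiles, threshold=0.9):
    """Return {gene: set(neighbours)} linking genes whose |r| >= threshold."""
    network = {gene: set() for gene in profiles}
    for g1, g2 in combinations(profiles, 2):
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold:
            network[g1].add(g2)
            network[g2].add(g1)
    return network


# Toy data (hypothetical genes): expression across five conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 4.0, 6.2, 7.9, 10.0],  # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # unrelated pattern
}
network = coexpression_network(profiles)

# Guide-gene style query: genes co-expressed with a gene of known function.
guide_neighbours = network["geneA"]
```

In a guide-gene query of this kind, the neighbours of a well-characterized gene become candidates for sharing its function, which is the inference the record describes.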
Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets.

PubMed

Salem, Saeed; Ozcaglar, Cagri

2014-01-01

Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways.
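The edge weighting described in this record, a linear combination of two similarities between coexpression links, can be sketched as follows. The abstract names the two similarities but does not define them, so the definitions here are assumptions: topological similarity is taken as the Jaccard overlap of the links' endpoint genes, co-appearance similarity as the Jaccard overlap of the datasets in which each link occurs, and `alpha` is a hypothetical mixing parameter.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_weight(link1, link2, appearances, alpha=0.5):
    """Linear combination of two similarities between coexpression links.

    ASSUMPTION: both similarity definitions are illustrative stand-ins,
    not the definitions used by the authors.
    """
    topological = jaccard(set(link1), set(link2))
    co_appearance = jaccard(appearances[link1], appearances[link2])
    return alpha * topological + (1 - alpha) * co_appearance


# Toy input: each coexpression link (gene pair) maps to the hypothetical
# dataset IDs in which it was detected.
appearances = {
    ("g1", "g2"): {"d1", "d2", "d3"},
    ("g2", "g3"): {"d1", "d2"},
    ("g4", "g5"): {"d3"},
}

# Nodes of the hybrid graph are links; edges carry the hybrid weight.
hybrid_graph = {
    (l1, l2): hybrid_weight(l1, l2, appearances)
    for l1, l2 in combinations(appearances, 2)
}
```

Links that share a gene and recur in the same datasets receive the highest weights, so clustering this graph would group links that are both topologically close and recurrent across datasets.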
Innate responses to gene knockouts impact overlapping gene networks and vary with respect to resistance to viral infection.

PubMed

Liu, Yonghong; Liu, Yuanyuan; Wu, Jiaming; Roizman, Bernard; Zhou, Grace Guoying

2018-04-03

Analyses of the levels of mRNAs encoding IFIT1, IFI16, RIG-1, MDA5, CXCL10, LGP2, PUM1, LSD1, STING, and IFNβ in cell lines from which the gene encoding LGP2, LSD1, PML, HDAC4, IFI16, PUM1, STING, MDA5, IRF3, or HDAC1 had been knocked out, as well as the ability of these cell lines to support the replication of HSV-1, revealed the following: (i) Cell lines lacking the gene encoding LGP2, PML, or HDAC4 (cluster 1) exhibited increased levels of expression of partially overlapping gene networks. Concurrently, these cell lines produced 5-fold to 12-fold lower yields of HSV-1 than the parental cells. (ii) Cell lines lacking the genes encoding STING, LSD1, MDA5, IRF3, or HDAC1 (cluster 2) exhibited decreased levels of mRNAs of partially overlapping gene networks. Concurrently, these cell lines produced virus yields that did not differ from those produced by the parental cell line. The genes up-regulated in cell lines forming cluster 1 overlapped in part with genes down-regulated in cluster 2. The key conclusions are that gene knockouts and subsequent selection for growth cause changes in expression of multiple genes, and hence the phenotype of the cell lines cannot be ascribed to a single gene; the patterns of gene expression may be shared by multiple knockouts; and the enhanced immunity to viral replication by cluster 1 knockout cell lines but not by cluster 2 cell lines suggests that in parental cells, the expression of innate resistance to infection is specifically repressed.
Heterochromatin influences the secondary metabolite profile in the plant pathogen Fusarium graminearum

PubMed Central

Reyes-Dominguez, Yazmid; Boedi, Stefan; Sulyok, Michael; Wiesenberger, Gerlinde; Stoppacher, Norbert; Krska, Rudolf; Strauss, Joseph

2012-01-01

Chromatin modifications and heterochromatic marks have been shown to be involved in the regulation of secondary metabolism gene clusters in the fungal model system Aspergillus nidulans. We examine here the role of HEP1, the heterochromatin protein homolog of Fusarium graminearum, for the production of secondary metabolites. Deletion of Hep1 in a PH-1 background strongly influences expression of genes required for the production of aurofusarin and the main trichothecene metabolite DON. In the Hep1 deletion strains, AUR genes are highly up-regulated and aurofusarin production is greatly enhanced, suggesting a repressive role for heterochromatin on gene expression of this cluster. Unexpectedly, gene expression and metabolites are lower for the trichothecene cluster, suggesting a positive function of Hep1 for DON biosynthesis. However, analysis of histone modifications in chromatin of AUR and DON gene promoters reveals that in both gene clusters the H3K9me3 heterochromatic mark is strongly reduced in the Hep1 deletion strain. This, and the finding that a DON-cluster flanking gene is up-regulated, suggests that the DON biosynthetic cluster is repressed by HEP1 directly and indirectly. Results from this study point to a conserved mode of secondary metabolite (SM) biosynthesis regulation in fungi by chromatin modifications and the formation of facultative heterochromatin. PMID:22100541
Transcriptome analysis of cattle muscle identifies potential markers for skeletal muscle growth rate and major cell types.

PubMed

Guo, Bing; Greenwood, Paul L; Cafe, Linda M; Zhou, Guanghong; Zhang, Wangang; Dalrymple, Brian P

2015-03-13

This study aimed to identify markers for muscle growth rate and the different cellular contributors to cattle muscle and to link the muscle growth rate markers to specific cell types. The expression of two groups of genes in the longissimus muscle (LM) of 48 Brahman steers of similar age, significantly enriched for "cell cycle" and "ECM (extracellular matrix) organization" Gene Ontology (GO) terms was correlated with average daily gain/kg liveweight (ADG/kg) of the animals. However, expression of the same genes was only partly related to growth rate across a time course of postnatal LM development in two cattle genotypes, Piedmontese x Hereford (high muscling) and Wagyu x Hereford (high marbling). The deposition of intramuscular fat (IMF) altered the relationship between the expression of these genes and growth rate. K-means clustering across the development time course with a large set of genes (5,596) with similar expression profiles to the ECM genes was undertaken. The locations in the clusters of published markers of different cell types in muscle were identified and used to link clusters of genes to the cell type most likely to be expressing them. Overall correspondence between published cell type expression of markers and predicted major cell types of expression in cattle LM was high. However, some exceptions were identified: expression of SOX8 previously attributed to muscle satellite cells was correlated with angiogenesis. Analysis of the clusters and cell types suggested that the "cell cycle" and "ECM" signals were from the fibro/adipogenic lineage. Significant contributions to these signals from the muscle satellite cells, angiogenic cells and adipocytes themselves were not as strongly supported. Based on the clusters and cell type markers, sets of five genes predicted to be representative of fibro/adipogenic precursors (FAPs) and endothelial cells, and/or ECM remodelling and angiogenesis were identified. 
Gene sets and gene markers for the analysis of many of the major processes/cell populations contributing to muscle composition and growth have been proposed, enabling a consistent interpretation of gene expression datasets from cattle LM. The same gene sets are likely to be applicable in other cattle muscles and in other species.
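The K-means step mentioned in the abstract above groups genes whose expression profiles follow a similar shape across the developmental time course. The following is a minimal pure-Python sketch of Lloyd's K-means algorithm on toy profiles; the data, the function name `kmeans`, and its parameters are illustrative assumptions and do not reproduce the study's actual pipeline or its 5,596-gene data set.

```python
import random


def kmeans(profiles, k, n_iter=50, seed=0):
    """Cluster expression profiles (lists of equal length) into k groups."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct randomly chosen profiles.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each gene goes to the nearest centroid
        # (squared Euclidean distance over the time points).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments unchanged, so centroids are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy time courses: three genes rising over time, three genes falling.
profiles = [
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.1, 2.1, 3.2],
    [0.1, 0.9, 2.0, 2.9],
    [3.0, 2.0, 1.0, 0.0],
    [3.1, 2.0, 0.9, 0.0],
    [2.9, 2.1, 1.0, 0.1],
]
labels, centroids = kmeans(profiles, k=2)
```

Once genes are grouped this way, the position of known cell-type markers within each cluster can be looked up, which is the linking step the abstract describes.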
Immature MEF2C-dysregulated T-cell leukemia patients have an early T-cell precursor acute lymphoblastic leukemia gene signature and typically have non-rearranged T-cell receptors

PubMed Central

Zuurbier, Linda; Gutierrez, Alejandro; Mullighan, Charles G.; Canté-Barrett, Kirsten; Gevaert, A. Olivier; de Rooi, Johan; Li, Yunlei; Smits, Willem K.; Buijs-Gladdines, Jessica G.C.A.M.; Sonneveld, Edwin; Look, A. Thomas; Horstmann, Martin; Pieters, Rob; Meijerink, Jules P.P.

2014-01-01

Three distinct immature T-cell acute lymphoblastic leukemia entities have been described including cases that express an early T-cell precursor immunophenotype or expression profile, immature MEF2C-dysregulated T-cell acute lymphoblastic leukemia cluster cases based on gene expression analysis (immature cluster) and cases that retain non-rearranged TRG@ loci. Early T-cell precursor acute lymphoblastic leukemia cases exclusively overlap with immature cluster samples based on the expression of early T-cell precursor acute lymphoblastic leukemia signature genes, indicating that both represent a single disease entity. Patients lacking TRG@ rearrangements represent only 40% of immature cluster cases, but no further evidence was found to suggest that cases with absence of bi-allelic TRG@ deletions reflect a distinct and even more immature disease entity. Immature cluster/early T-cell precursor acute lymphoblastic leukemia cases are strongly enriched for genes expressed in hematopoietic stem cells as well as genes expressed in normal early thymocyte progenitor or double negative-2A T-cell subsets. Identification of early T-cell precursor acute lymphoblastic leukemia cases solely by defined immunophenotypic criteria strongly underestimates the number of cases that have a corresponding gene signature. However, early T-cell precursor acute lymphoblastic leukemia samples correlate best with a CD1 negative, CD4 and CD8 double negative immunophenotype with expression of CD34 and/or myeloid markers CD13 or CD33. In contrast to the findings of various other studies, immature cluster/early T-cell precursor acute lymphoblastic leukemia patients treated on the COALL-97 protocol did not have an overall inferior outcome, and demonstrated equal sensitivity levels to most conventional therapeutic drugs compared to other pediatric T-cell acute lymphoblastic leukemia patients. PMID:23975177
Hox cluster polarity in early transcriptional availability: a high order regulatory level of clustered Hox genes in the mouse.

PubMed

Roelen, Bernard A J; de Graaff, Wim; Forlani, Sylvie; Deschamps, Jacqueline

2002-11-01

The molecular mechanism underlying the 3' to 5' polarity of induction of mouse Hox genes is still elusive. While relief from a cluster-encompassing repression was shown to lead to all Hoxd genes being expressed like the 3'most of them, Hoxd1 (Kondo and Duboule, 1999), the molecular basis of initial activation of this 3'most gene is not yet understood. We show that, already before primitive streak formation, prior to initial expression of the first Hox gene, a dramatic transcriptional stimulation of the 3'most genes, Hoxb1 and Hoxb2, is observed upon a short pulse of exogenous retinoic acid (RA), whereas this is not the case for more 5', cluster-internal, RA-responsive Hoxb genes. In contrast, the RA-responding Hoxb1lacZ transgene that faithfully mimics the endogenous gene (Marshall et al., 1994) did not exhibit the sensitivity of Hoxb1 to precocious activation. We conclude that polarity in initial activation of Hoxb genes reflects a greater availability of 3'Hox genes for transcription, suggesting a pre-existing (susceptibility to) opening of the chromatin structure at the 3' extremity of the cluster. We discuss the data in the context of prevailing models involving differential chromatin opening in the directionality of clustered Hox gene transcription, and regarding the importance of the cluster context for correct timing of initial Hox gene expression. Interestingly, Cdx1 manifested the same early transcriptional availability as Hoxb1. Copyright 2002 Elsevier Science Ireland Ltd.
Evolution of homeobox genes.

PubMed

Holland, Peter W H

2013-01-01

Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest. Copyright © 2012 Wiley Periodicals, Inc.

LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

PubMed

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually organized in clusters, which are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework with an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three clear advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. 
We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
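The gene-pair definitions in the abstract above (co-directional, convergent, and divergent adjacent genes, filtered by the distance between transcription start sites) can be sketched as follows. This is not LCGbase's implementation: the tuple layout `(name, start, end, strand)`, the function names, and the `max_tss_distance` parameter are hypothetical choices for illustration.

```python
def tss(gene):
    """Transcription start site: start on the + strand, end on the - strand."""
    _name, start, end, strand = gene
    return start if strand == "+" else end


def classify_pair(strand_left, strand_right):
    """Classify two adjacent genes, ordered by coordinate, by their strands."""
    if strand_left == strand_right:
        return "co-directional"
    if strand_left == "+" and strand_right == "-":
        return "convergent"  # transcription runs toward each other
    return "divergent"  # transcription runs away from each other


def adjacent_pairs(genes, max_tss_distance=None):
    """Return (left, right, orientation, tss_distance) for neighbouring genes.

    genes            : list of (name, start, end, strand) on one chromosome
    max_tss_distance : optional cutoff; pairs farther apart are dropped
    """
    ordered = sorted(genes, key=lambda g: g[1])
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        distance = abs(tss(right) - tss(left))
        if max_tss_distance is not None and distance > max_tss_distance:
            continue
        pairs.append((left[0], right[0], classify_pair(left[3], right[3]), distance))
    return pairs


# Toy annotation for one chromosome.
genes = [
    ("g1", 100, 500, "+"),
    ("g2", 700, 1200, "-"),
    ("g3", 1500, 1800, "+"),
    ("g4", 2000, 2600, "+"),
]
all_pairs = adjacent_pairs(genes)
close_pairs = adjacent_pairs(genes, max_tss_distance=600)
```

Larger clusters could then be built by chaining pairs that pass the distance cutoff, though how the database actually extends pairs into clusters is not specified in the abstract.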
Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury.

PubMed

Ryge, Jesper; Winther, Ole; Wienecke, Jacob; Sandelin, Albin; Westerdahl, Ann-Charlotte; Hultborn, Hans; Kiehn, Ole

2010-06-09

Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability.
Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury

PubMed Central

2010-01-01

Background Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. Conclusions This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability. PMID:20534130
Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression

PubMed Central

Poole, William; Leinonen, Kalle; Shmulevich, Ilya

2017-01-01

Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390
Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression.

PubMed

Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady

2017-02-01

Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C.
Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data

PubMed Central

Zeng, Beiyan; Chen, Yiping P.; Smith, Oscar H.

2003-01-01

Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely-used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and within-cluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r2 test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise on clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed. PMID:18629292
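Of the agreement measures named in the abstract above, the Adjusted Rand Index is the easiest to show in a few lines. The sketch below computes it from the contingency table of two label vectors using the standard chance-corrected formula; the function name and the toy labels are illustrative, and the handling of degenerate inputs (returning 1.0) is a convention chosen here, not something taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions trivially agree
    # Pair counts within contingency cells, rows, and columns.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions have one cluster
    return (sum_cells - expected) / (max_index - expected)


# Identical partitions with swapped label names agree perfectly.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A partition that cuts across the true one scores below chance level.
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index depends only on which items share a cluster, it is unaffected by how the clusters are numbered, which makes it suitable for comparing the output of different clustering methods against a known grouping.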
Microarray Analysis Reveals Characteristic Changes of Host Cell Gene Expression in Response to Attenuated Modified Vaccinia Virus Ankara Infection of Human HeLa Cells

PubMed Central

Guerra, Susana; López-Fernández, Luis A.; Conde, Raquel; Pascual-Montano, Alberto; Harshman, Keith; Esteban, Mariano

2004-01-01

The potential use of the modified vaccinia virus Ankara (MVA) strain as a live recombinant vector to deliver antigens and elicit protective immune responses against infectious diseases demands a comprehensive understanding of the effect of MVA infection on human host gene expression. We used microarrays containing more than 15,000 human cDNAs to identify gene expression changes in human HeLa cell cultures at 2, 6, and 16 h postinfection. Clustering of the 410 differentially regulated genes identified 11 discrete gene clusters with altered expression patterns after MVA infection. Clusters 1 and 2 (accounting for 16.59% [68 of 410] of the genes) contained 68 transcripts showing a robust induction pattern that was maintained during the course of infection. Changes in cellular gene transcription detected by microarrays after MVA infection were confirmed for selected genes by Northern blot analysis and by real-time reverse transcription-PCR. Upregulated transcripts in clusters 1 and 2 included 20 genes implicated in immune responses, including interleukin 1A (IL-1A), IL-6, IL-7, IL-8, and IL-15 genes. MVA infection also stimulated the expression of NF-κB and components of the NF-κB signal transduction pathway, including p50 and TRAF-interacting protein. A marked increase in the expression of histone family members was also induced during MVA infection. Expression of the Wiskott-Aldrich syndrome family members WAS, WASF1, and the small GTP-binding protein RAC-1, which are involved in actin cytoskeleton reorganization, was enhanced after MVA infection. This study demonstrates that MVA infection triggered the induction of groups of genes, some of which may be involved in host resistance and immune modulation during virus infection. PMID:15140980
Implementation of plaid model biclustering method on microarray of carcinoma and adenoma tumor gene expression data

NASA Astrophysics Data System (ADS)

Ardaneswari, Gianinna; Bustamam, Alhadi; Sarwinda, Devvi

2017-10-01

A tumor is an abnormal growth of cells that serves no purpose. Carcinoma is a tumor that grows from the top of the cell membrane and the organ. Adenoma is a benign tumor of gland-like cells or epithelial tissue. In molecular biology, microarray technology is used to store gene expression data for diseases. For each gene on a microarray, information is stored for each trait or condition. Gene expression data can be clustered with a biclustering algorithm, a method that clusters not only the objects but also their properties or conditions. This research proposes Plaid Model Biclustering as one such biclustering method. In this study, we discuss the implementation of the Plaid Model Biclustering method on microarray gene expression data from carcinoma and adenoma tumors. From the experimental results, we found that three biclusters are formed from the carcinoma gene expression data and four biclusters are formed from the adenoma gene expression data.
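The abstract above does not state the plaid model itself. In its usual formulation, the expression value of gene i under condition j is modelled as a background level plus a sum of layers (biclusters), each contributing a layer mean, a gene effect, and a condition effect wherever that gene and condition both belong to the layer. The sketch below only evaluates such a model for given layers; it does not fit layers to data, and the dictionary layout and function name are hypothetical.

```python
def plaid_fitted_values(n_genes, n_conditions, background, layers):
    """Evaluate a plaid model: background plus additive bicluster layers.

    Each layer is a dict with:
      "genes"      : set of gene (row) indices in the bicluster
      "conditions" : set of condition (column) indices in the bicluster
      "mu"         : layer mean
      "alpha"      : dict gene index -> gene effect (missing means 0)
      "beta"       : dict condition index -> condition effect (missing means 0)
    """
    fitted = [[background] * n_conditions for _ in range(n_genes)]
    for layer in layers:
        for i in layer["genes"]:
            for j in layer["conditions"]:
                # A cell receives the layer's contribution only when both
                # its gene and its condition are members of the layer.
                fitted[i][j] += (
                    layer["mu"]
                    + layer["alpha"].get(i, 0.0)
                    + layer["beta"].get(j, 0.0)
                )
    return fitted


# Toy example: one bicluster covering genes 0-1 and conditions 1-2.
layer = {
    "genes": {0, 1},
    "conditions": {1, 2},
    "mu": 2.0,
    "alpha": {0: 0.5, 1: -0.5},
    "beta": {1: 0.25, 2: -0.25},
}
fitted = plaid_fitted_values(3, 3, background=1.0, layers=[layer])
```

Because layers are additive, a cell that belongs to two overlapping biclusters receives both contributions, which is what distinguishes a plaid model from a partition of the matrix into disjoint blocks.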
Differential regulation of ParaHox genes by retinoic acid in the invertebrate chordate amphioxus (Branchiostoma floridae).

PubMed

Osborne, Peter W; Benoit, Gérard; Laudet, Vincent; Schubert, Michael; Ferrier, David E K

2009-03-01

The ParaHox cluster is the evolutionary sister to the Hox cluster. Like the Hox cluster, the ParaHox cluster displays spatial and temporal regulation of the component genes along the anterior/posterior axis in a manner that correlates with the gene positions within the cluster (a feature called collinearity). The ParaHox cluster is however a simpler system to study because it is composed of only three genes. We provide a detailed analysis of the amphioxus ParaHox cluster and, for the first time in a single species, examine the regulation of the cluster in response to a single developmental signalling molecule, retinoic acid (RA). Embryos treated with either RA or RA antagonist display altered ParaHox gene expression: AmphiGsx expression shifts in the neural tube, and the endodermal boundary between AmphiXlox and AmphiCdx shifts its anterior/posterior position. We identified several putative retinoic acid response elements and in vitro assays suggest some may participate in RA regulation of the ParaHox genes. By comparison to vertebrate ParaHox gene regulation we explore the evolutionary implications. This work highlights how insights into the regulation and evolution of more complex vertebrate arrangements can be obtained through studies of a simpler, unduplicated amphioxus gene cluster.
Single-cell gene expression profiling reveals functional heterogeneity of undifferentiated human epidermal cells

PubMed Central

Tan, David W. M.; Jensen, Kim B.; Trotter, Matthew W. B.; Connelly, John T.; Broad, Simon; Watt, Fiona M.

2013-01-01

Human epidermal stem cells express high levels of β1 integrins, delta-like 1 (DLL1) and the EGFR antagonist LRIG1. However, there is cell-to-cell variation in the relative abundance of DLL1 and LRIG1 mRNA transcripts. Single-cell global gene expression profiling showed that undifferentiated cells fell into two clusters delineated by expression of DLL1 and its binding partner syntenin. The DLL1+ cluster had elevated expression of genes associated with endocytosis, integrin-mediated adhesion and receptor tyrosine kinase signalling. Differentially expressed genes were not independently regulated, as overexpression of DLL1 alone or together with LRIG1 led to the upregulation of other genes in the DLL1+ cluster. Overexpression of DLL1 and LRIG1 resulted in enhanced extracellular matrix adhesion and increased caveolin-dependent EGFR endocytosis. Further characterisation of CD46, one of the genes upregulated in the DLL1+ cluster, revealed it to be a novel cell surface marker of human epidermal stem cells. Cells with high endogenous levels of CD46 expressed high levels of β1 integrin and DLL1 and were highly adhesive and clonogenic. Knockdown of CD46 decreased proliferative potential and β1 integrin-mediated adhesion. Thus, the previously unknown heterogeneity revealed by our studies results in differences in the interaction of undifferentiated basal keratinocytes with their environment. PMID:23482486
Complex regulation of the aflatoxin biosynthesis gene cluster of Aspergillus flavus in relation to various combinations of water activity and temperature.

PubMed

Schmidt-Heydt, Markus; Abdel-Hadi, Ahmed; Magan, Naresh; Geisen, Rolf

2009-11-15

A microarray analysis was performed to study the effect of varying combinations of water activity (a(w)) and temperature on the activation of aflatoxin biosynthesis genes in Aspergillus flavus grown on YES medium. Generally, A. flavus showed expression of the aflatoxin biosynthetic genes at all parameter combinations tested. Certain combinations of a(w) and temperature, especially combinations which imposed stress on the fungus, resulted in a significant reduction of the growth rate. Under these conditions, induction of the whole aflatoxin biosynthesis gene cluster occurred; however, the amount of aflatoxin B1 produced was low. At all other combinations (25 °C/0.95 and 0.99; 30 °C/0.95 and 0.99; 35 °C/0.95 and 0.99) a reduced basal level of cluster gene expression occurred. At these combinations a high growth rate was obtained as well as high aflatoxin production. When single genes were compared, two groups with different expression profiles in relation to water activity/temperature combinations occurred. These two groups were co-ordinately localized within the aflatoxin gene cluster. The ratio of aflR/aflJ expression was correlated with increased aflatoxin biosynthesis.
Transcriptional interference networks coordinate the expression of functionally related genes clustered in the same genomic loci

PubMed Central

Boldogköi, Zsolt

2012-01-01

The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too. 
PMID:22783276
Transcriptional interference networks coordinate the expression of functionally related genes clustered in the same genomic loci.

PubMed

Boldogköi, Zsolt

2012-01-01

The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too.
UNCLES: method for the identification of genes differentially consistently co-expressed in a specific subset of datasets.

PubMed

Abu-Jamous, Basel; Fa, Rui; Roberts, David J; Nandi, Asoke K

2015-06-04

Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies.
Dynamic network reconstruction from gene expression data applied to immune response during bacterial infection.

PubMed

Guthke, Reinhard; Möller, Ulrich; Hoffmann, Martin; Thies, Frank; Töpfer, Susanne

2005-04-15

The immune response to bacterial infection represents a complex network of dynamic gene and protein interactions. We present an optimized reverse engineering strategy aimed at a reconstruction of this kind of interaction network. The proposed approach is based on both microarray data and available biological knowledge. The main kinetics of the immune response were identified by fuzzy clustering of gene expression profiles (time series). The number of clusters was optimized using various evaluation criteria. For each cluster a representative gene with a high fuzzy-membership was chosen in accordance with available physiological knowledge. Then hypothetical network structures were identified by seeking systems of ordinary differential equations, whose simulated kinetics could fit the gene expression profiles of the cluster-representative genes. For the construction of hypothetical network structures, singular value decomposition (SVD) based methods and a newly introduced heuristic Network Generation Method were compared. It turned out that the proposed novel method could find sparser networks and gave better fits to the experimental data.
A Role for Iron-Sulfur Clusters in the Regulation of Transcription Factor Yap5-dependent High Iron Transcriptional Responses in Yeast

PubMed Central

Li, Liangtao; Miao, Ren; Bertram, Sophie; Jia, Xuan; Ward, Diane M.; Kaplan, Jerry

2012-01-01

Yeast respond to increased cytosolic iron by activating the transcription factor Yap5 increasing transcription of CCC1, which encodes a vacuolar iron importer. Using a genetic screen to identify genes involved in Yap5 iron sensing, we discovered that a mutation in SSQ1, which encodes a mitochondrial chaperone involved in iron-sulfur cluster synthesis, prevented expression of Yap5 target genes. We demonstrated that mutation or reduced expression of other genes involved in mitochondrial iron-sulfur cluster synthesis (YFH1, ISU1) prevented induction of the Yap5 response. We took advantage of the iron-dependent catalytic activity of Pseudaminobacter salicylatoxidans gentisate 1,2-dioxygenase expressed in yeast to measure changes in cytosolic iron. We determined that reductions in iron-sulfur cluster synthesis did not affect the activity of cytosolic gentisate 1,2-dioxygenase. We show that loss of activity of the cytosolic iron-sulfur cluster assembly complex proteins or deletion of cytosolic glutaredoxins did not reduce expression of Yap5 target genes. These results suggest that the high iron transcriptional response, as well as the low iron transcriptional response, senses iron-sulfur clusters. PMID:22915593
Neighboring Genes Show Correlated Evolution in Gene Expression

PubMed Central

Ghanbarian, Avazeh T.; Hurst, Laurence D.

2015-01-01

When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543
ctsGE-clustering subgroups of expression data.

PubMed

Sharabi-Schwager, Michal; Or, Etti; Ophir, Ron

2017-07-01

A pre-requisite to clustering noisy data, such as gene-expression data, is the filtering step. As an alternative to this step, the ctsGE R-package applies a sorting step in which all of the data are divided into small groups. The groups are divided according to how the time points are related to the time-series median. Then clustering is performed separately on each group. Thus, the clustering is done in two steps. First, an expression index (i.e. a sequence of 1, -1 and 0) is defined and genes with the same index are grouped together, and then each group of genes is clustered by k-means to create subgroups. The ctsGE package also provides an interactive tool to visualize and explore the gene-expression patterns and their subclusters. ctsGE proposes a way of organizing and exploring expression data without eliminating valuable information. Freely available as part of the Bioconductor project at https://bioconductor.org/packages/ctsGE/. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
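The first of the two steps described in this abstract, building an expression index and grouping genes that share it, can be illustrated with a minimal Python sketch. This is not the ctsGE implementation (ctsGE is an R package). The function names, the scaling by the population standard deviation, and the `cutoff` value are assumptions made only for illustration. The second step, k-means within each group, is left out.

```python
from collections import defaultdict
from statistics import median, pstdev


def expression_index(profile, cutoff=0.7):
    """Return a tuple of 1, -1 and 0, one entry per time point.

    Each value is centred on the profile's median and scaled by its spread
    (an assumed choice); points more than `cutoff` above or below the median
    map to 1 or -1, everything else maps to 0.
    """
    centre = median(profile)
    spread = pstdev(profile) or 1.0  # guard against completely flat profiles
    index = []
    for value in profile:
        z = (value - centre) / spread
        index.append(1 if z > cutoff else -1 if z < -cutoff else 0)
    return tuple(index)


def group_by_index(profiles, cutoff=0.7):
    """Group gene names by their shared expression index."""
    groups = defaultdict(list)
    for gene, profile in profiles.items():
        groups[expression_index(profile, cutoff)].append(gene)
    return dict(groups)


# Hypothetical four-point time series for three genes.
profiles = {
    "geneA": [1.0, 1.1, 5.0, 5.2],
    "geneB": [2.0, 2.1, 9.0, 9.5],
    "geneC": [7.0, 6.8, 1.0, 1.2],
}
groups = group_by_index(profiles)
```

Here geneA and geneB rise over time and share one index, while geneC falls and lands in a separate group; each group would then be sub-clustered on its own.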
Tissue-specific impact of FADS cluster variants on FADS1 and FADS2 gene expression.

PubMed

Reynolds, Lindsay M; Howard, Timothy D; Ruczinski, Ingo; Kanchan, Kanika; Seeds, Michael C; Mathias, Rasika A; Chilton, Floyd H

2018-01-01

Omega-6 (n-6) and omega-3 (n-3) long (≥ 20 carbon) chain polyunsaturated fatty acids (LC-PUFAs) play a critical role in human health and disease. Biosynthesis of LC-PUFAs from dietary 18 carbon PUFAs in tissues such as the liver is highly associated with genetic variation within the fatty acid desaturase (FADS) gene cluster, containing FADS1 and FADS2 that encode the rate-limiting desaturation enzymes in the LC-PUFA biosynthesis pathway. However, the molecular mechanisms by which FADS genetic variants affect LC-PUFA biosynthesis, and in which tissues, are unclear. The current study examined associations between common single nucleotide polymorphisms (SNPs) within the FADS gene cluster and FADS1 and FADS2 gene expression in 44 different human tissues (sample sizes ranging from 70 to 361) from the Genotype-Tissue Expression (GTEx) Project. FADS1 and FADS2 expression were detected in all 44 tissues. Significant cis-eQTLs (within 1 megabase of each gene, False Discovery Rate, FDR<0.05, as defined by GTEx) were identified in 12 tissues for FADS1 gene expression and 23 tissues for FADS2 gene expression. Six tissues had significant (FDR<0.05) eQTLs associated with both FADS1 and FADS2 (including artery, esophagus, heart, muscle, nerve, and thyroid). Interestingly, the identified eQTLs were consistently found to be associated in opposite directions for FADS1 and FADS2 expression. Taken together, findings from this study suggest common SNPs within the FADS gene cluster impact the transcription of FADS1 and FADS2 in numerous tissues and raise important questions about how the inverse expression of these two genes impacts intermediate molecular phenotypes (such as LC-PUFA and LC-PUFA-containing glycerolipid levels) and ultimately clinical phenotypes associated with inflammatory diseases and brain health.
GEM2Net: from gene expression modeling to -omics networks, a new CATdb module to investigate Arabidopsis thaliana genes involved in stress response.

PubMed

Zaag, Rim; Tamby, Jean Philippe; Guichard, Cécile; Tariq, Zakia; Rigaill, Guillem; Delannoy, Etienne; Renou, Jean-Pierre; Balzergue, Sandrine; Mary-Huard, Tristan; Aubourg, Sébastien; Martin-Magniette, Marie-Laure; Brunaud, Véronique

2015-01-01

CATdb (http://urgv.evry.inra.fr/CATdb) is a database providing public access to a large collection of transcriptomic data, mainly for Arabidopsis but also for other plants. This resource has the rare advantage of containing several thousand microarray experiments obtained with the same technical protocol and analyzed by the same statistical pipelines. In this paper, we present GEM2Net, a new module of CATdb that takes advantage of this homogeneous dataset to mine co-expression units and decipher Arabidopsis gene functions. GEM2Net explores 387 stress conditions organized into 18 biotic and abiotic stress categories. For each one, a model-based clustering is applied on expression differences to identify clusters of co-expressed genes. To characterize functions associated with these clusters, various resources are analyzed and integrated: Gene Ontology, subcellular localization of proteins, Hormone Families, Transcription Factor Families and a refined stress-related gene list associated with publications. Exploiting protein-protein interactions and transcription factor-target interactions makes it possible to display gene networks. GEM2Net presents the analysis of the 18 stress categories, in which 17,264 genes are involved and organized within 681 co-expression clusters. The meta-data analyses were stored and organized to compose a dynamic Web resource. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Distinct mechanism of activation of two transcription factors, AmyR and MalR, involved in amylolytic enzyme production in Aspergillus oryzae.

PubMed

Suzuki, Kuta; Tanaka, Mizuki; Konno, Yui; Ichikawa, Takanori; Ichinose, Sakurako; Hasegawa-Shiro, Sachiko; Shintani, Takahiro; Gomi, Katsuya

2015-02-01

The production of amylolytic enzymes in Aspergillus oryzae is induced in the presence of starch or maltose, and two Zn2Cys6-type transcription factors, AmyR and MalR, are involved in this regulation. AmyR directly regulates the expression of amylase genes, and MalR controls the expression of maltose-utilizing (MAL) cluster genes. Deletion of the malR gene resulted in poor growth on starch medium and a reduction in the α-amylase production level. To elucidate the activation mechanisms of these two transcription factors in amylase production, the expression profiles of amylases and MAL cluster genes under carbon catabolite derepression conditions and the subcellular localization of these transcription factors fused with a green fluorescent protein (GFP) were examined. Glucose, maltose, and isomaltose induced the expression of amylase genes, and GFP-AmyR was translocated from the cytoplasm to the nucleus after the addition of these sugars. Rapid induction of amylase gene expression and nuclear localization of GFP-AmyR by isomaltose suggested that this sugar was the strongest inducer for AmyR activation. In contrast, GFP-MalR was constitutively localized in the nucleus and the expression of MAL cluster genes was induced by maltose, but not by glucose or isomaltose. In the presence of maltose, the expression of amylase genes was preceded by MAL cluster gene expression. Furthermore, deletion of the malR gene resulted in a significant decrease in the α-amylase activity induced by maltose, but apparently had no effect on the expression of α-amylase genes in the presence of isomaltose. These results suggested that activation of AmyR and MalR is regulated in a different manner, and the preceding activation of MalR is essential for the utilization of maltose as an inducer for AmyR activation.
Imprinted gene expression in fetal growth and development.

PubMed

Lambertini, L; Marsit, C J; Sharma, P; Maccani, M; Ma, Y; Hu, J; Chen, J

2012-06-01

Experimental studies showed that genomic imprinting is fundamental in fetoplacental development through timely regulation of the expression of the imprinted genes to oversee a set of events determining placenta implantation, growth and embryogenesis. We examined the expression profile of 22 imprinted genes which have been linked to pregnancy abnormalities that may ultimately influence childhood development. The study was conducted in a subset of 106 placenta samples, overrepresented with small and large for gestational age cases, from the Rhode Island Child Health Study. We investigated associations between imprinted gene expression and three fetal development parameters: newborn head circumference, birth weight, and size for gestational age. Results from our investigation show that the maternally imprinted/paternally expressed gene ZNF331 inversely associates with each parameter to drive smaller fetal size, while the paternally imprinted/maternally expressed gene SLC22A18 directly associates with the newborn head circumference, promoting growth. Multidimensional Scaling analysis revealed two clusters within the 22 imprinted genes which are independently associated with fetoplacental development. Our data suggest that cluster 1 genes work by assuring cell growth and tissue development, while cluster 2 genes act by coordinating these processes. Results from this epidemiologic study offer solid support for the key role of imprinting in fetoplacental development. Copyright © 2012 Elsevier Ltd. All rights reserved.
A ground truth based comparative study on clustering of gene expression data.

PubMed

Zhu, Yitan; Wang, Zuyi; Miller, David J; Clarke, Robert; Xuan, Jianhua; Hoffman, Eric P; Wang, Yue

2008-05-01

Given the variety of available clustering methods for gene expression data analysis, it is important to develop an appropriate and rigorous validation scheme to assess the performance and limitations of the most widely used clustering algorithms. In this paper, we present a ground truth based comparative study on the functionality, accuracy, and stability of five data clustering methods, namely hierarchical clustering, K-means clustering, self-organizing maps, standard finite normal mixture fitting, and a caBIG toolkit (VIsual Statistical Data Analyzer--VISDA), tested on sample clustering of seven published microarray gene expression datasets and one synthetic dataset. We examined the performance of these algorithms in both data-sufficient and data-insufficient cases using quantitative performance measures, including cluster number detection accuracy and mean and standard deviation of partition accuracy. The experimental results showed that VISDA, an interactive coarse-to-fine maximum likelihood fitting algorithm, is a solid performer on most of the datasets, while K-means clustering and self-organizing maps optimized by the mean squared compactness criterion generally produce more stable solutions than the other methods.
Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets

PubMed Central

2014-01-01

Background Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on Human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624
Hox cluster disintegration with persistent anteroposterior order of expression in Oikopleura dioica.

PubMed

Seo, Hee-Chan; Edvardsen, Rolf Brudvik; Maeland, Anne Dorthea; Bjordal, Marianne; Jensen, Marit Flo; Hansen, Anette; Flaat, Mette; Weissenbach, Jean; Lehrach, Hans; Wincker, Patrick; Reinhardt, Richard; Chourrout, Daniel

2004-09-02

Tunicate embryos and larvae have small cell numbers and simple anatomical features in comparison with other chordates, including vertebrates. Although they branch near the base of chordate phylogenetic trees, their degree of divergence from the common chordate ancestor remains difficult to evaluate. Here we show that the tunicate Oikopleura dioica has a complement of nine Hox genes in which all central genes are lacking but a full vertebrate-like set of posterior genes is present. In contrast to all bilaterians studied so far, Hox genes are not clustered in the Oikopleura genome. Their expression occurs mostly in the tail, with some tissue preference, and a strong partition of expression domains in the nerve cord, in the notochord and in the muscle. In each tissue of the tail, the anteroposterior order of Hox gene expression evokes spatial collinearity, with several alterations. We propose a relationship between the Hox cluster breakdown, the separation of Hox expression domains, and a transition to a determinative mode of development.
Comparison of two schemes for automatic keyword extraction from MEDLINE for functional gene clustering.

PubMed

Liu, Ying; Ciliax, Brian J; Borges, Karin; Dasigi, Venu; Ram, Ashwin; Navathe, Shamkant B; Dingledine, Ray

2004-01-01

One of the key challenges of microarray studies is to derive biological insights from the unprecedented quantities of data on gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the nature of the functional links among genes within the derived clusters. However, the quality of the keyword lists extracted from biomedical literature for each gene significantly affects the clustering results. We extracted keywords from MEDLINE that describe the most prominent functions of the genes, and used the resulting weights of the keywords as feature vectors for gene clustering. By analyzing the resulting cluster quality, we compared two keyword weighting schemes: normalized z-score and term frequency-inverse document frequency (TFIDF). The best combination of background comparison set, stop list and stemming algorithm was selected based on precision and recall metrics. In a test set of four known gene groups, a hierarchical algorithm correctly assigned 25 of 26 genes to the appropriate clusters based on keywords extracted by the TFIDF weighting scheme, but only 23 of 26 with the z-score method. To evaluate the effectiveness of the weighting schemes for keyword extraction for gene clusters from microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle were used as a second test set. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords had higher purity, lower entropy, and higher mutual information than those produced from normalized z-score weighted keywords. The optimized algorithms should be useful for sorting genes from microarray lists into functionally discrete clusters.
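As a rough illustration of the TFIDF weighting named in this abstract, the sketch below computes textbook TF-IDF weights for per-gene keyword lists. It assumes the simplest variant (relative term frequency multiplied by the natural log of inverse document frequency, with each gene's keyword list treated as one document). The abstract does not give the exact formula, background set, stop list, or stemming the authors used. The gene names and keywords are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(gene_keywords):
    """Map each gene to {keyword: tf * idf} over per-gene keyword lists."""
    n_genes = len(gene_keywords)
    # Document frequency: number of genes whose list contains the keyword.
    doc_freq = Counter()
    for words in gene_keywords.values():
        doc_freq.update(set(words))
    weights = {}
    for gene, words in gene_keywords.items():
        counts = Counter(words)
        total = len(words)
        weights[gene] = {
            word: (count / total) * math.log(n_genes / doc_freq[word])
            for word, count in counts.items()
        }
    return weights


# Hypothetical keyword lists, e.g. extracted from abstracts linked to each gene.
gene_keywords = {
    "gene1": ["protein", "kinase", "kinase", "membrane"],
    "gene2": ["protein", "kinase", "synapse"],
    "gene3": ["protein", "membrane", "membrane"],
}
weights = tfidf_weights(gene_keywords)
```

A keyword that appears for every gene ("protein" here) receives zero weight, while a keyword unique to one gene ("synapse") is weighted most heavily. The resulting per-gene weight dictionaries are the kind of feature vector a clustering algorithm could take as input.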
Functional Analysis of Mating Type Genes and Transcriptome Analysis during Fruiting Body Development of Botrytis cinerea

PubMed Central

2018-01-01

Botrytis cinerea is a plant-pathogenic fungus producing apothecia as sexual fruiting bodies. To study the function of mating type (MAT) genes, single-gene deletion mutants were generated in both genes of the MAT1-1 locus and both genes of the MAT1-2 locus. Deletion mutants in two MAT genes were entirely sterile, while mutants in the other two MAT genes were able to develop stipes but never formed an apothecial disk. Little was known about the reprogramming of gene expression during apothecium development. We analyzed transcriptomes of sclerotia, three stages of apothecium development (primordia, stipes, and apothecial disks), and ascospores by RNA sequencing. Ten secondary metabolite gene clusters were upregulated at the onset of sexual development and downregulated in ascospores released from apothecia. Notably, more than 3,900 genes were differentially expressed in ascospores compared to mature apothecial disks. Among the genes that were upregulated in ascospores were numerous genes encoding virulence factors, which reveals that ascospores are transcriptionally primed for infection prior to their arrival on a host plant. Strikingly, the massive transcriptional changes at the initiation and completion of the sexual cycle often affected clusters of genes, rather than randomly dispersed genes. Thirty-five clusters of genes were jointly upregulated during the onset of sexual reproduction, while 99 clusters of genes (comprising >900 genes) were jointly downregulated in ascospores. These transcriptional changes coincided with changes in expression of genes encoding enzymes participating in chromatin organization, hinting at the occurrence of massive epigenetic regulation of gene expression during sexual reproduction. PMID:29440571
Genome-Wide Transcriptional Profiling of Clostridium perfringens SM101 during Sporulation Extends the Core of Putative Sporulation Genes and Genes Determining Spore Properties and Germination Characteristics

PubMed Central

Xiao, Yinghua; van Hijum, Sacha A. F. T.; Abee, Tjakko; Wells-Bennik, Marjon H. J.

2015-01-01

The formation of bacterial spores is a highly regulated process and the ultimate properties of the spores are determined during sporulation and subsequent maturation. A wide variety of genes that are expressed during sporulation determine spore properties such as resistance to heat and other adverse environmental conditions, dormancy and germination responses. In this study we characterized the sporulation phases of C. perfringens enterotoxic strain SM101 based on morphological characteristics, biomass accumulation (OD600), the total viable counts of cells plus spores, the viable count of heat resistant spores alone, the pH of the supernatant, enterotoxin production and dipicolinic acid accumulation. Subsequently, whole-genome expression profiling during key phases of the sporulation process was performed using DNA microarrays, and genes were clustered based on their time-course expression profiles during sporulation. The majority of previously characterized C. perfringens germination genes showed upregulated expression profiles in time during sporulation and belonged to two main clusters of genes. These clusters with up-regulated genes contained a large number of C. perfringens genes which are homologs of Bacillus genes with roles in sporulation and germination; this study therefore suggests that those homologs are functional in C. perfringens. A comprehensive homology search revealed that approximately half of the upregulated genes in the two clusters are conserved within a broad range of sporeforming Firmicutes. Another 30% of upregulated genes in the two clusters were found only in Clostridium species, while the remaining 20% appeared to be specific for C. perfringens. These newly identified genes may add to the repertoire of genes with roles in sporulation and determining spore properties including germination behavior. Their exact roles remain to be elucidated in future studies. PMID:25978838
Identification of the Regulator Gene Responsible for the Acetone-Responsive Expression of the Binuclear Iron Monooxygenase Gene Cluster in Mycobacteria

PubMed Central

Furuya, Toshiki; Hirose, Satomi; Semba, Hisashi; Kino, Kuniki

2011-01-01

The mimABCD gene cluster encodes the binuclear iron monooxygenase that oxidizes propane and phenol in Mycobacterium smegmatis strain MC2 155 and Mycobacterium goodii strain 12523. Interestingly, expression of the mimABCD gene cluster is induced by acetone. In this study, we investigated the regulator gene responsible for this acetone-responsive expression. In the genome sequence of M. smegmatis strain MC2 155, the mimABCD gene cluster is preceded by a gene designated mimR, which is divergently transcribed. Sequence analysis revealed that MimR exhibits amino acid similarity with the NtrC family of transcriptional activators, including AcxR and AcoR, which are involved in acetone and acetoin metabolism, respectively. Unexpectedly, many homologs of the mimR gene were also found in the sequenced genomes of actinomycetes. A plasmid carrying a transcriptional fusion of the intergenic region between the mimR and mimA genes with a promoterless green fluorescent protein (GFP) gene was constructed and introduced into M. smegmatis strain MC2 155. Using a GFP reporter system, we confirmed by deletion and complementation analyses that the mimR gene product is the positive regulator of the mimABCD gene cluster expression that is responsive to acetone. M. goodii strain 12523 also utilized the same regulatory system as M. smegmatis strain MC2 155. Although transcriptional activators of the NtrC family generally control transcription using the σ54 factor, a gene encoding the σ54 factor was absent from the genome sequence of M. smegmatis strain MC2 155. These results suggest the presence of a novel regulatory system in actinomycetes, including mycobacteria. PMID:21856847
Finding new pathway-specific regulators by clustering method using threshold standard deviation based on DNA chip data of Streptomyces coelicolor.

PubMed

Yang, Yung-Hun; Kim, Ji-Nu; Song, Eunjung; Kim, Eunjung; Oh, Min-Kyu; Kim, Byung-Gee

2008-09-01

In order to identify the regulators involved in antibiotic production or time-specific cellular events, the messenger ribonucleic acid (mRNA) expression data of the two gene clusters, actinorhodin (ACT) and undecylprodigiosin (RED) biosynthetic genes, were clustered with known mRNA expression data of regulators from S. coelicolor using a filtering method based on standard deviation and clustering analysis. The result identified five regulators, including two well-known regulators, namely SCO3579 (WlbA) and SCO6722 (SsgD). Using overexpression and deletion of the regulator genes, we were able to identify two regulators, i.e., SCO0608 and SCO6808, playing roles as repressors in antibiotic production and sporulation. This approach can be easily applied to mapping out new regulators related to any target gene clusters of interest showing characteristic expression patterns. The result can also be used to provide insight into the selection rules among a large number of regulators.
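The filtering-then-clustering idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' procedure: the threshold values, the use of Pearson correlation as the similarity measure, and the names `candidate_regulators`, `regA`, `regB` and `regC` are all assumptions made for illustration, and the expression values are invented toy data.

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def candidate_regulators(regulators, target_profile, sd_threshold=0.5, min_corr=0.8):
    """Drop regulators whose profile barely varies, then keep those that track
    (or mirror) the target cluster profile. Both thresholds are assumed values."""
    hits = {}
    for name, profile in regulators.items():
        if pstdev(profile) < sd_threshold:  # standard-deviation filter
            continue
        r = pearson(profile, target_profile)
        if abs(r) >= min_corr:  # strongly co- or anti-expressed
            hits[name] = r
    return hits

# Toy time-course data (hypothetical values, five time points).
target = [0.1, 0.2, 1.5, 3.0, 3.2]           # mean profile of a target gene cluster
regs = {
    "regA": [0.0, 0.1, 1.0, 2.0, 2.2],       # rises with the target
    "regB": [1.0, 1.05, 0.95, 1.0, 1.02],    # nearly flat, removed by the filter
    "regC": [3.0, 2.8, 1.2, 0.3, 0.1],       # falls as the target rises
}
hits = candidate_regulators(regs, target)
```

In this toy setting a negatively correlated regulator such as `regC` would be a candidate repressor, but that reading is only suggestive; the study confirmed roles by overexpression and deletion experiments.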
Association between differential gene expression and body mass index among endometrial cancers from The Cancer Genome Atlas Project.

PubMed

Roque, Dario R; Makowski, Liza; Chen, Ting-Huei; Rashid, Naim; Hayes, D Neil; Bae-Jump, Victoria

2016-08-01

The Cancer Genome Atlas (TCGA) identified four integrated clusters for endometrial cancer (EC): POLE, MSI, CNL and CNH. We evaluated differences in gene expression profiles of obese and non-obese women with EC and examined the association of body mass index (BMI) within the clusters identified in TCGA. TCGA RNAseq data was used to identify genes related to increasing BMI among ECs. The POLE, MSI and CNL clusters were composed mostly of endometrioid EC. Patient BMI was compared between these three clusters with one-way ANOVA. Association between gene expression and BMI was also assessed while adjusting for potential confounding factors. p-Values testing the association between gene expression and BMI were adjusted for multiple hypothesis testing over the 20,531 genes considered. Mean BMI was statistically different between the ECs in the CNL (35.8) versus POLE (29.8) cluster (p=0.006) and approached significance for the MSI (33.0) versus CNL (35.8) cluster (p=0.05). 181 genes were significantly up- or down-regulated with increasing BMI in endometrioid EC (q-value<0.01), including LPL, IRS-1, IGFBP4, IGFBP7 and the progesterone receptor. DAVID functional annotation analysis revealed significant enrichment in "cell cycle" (adjusted p-value=1.5E-5) and "DNA metabolic processes" (adjusted p-value=1E-3) for the identified genes. Obesity-related genes were found to be upregulated with increasing BMI among endometrioid ECs. Patients with POLE tumors have the lowest median BMI when compared to MSI and CNL. Given the heterogeneity among endometrioid EC, consideration should be given to abandoning the Type I and II classification of EC tumors. Copyright © 2016 Elsevier Inc. All rights reserved.
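The abstract above reports p-values adjusted for multiple testing over 20,531 genes but does not name the procedure. As a sketch only, the following shows the Benjamini-Hochberg false discovery rate adjustment, which is one common choice; treating it as the method used in the study is an assumption, and the p-values below are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values for association with BMI.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
```

For real analyses an established implementation such as `multipletests(..., method="fdr_bh")` from statsmodels is preferable to hand-written code.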
The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster

PubMed Central

Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.

2013-01-01

The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role of secondary metabolite gene clusters and their metabolites in fungal biology. PMID:23818858
The role of HPV RNA transcription, immune response-related gene expression and disruptive TP53 mutations in diagnostic and prognostic profiling of head and neck cancer.

PubMed

Wichmann, Gunnar; Rosolowski, Maciej; Krohn, Knut; Kreuz, Markus; Boehm, Andreas; Reiche, Anett; Scharrer, Ulrike; Halama, Dirk; Bertolini, Julia; Bauer, Ulrike; Holzinger, Dana; Pawlita, Michael; Hess, Jochen; Engel, Christoph; Hasenclever, Dirk; Scholz, Markus; Ahnert, Peter; Kirsten, Holger; Hemprich, Alexander; Wittekind, Christian; Herbarth, Olf; Horn, Friedemann; Dietz, Andreas; Loeffler, Markus

2015-12-15

Stratification of head and neck squamous cell carcinomas (HNSCC) based on HPV16 DNA and RNA status, gene expression patterns, and mutated candidate genes may facilitate patient treatment decision. We characterize head and neck squamous cell carcinomas (HNSCC) with different HPV16 DNA and RNA (E6*I) status from 290 consecutively recruited patients by gene expression profiling and targeted sequencing of 50 genes. We show that tumors with transcriptionally inactive HPV16 (DNA+ RNA-) are similar to HPV-negative (DNA-) tumors regarding gene expression and frequency of TP53 mutations (47%, 8/17 and 43%, 72/167, respectively). We also find that an immune response-related gene expression cluster is associated with lymph node metastasis, independent of HPV16 status and that disruptive TP53 mutations are associated with lymph node metastasis in HPV16 DNA- tumors. We validate each of these associations in another large data set. Four gene expression clusters which we identify differ moderately but significantly in overall survival. Our findings underscore the importance of measuring the HPV16 RNA (E6*I) and TP53-mutation status for patient stratification and identify associations of an immune response-related gene expression cluster and TP53 mutations with lymph node metastasis in HNSCC. © 2015 UICC.
A brain-specific gene cluster isolated from the region of the mouse obesity locus is expressed in the adult hypothalamus and during mouse development

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laig-Webster, M.; Lim, M.E.; Chehab, F.F.

1994-09-01

The molecular defect underlying an autosomal recessive form of genetic obesity in a classical mouse model C57 BL/6J-ob/ob has not yet been elucidated. Whereas metabolic and physiological disturbances such as diabetes and hypertension are associated with obesity, the site of expression and the nature of the primary lesion responsible for this cascade of events remains elusive. Our efforts aimed at the positional cloning of the ob gene by YAC contig mapping and gene identification have resulted in the cloning of a brain-specific gene cluster from the ob critical region. The expression of this gene cluster is remarkably complex owing to the multitude of brain-specific mRNA transcripts detected on Northern blots. cDNA cloning of these transcripts suggests that they are expressed from different genes as well as by alternate splicing mechanisms. Furthermore, the genomic organization of the cluster appears to consist of at least two identical promoters displaying CpG islands characteristic of housekeeping genes, yet clearly involving tissue-specific expression. Sense and anti-sense synthetic RNA probes were derived from a common DNA sequence on 3 cDNA clones and hybridized to 8-16 days mouse embryonic stages and mouse adult brain sections. Expression in development was noticeable as of the 11th day of gestation and confined to the central nervous system mainly in the telencephalon and spinal cord. Coronal and sagittal sections of the adult mouse brain showed expression only in 3 different regions of the brain stem. In situ hybridization to mouse hypothalamus sections revealed the presence of a localized and specialized group of cells expressing high levels of mRNA, suggesting that this gene cluster may also be involved in the regulation of hypothalamic activities. The hypothalamus has long been hypothesized as a primary candidate tissue for the expression of the obesity gene mainly because of its well-established role in the regulation of energy metabolism and food intake.
Neighboring Genes Show Correlated Evolution in Gene Expression.

PubMed

Ghanbarian, Avazeh T; Hurst, Laurence D

2015-07-01

When considering the evolution of a gene's expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Use of keyword hierarchies to interpret gene expression patterns.

PubMed

Masys, D R; Welsh, J B; Lynn Fink, J; Gribskov, M; Klacansky, I; Corbeil, J

2001-04-01

High-density microarray technology permits the quantitative and simultaneous monitoring of thousands of genes. The interpretation challenge is to extract relevant information from this large amount of data. A growing variety of statistical analysis approaches are available to identify clusters of genes that share common expression characteristics, but provide no information regarding the biological similarities of genes within clusters. The published literature provides a potential source of information to assist in interpretation of clustering results. We describe a data mining method that uses indexing terms ('keywords') from the published literature linked to specific genes to present a view of the conceptual similarity of genes within a cluster or group of interest. The method takes advantage of the hierarchical nature of Medical Subject Headings used to index citations in the MEDLINE database, and the registry numbers applied to enzymes.
Cloning and heterologous expression of genes from the kinamycin biosynthetic pathway of Streptomyces murayamaensis.

PubMed

Gould, S J; Hong, S T; Carney, J R

1998-01-01

The genes for most of the biosynthesis of the kinamycin antibiotics have been cloned and heterologously expressed. Genomic DNA of Streptomyces murayamaensis was partially digested with MboI and a library of approximately 40 kb fragments in E. coli XL1-BlueMR was prepared using the cosmid vector pOJ446. Hybridization with the actI probe from the actinorhodin polyketide synthase genes identified two clusters of polyketide genes. After transferal of these clusters to S. lividans ZX7, expression of one cluster was established by HPLC with photodiode array detection. Peaks were identified from the kin cluster for dehydrorabelomycin, kinobscurinone, and stealthin C, which are known intermediates in kinamycin biosynthesis. Two shunt metabolites, kinafluorenone and seongomycin were also identified. The structure of the latter was determined from a quantity obtained from large-scale fermentation of one of the clones.
An efficient method to identify differentially expressed genes in microarray experiments

PubMed Central

Qin, Huaizhen; Feng, Tao; Harding, Scott A.; Tsai, Chung-Jui; Zhang, Shuanglin

2013-01-01

Motivation Microarray experiments typically analyze thousands to tens of thousands of genes from small numbers of biological replicates. The fact that genes are normally expressed in functionally relevant patterns suggests that gene-expression data can be stratified and clustered into relatively homogenous groups. Cluster-wise dimensionality reduction should make it feasible to improve screening power while minimizing information loss. Results We propose a powerful and computationally simple method for finding differentially expressed genes in small microarray experiments. The method incorporates a novel stratification-based tight clustering algorithm, principal component analysis and information pooling. Comprehensive simulations show that our method is substantially more powerful than the popular SAM and eBayes approaches. We applied the method to three real microarray datasets: one from a Populus nitrogen stress experiment with 3 biological replicates; and two from public microarray datasets of human cancers with 10 to 40 biological replicates. In all three analyses, our method proved more robust than the popular alternatives for identification of differentially expressed genes. Availability The C++ code to implement the proposed method is available upon request for academic use. PMID:18453554
Analysis of the Nicotiana tabacum Stigma/Style Transcriptome Reveals Gene Expression Differences between Wet and Dry Stigma Species

PubMed Central

Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.

2009-01-01

The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150

Delayed inflammatory mRNA and protein expression after spinal cord injury

PubMed Central

2011-01-01

Background Spinal cord injury (SCI) induces secondary tissue damage that is associated with inflammation. We have previously demonstrated that inflammation-related gene expression after SCI occurs in two waves - an initial cluster that is acutely and transiently up-regulated within 24 hours, and a more delayed cluster that peaks between 72 hours and 7 days. Here we extend the microarray analysis of these gene clusters up to 6 months post-SCI. Methods Adult male rats were subjected to mild, moderate or severe spinal cord contusion injury at T9 using a well-characterized weight-drop model. Tissue from the lesion epicenter was obtained 4 hours, 24 hours, 7 days, 28 days, 3 months or 6 months post-injury and processed for microarray analysis and protein expression. Results Anchor gene analysis using C1qB revealed a cluster of genes that showed elevated expression through 6 months post-injury, including galectin-3, p22PHOX, gp91PHOX, CD53 and progranulin. The expression of these genes occurred primarily in microglia/macrophage cells and was confirmed at the protein level using both immunohistochemistry and western blotting. As p22PHOX and gp91PHOX are components of the NADPH oxidase enzyme, enzymatic activity and its role in SCI were assessed and NADPH oxidase activity was found to be significantly up-regulated through 6 months post-injury. Further, treating rats with the nonspecific, irreversible NADPH oxidase inhibitor diphenylene iodinium (DPI) reduced both lesion volume and expression of chronic gene cluster proteins one month after trauma. Conclusions These data demonstrate that inflammation-related genes are chronically up-regulated after SCI and may contribute to further tissue loss. PMID:21975064
Transcriptome database resource and gene expression atlas for the rose

PubMed Central

2012-01-01

Background For centuries roses have been selected based on a number of traits. Little information exists on the genetic and molecular basis that contributes to these traits, mainly because information on expressed genes for this economically important ornamental plant is scarce. Results Here, we used a combination of Illumina and 454 sequencing technologies to generate information on Rosa sp. transcripts using RNA from various tissues and in response to biotic and abiotic stresses. A total of 80714 transcript clusters were identified and 76611 peptides have been predicted among which 20997 have been clustered into 13900 protein families. BLASTp hits in closely related Rosaceae species revealed that about half of the predicted peptides in the strawberry and peach genomes have orthologs in Rosa dataset. Digital expression was obtained using RNA samples from organs at different development stages and under different stress conditions. qPCR validated the digital expression data for a selection of 23 genes with high or low expression levels. Comparative gene expression analyses between the different tissues and organs allowed the identification of clusters that are highly enriched in given tissues or under particular conditions, demonstrating the usefulness of the digital gene expression analysis. A web interface ROSAseq was created that allows data interrogation by BLAST, subsequent analysis of DNA clusters and access to thorough transcript annotation including best BLAST matches on Fragaria vesca, Prunus persica and Arabidopsis. The rose peptides dataset was used to create the ROSAcyc resource pathway database that allows access to the putative genes and enzymatic pathways. Conclusions The study provides useful information on Rosa expressed genes, with thorough annotation and an overview of expression patterns for transcripts with good accuracy. PMID:23164410
Oxidative stress gene expression profile in inbred mouse after ischemia/reperfusion small bowel injury.

PubMed

Bertoletto, Paulo Roberto; Ikejiri, Adauto Tsutomu; Somaio Neto, Frederico; Chaves, José Carlos; Teruya, Roberto; Bertoletto, Eduardo Rodrigues; Taha, Murched Omar; Fagundes, Djalma José

2012-11-01

To determine the profile of gene expression associated with oxidative stress and thereby help establish parameters for the role of enzyme clusters related to ischemia/reperfusion intestinal injury. Twelve male inbred mice (C57BL/6) were randomly assigned to two groups: a Control Group (CG) submitted to anesthesia and laparotomy and observed for 120 min; and an Ischemia/reperfusion Group (IRG) submitted to anesthesia, laparotomy, 60 min of small bowel ischemia and 60 min of reperfusion. A pool of six samples was submitted to the qPCR-RT protocol (six clusters) for mouse oxidative stress and antioxidant defense pathways. Of the 84 genes investigated, 64 (76.2%) showed statistically significant expression and 20 (23.8%) showed no statistical difference from the control group. Of these 64 significantly expressed genes, 60 (93.7%) were up-regulated and 4 (6.3%) were down-regulated. In the group without statistically significant expression, 12 genes were up-regulated and 8 genes were down-regulated. Surprisingly, 37 genes (44.04%) showed a higher than threefold up-regulation, a level arbitrarily considered very significant. The remaining 47 genes (55.9%) were up-regulated less than threefold (35 genes, 41.6%) or down-regulated less than threefold (12 genes, 14.3%). Intestinal ischemia and reperfusion promote a global hyper-expression profile of six different gene clusters related to antioxidant defense and oxidative stress.
Transcription factor clusters regulate genes in eukaryotic cells

PubMed Central

Hedlund, Erik G; Friemann, Rosmarie; Hohmann, Stefan

2017-01-01

Transcription is regulated through binding factors to gene promoters to activate or repress expression; however, the mechanisms by which factors find targets remain unclear. Using single-molecule fluorescence microscopy, we determined in vivo stoichiometry and spatiotemporal dynamics of a GFP tagged repressor, Mig1, from a paradigm signaling pathway of Saccharomyces cerevisiae. We find the repressor operates in clusters, which upon extracellular signal detection, translocate from the cytoplasm, bind to nuclear targets and turn over. Simulations of Mig1 configuration within a 3D yeast genome model combined with a promoter-specific, fluorescent translation reporter confirmed clusters are the functional unit of gene regulation. In vitro and structural analysis on reconstituted Mig1 suggests that clusters are stabilized by depletion forces between intrinsically disordered sequences. We observed similar clusters of a co-regulatory activator from a different pathway, supporting a generalized cluster model for transcription factors that reduces promoter search times through intersegment transfer while stabilizing gene expression. PMID:28841133
Clustering gene expression data based on predicted differential effects of GV interaction.

PubMed

Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

2005-02-01

Microarray has become a popular biotechnology in biological and medical research. However, systematic and stochastic variabilities in microarray data are expected and unavoidable, resulting in the problem that the raw measurements have inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed by various clustering methods directly, which may introduce biased interpretation in identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into various variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.
A transversal approach to predict gene product networks from ontology-based similarity

PubMed Central

Chabalier, Julie; Mosser, Jean; Burgun, Anita

2007-01-01

Background Interpretation of transcriptomic data is usually made through a "standard" approach which consists in clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to underline functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and data expression. Results The transversal approach presented in this paper consists in computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity, and data expression. PMID:17605807
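The Vector Space Model comparison described above can be sketched as follows. The abstract does not specify the weighting scheme, so the inverse-document-frequency style weight used here is an assumption, as are the function names and the toy annotations; the gene identifiers are placeholders rather than genes from the study.

```python
import math

def term_weights(annotations):
    """Weight each annotation term by how rare it is across gene products.
    The log(n / count) + 1 form is an assumed, IDF-like weighting."""
    n = len(annotations)
    counts = {}
    for terms in annotations.values():
        for term in set(terms):
            counts[term] = counts.get(term, 0) + 1
    return {term: math.log(n / c) + 1.0 for term, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(annotations):
    """Pairwise similarity of gene products from their annotation vectors."""
    weights = term_weights(annotations)
    vectors = {g: {t: weights[t] for t in set(ts)} for g, ts in annotations.items()}
    genes = sorted(vectors)
    return {(a, b): cosine(vectors[a], vectors[b]) for a in genes for b in genes}

# Hypothetical GO-style process annotations for three gene products.
annotations = {
    "geneA": ["lipid transport", "cell cycle"],
    "geneB": ["lipid transport"],
    "geneC": ["DNA repair"],
}
sim = similarity_matrix(annotations)
```

A matrix like `sim` could then be thresholded to draw edges between gene products, which is roughly how a similarity matrix becomes a network; the threshold and the combination with expression data are left out of this sketch.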
Transcriptional profiles of Arabidopsis stomataless mutants reveal developmental and physiological features of life in the absence of stomata

PubMed Central

de Marcos, Alberto; Triviño, Magdalena; Pérez-Bueno, María Luisa; Ballesteros, Isabel; Barón, Matilde; Mena, Montaña; Fenoll, Carmen

2015-01-01

Loss of function of the positive stomata development regulators SPCH or MUTE in Arabidopsis thaliana renders stomataless plants; spch-3 and mute-3 mutants are extreme dwarfs, but produce cotyledons and tiny leaves, providing a system to interrogate plant life in the absence of stomata. To this end, we compared their cotyledon transcriptomes with that of wild-type plants. K-means clustering of differentially expressed genes generated four clusters: clusters 1 and 2 grouped genes commonly regulated in the mutants, while clusters 3 and 4 contained genes distinctively regulated in mute-3. Classification in functional categories and metabolic pathways of genes in clusters 1 and 2 suggested that both mutants had depressed secondary, nitrogen and sulfur metabolisms, while only a few photosynthesis-related genes were down-regulated. In situ quenching analysis of chlorophyll fluorescence revealed limited inhibition of photosynthesis. This and other fluorescence measurements matched the mutant transcriptomic features. Differential transcriptomes of both mutants were enriched in growth-related genes, including known stomata development regulators, which paralleled their epidermal phenotypes. Analysis of cluster 3 was not informative for developmental aspects of mute-3. Cluster 4 comprised genes differentially up-regulated in mute-3, 35% of which were direct targets for SPCH and may relate to the unique cell types of mute-3. A screen of T-DNA insertion lines in genes differentially expressed in the mutants identified a gene putatively involved in stomata development. A collection of lines for conditional overexpression of transcription factors differentially expressed in the mutants rendered distinct epidermal phenotypes, suggesting that these proteins may be novel stomatal development regulators. Thus, our transcriptome analysis represents a useful source of new genes for the study of stomata development and for characterizing physiology and growth in the absence of stomata. PMID:26157447
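To make the K-means step in the abstract above concrete, here is a minimal pure-Python sketch. It assumes each gene is represented by two log fold-change values (one per mutant versus wild type), seeds the centroids from the first k points for reproducibility, and uses invented numbers; none of these choices comes from the study.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100):
    """Plain Lloyd-style K-means. Centroids start at the first k points,
    an assumed deterministic seeding that keeps the toy run reproducible."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical (spch-3, mute-3) log fold changes for eight genes.
genes = [
    (-2.0, -2.1), (2.0, 1.9), (0.1, -2.0), (0.0, 2.2),
    (-1.8, -2.0), (2.2, 2.1), (-0.1, -1.8), (0.1, 2.0),
]
labels, centroids = kmeans(genes, k=4)
```

In this toy layout the four groups correspond to genes down in both mutants, up in both, down only in the second mutant, and up only in the second mutant, loosely echoing the four clusters the abstract describes. Library implementations with randomized or k-means++ seeding are the usual choice for real data.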
A 6-gene signature identifies four molecular subgroups of neuroblastoma

PubMed Central

2011-01-01

Background There are currently three postulated genomic subtypes of the childhood tumour neuroblastoma (NB); Type 1, Type 2A, and Type 2B. The most aggressive forms of NB are characterized by amplification of the oncogene MYCN (MNA) and low expression of the favourable marker NTRK1. Recently, mutations or high expression of the familial predisposition gene Anaplastic Lymphoma Kinase (ALK) was associated to unfavourable biology of sporadic NB. Also, various other genes have been linked to NB pathogenesis. Results The present study explores subgroup discrimination by gene expression profiling using three published microarray studies on NB (47 samples). Four distinct clusters were identified by Principal Components Analysis (PCA) in two separate data sets, which could be verified by an unsupervised hierarchical clustering in a third independent data set (101 NB samples) using a set of 74 discriminative genes. The expression signature of six NB-associated genes ALK, BIRC5, CCND1, MYCN, NTRK1, and PHOX2B, significantly discriminated the four clusters (p < 0.05, one-way ANOVA test). PCA clusters p1, p2, and p3 were found to correspond well to the postulated subtypes 1, 2A, and 2B, respectively. Remarkably, a fourth novel cluster was detected in all three independent data sets. This cluster comprised mainly 11q-deleted MNA-negative tumours with low expression of ALK, BIRC5, and PHOX2B, and was significantly associated with higher tumour stage, poor outcome and poor survival compared to the Type 1-corresponding favourable group (INSS stage 4 and/or dead of disease, p < 0.05, Fisher's exact test). Conclusions Based on expression profiling we have identified four molecular subgroups of neuroblastoma, which can be distinguished by a 6-gene signature. The fourth subgroup has not been described elsewhere, and efforts are currently made to further investigate this group's specific characteristics. PMID:21492432
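The abstract above tests whether six genes discriminate four clusters using one-way ANOVA. As a sketch of what that test computes, the following derives the F statistic for one gene across four groups; the expression values and the name `one_way_anova_f` are invented for illustration and are not data from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical expression of one gene in samples from four clusters.
clusters = [[1, 2, 3], [2, 3, 4], [5, 6, 7], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(clusters)
```

Turning F into a p-value requires the F distribution with the two degrees of freedom; `scipy.stats.f_oneway` returns both the statistic and the p-value directly.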
Mining microbial metatranscriptomes for expression of antibiotic resistance genes under natural conditions.

PubMed

Versluis, Dennis; D'Andrea, Marco Maria; Ramiro Garcia, Javier; Leimena, Milkha M; Hugenholtz, Floor; Zhang, Jing; Öztürk, Başak; Nylund, Lotta; Sipkema, Detmer; van Schaik, Willem; de Vos, Willem M; Kleerebezem, Michiel; Smidt, Hauke; van Passel, Mark W J

2015-07-08

Antibiotic resistance genes are found in a broad range of ecological niches associated with complex microbiota. Here we investigated if resistance genes are not only present, but also transcribed under natural conditions. Furthermore, we examined the potential for antibiotic production by assessing the expression of associated secondary metabolite biosynthesis gene clusters. Metatranscriptome datasets from intestinal microbiota of four human adults, one human infant, 15 mice and six pigs, of which only the latter have received antibiotics prior to the study, as well as from sea bacterioplankton, a marine sponge, forest soil and sub-seafloor sediment, were investigated. We found that resistance genes are expressed in all studied ecological niches, albeit with niche-specific differences in relative expression levels and diversity of transcripts. For example, in mice and human infant microbiota predominantly tetracycline resistance genes were expressed while in human adult microbiota the spectrum of expressed genes was more diverse, and also included β-lactam, aminoglycoside and macrolide resistance genes. Resistance gene expression could result from the presence of natural antibiotics in the environment, although we could not link it to expression of corresponding secondary metabolites biosynthesis clusters. Alternatively, resistance gene expression could be constitutive, or these genes serve alternative roles besides antibiotic resistance.
Mining microbial metatranscriptomes for expression of antibiotic resistance genes under natural conditions

NASA Astrophysics Data System (ADS)

Versluis, Dennis; D'Andrea, Marco Maria; Ramiro Garcia, Javier; Leimena, Milkha M.; Hugenholtz, Floor; Zhang, Jing; Öztürk, Başak; Nylund, Lotta; Sipkema, Detmer; Schaik, Willem Van; de Vos, Willem M.; Kleerebezem, Michiel; Smidt, Hauke; Passel, Mark W. J. Van

2015-07-01

Antibiotic resistance genes are found in a broad range of ecological niches associated with complex microbiota. Here we investigated if resistance genes are not only present, but also transcribed under natural conditions. Furthermore, we examined the potential for antibiotic production by assessing the expression of associated secondary metabolite biosynthesis gene clusters. Metatranscriptome datasets from intestinal microbiota of four human adults, one human infant, 15 mice and six pigs, of which only the latter have received antibiotics prior to the study, as well as from sea bacterioplankton, a marine sponge, forest soil and sub-seafloor sediment, were investigated. We found that resistance genes are expressed in all studied ecological niches, albeit with niche-specific differences in relative expression levels and diversity of transcripts. For example, in mice and human infant microbiota predominantly tetracycline resistance genes were expressed while in human adult microbiota the spectrum of expressed genes was more diverse, and also included β-lactam, aminoglycoside and macrolide resistance genes. Resistance gene expression could result from the presence of natural antibiotics in the environment, although we could not link it to expression of corresponding secondary metabolites biosynthesis clusters. Alternatively, resistance gene expression could be constitutive, or these genes serve alternative roles besides antibiotic resistance.
Computational genetic neuroanatomy of the developing mouse brain: dimensionality reduction, visualization, and clustering.

PubMed

Ji, Shuiwang

2013-07-11

The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship.
A novel harmony search-K means hybrid algorithm for clustering gene expression data

PubMed Central

Nazeer, KA Abdul; Sebastian, MP; Kumar, SD Madhu

2013-01-01

Recent progress in bioinformatics research has led to the accumulation of huge quantities of biological data at various data sources. The DNA microarray technology makes it possible to simultaneously analyze large number of genes across different samples. Clustering of microarray data can reveal the hidden gene expression patterns from large quantities of expression data that in turn offers tremendous possibilities in functional genomics, comparative genomics, disease diagnosis and drug development. The k-means clustering algorithm is widely used for many practical applications. But the original k-means algorithm has several drawbacks. It is computationally expensive and generates locally optimal solutions based on the random choice of the initial centroids. Several methods have been proposed in the literature for improving the performance of the k-means algorithm. A meta-heuristic optimization algorithm named harmony search helps find out near-global optimal solutions by searching the entire solution space. Low clustering accuracy of the existing algorithms limits their use in many crucial applications of life sciences. In this paper we propose a novel Harmony Search-K means Hybrid (HSKH) algorithm for clustering the gene expression data. Experimental results show that the proposed algorithm produces clusters with better accuracy in comparison with the existing algorithms. PMID:23390351
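The drawback named in this abstract, that plain k-means converges to a local optimum determined by its initial centroids, can be shown on a toy data set. The sketch below is ordinary Lloyd's k-means in Python, not the HSKH algorithm from the paper. The points, the two initialisations, and the function names are made up for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centroids, max_iter=100):
    """Lloyd's k-means from the given initial centroids.

    Returns (final centroids, within-cluster sum of squares).
    """
    centroids = [tuple(float(x) for x in c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wcss

# Three well-separated groups of hypothetical two-dimensional profiles.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

# One starting centroid per group versus two starting centroids in the same group.
_, good_wcss = kmeans(points, [(0, 0), (10, 10), (20, 0)])
_, bad_wcss = kmeans(points, [(0, 0), (0, 1), (10, 10)])
print(good_wcss, bad_wcss)
```

The second start splits one true group in two and fuses the other two, and the algorithm cannot escape that configuration. This is the kind of failure that global search strategies such as harmony search, or simply many random restarts, are meant to reduce.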
A novel harmony search-K means hybrid algorithm for clustering gene expression data.

PubMed

Nazeer, Ka Abdul; Sebastian, Mp; Kumar, Sd Madhu

2013-01-01

Recent progress in bioinformatics research has led to the accumulation of huge quantities of biological data at various data sources. The DNA microarray technology makes it possible to simultaneously analyze large number of genes across different samples. Clustering of microarray data can reveal the hidden gene expression patterns from large quantities of expression data that in turn offers tremendous possibilities in functional genomics, comparative genomics, disease diagnosis and drug development. The k-means clustering algorithm is widely used for many practical applications. But the original k-means algorithm has several drawbacks. It is computationally expensive and generates locally optimal solutions based on the random choice of the initial centroids. Several methods have been proposed in the literature for improving the performance of the k-means algorithm. A meta-heuristic optimization algorithm named harmony search helps find out near-global optimal solutions by searching the entire solution space. Low clustering accuracy of the existing algorithms limits their use in many crucial applications of life sciences. In this paper we propose a novel Harmony Search-K means Hybrid (HSKH) algorithm for clustering the gene expression data. Experimental results show that the proposed algorithm produces clusters with better accuracy in comparison with the existing algorithms.
Muscle transcriptome profiling in divergent phenotype swine breeds during growth using microarray and RT-PCR tools.

PubMed

D'Andrea, M; Dal Monego, S; Pallavicini, A; Modonut, M; Dreos, R; Stefanon, B; Pilla, F

2011-10-01

Using an array consisting of 10 665 70-mer oligonucleotide probes, the longissimus dorsi muscle tissue expression during growth in nine pigs belonging to Casertana (CT), an autochthonous breed characterized by slow growth and a massive accumulation of backfat, was compared with that of two cosmopolitan breeds, Large White (LW) and a crossbreed (CB; Duroc × Landrace × Large White). The results were validated by real-time PCR. All animals were of the same age and were raised under the same environmental conditions. Muscle tissues were collected at 3, 6, 9 and 11 months of age, and a total of 173 genes showed significant differential expression between CT and the cosmopolitan genetic types at 3 months of age. Time series cluster analysis indicated that the CT breed had a different pattern of gene expression compared with that of the LW and the CB. Four of the eight clusters highlighted the gene differences between CT and the other two breeds, which were further supported by statistical analyses: clusters 4 and 5 contained a total of 71 genes that were underexpressed at 3 months of age, and cluster 3 and cluster 7 included 28 and 42 genes respectively that were overexpressed at 3 months of age. As expected, differentially expressed genes belonged to the category of genes coding for contractile fibres and transcription factors involved in muscle development and differentiation. These findings highlight muscle expression genes during pig growth and are useful to understand the genetic meaning of the different developmental rates. © 2011 The Authors, Animal Genetics © 2011 Stichting International Foundation for Animal Genetics.
Identification of an unusual type II thioesterase in the dithiolopyrrolone antibiotics biosynthetic pathway

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhai, Ying; Bai, Silei; Liu, Jingjing

Dithiolopyrrolone group antibiotics characterized by an electronically unique dithiolopyrrolone heterobicyclic core are known for their antibacterial, antifungal, insecticidal and antitumor activities. Recently the biosynthetic gene clusters for two dithiolopyrrolone compounds, holomycin and thiomarinol, have been identified respectively in different bacterial species. Here, we report a novel dithiolopyrrolone biosynthetic gene cluster (aut) isolated from Streptomyces thioluteus DSM 40027 which produces two pyrrothine derivatives, aureothricin and thiolutin. By comparison with other characterized dithiolopyrrolone clusters, eight genes in the aut cluster were verified to be responsible for the assembly of dithiolopyrrolone core. The aut cluster was further confirmed by heterologous expression and in-frame gene deletion experiments. Intriguingly, we found that the heterogenetic thioesterase HlmK derived from the holomycin (hlm) gene cluster in Streptomyces clavuligerus significantly improved heterologous biosynthesis of dithiolopyrrolones in Streptomyces albus through coexpression with the aut cluster. In the previous studies, HlmK was considered invalid because it has a Ser to Gly point mutation within the canonical Ser-His-Asp catalytic triad of thioesterases. However, gene inactivation and complementation experiments in our study unequivocally demonstrated that HlmK is an active distinctive type II thioesterase that plays a beneficial role in dithiolopyrrolone biosynthesis. Highlights: • Cloning of the aureothricin biosynthetic gene cluster from Streptomyces thioluteus DSM 40027. • Identification of the aureothricin gene cluster by heterologous expression and in-frame gene deletion. • The heterogenetic thioesterase HlmK significantly improved dithiolopyrrolones production of the aureothricin gene cluster. • Identification of HlmK as an unusual type II thioesterase.
A cluster merging method for time series microarray with production values.

PubMed

Chira, Camelia; Sedano, Javier; Camara, Monica; Prieto, Carlos; Villar, Jose R; Corchado, Emilio

2014-09-01

A challenging task in time-course microarray data analysis is to cluster genes meaningfully combining the information provided by multiple replicates covering the same key time points. This paper proposes a novel cluster merging method to accomplish this goal obtaining groups with highly correlated genes. The main idea behind the proposed method is to generate a clustering starting from groups created based on individual temporal series (representing different biological replicates measured in the same time points) and merging them by taking into account the frequency by which two genes are assembled together in each clustering. The gene groups at the level of individual time series are generated using several shape-based clustering methods. This study is focused on a real-world time series microarray task with the aim to find co-expressed genes related to the production and growth of a certain bacteria. The shape-based clustering methods used at the level of individual time series rely on identifying similar gene expression patterns over time which, in some models, are further matched to the pattern of production/growth. The proposed cluster merging method is able to produce meaningful gene groups which can be naturally ranked by the level of agreement on the clustering among individual time series. The list of clusters and genes is further sorted based on the information correlation coefficient and new problem-specific relevant measures. Computational experiments and results of the cluster merging method are analyzed from a biological perspective and further compared with the clustering generated based on the mean value of time series and the same shape-based algorithm.
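The core idea in this abstract, merging per-replicate clusterings according to how often two genes are grouped together, can be illustrated with a co-association frequency. The Python sketch below is a simplified stand-in, not the authors' method: genes are merged by taking connected components over pairs whose frequency reaches a threshold. The gene names, labels, and threshold are hypothetical.

```python
from itertools import combinations

def co_association(clusterings, genes):
    """Fraction of clusterings in which each gene pair shares a cluster."""
    counts = {pair: 0 for pair in combinations(genes, 2)}
    for labels in clusterings:
        for a, b in counts:
            if labels[a] == labels[b]:
                counts[(a, b)] += 1
    return {pair: c / len(clusterings) for pair, c in counts.items()}

def merge_clusters(clusterings, genes, threshold):
    """Merge genes linked by pairs whose co-association is >= threshold."""
    freq = co_association(clusterings, genes)
    parent = {g: g for g in genes}  # union-find forest

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for (a, b), f in freq.items():
        if f >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())

genes = ["g1", "g2", "g3", "g4"]
# One hypothetical clustering (gene -> cluster label) per biological replicate.
replicates = [
    {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    {"g1": 0, "g2": 0, "g3": 0, "g4": 1},
    {"g1": 1, "g2": 1, "g3": 0, "g4": 0},
]
freq = co_association(replicates, genes)
merged = merge_clusters(replicates, genes, threshold=0.6)
print(merged)  # [['g1', 'g2'], ['g3', 'g4']]
```

The frequency itself gives a natural ranking: g1 and g2 are grouped in every replicate, while g3 and g4 agree in only two of three, which mirrors the ranking by level of agreement described in the abstract.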
G-NEST: A gene neighborhood scoring tool to identify co-conserved, co-expressed genes

USDA-ARS's Scientific Manuscript database

In previous studies, gene neighborhoods--spatial clusters of co-expressed genes in the genome--have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Sc...
Gene expression pattern recognition algorithm inferences to classify samples exposed to chemical agents

NASA Astrophysics Data System (ADS)

Bushel, Pierre R.; Bennett, Lee; Hamadeh, Hisham; Green, James; Ableson, Alan; Misener, Steve; Paules, Richard; Afshari, Cynthia

2002-06-01

We present an analysis of pattern recognition procedures used to predict the classes of samples exposed to pharmacologic agents by comparing gene expression patterns from samples treated with two classes of compounds. Rat liver mRNA samples following exposure for 24 hours with phenobarbital or peroxisome proliferators were analyzed using a 1700 rat cDNA microarray platform. Sets of genes that were consistently differentially expressed in the rat liver samples following treatment were stored in the MicroArray Project System (MAPS) database. MAPS identified 238 genes in common that possessed a low probability (P < 0.01) of being randomly detected as differentially expressed at the 95% confidence level. Hierarchical cluster analysis on the 238 genes clustered specific gene expression profiles that separated samples based on exposure to a particular class of compound.
Complementary striped expression patterns of NK homeobox genes during segment formation in the annelid Platynereis.

PubMed

Saudemont, Alexandra; Dray, Nicolas; Hudry, Bruno; Le Gouar, Martine; Vervoort, Michel; Balavoine, Guillaume

2008-05-15

NK genes are related pan-metazoan homeobox genes. In the fruitfly, NK genes are clustered and involved in patterning various mesodermal derivatives during embryogenesis. It was therefore suggested that the NK cluster emerged in evolution as an ancestral mesodermal patterning cluster. To test this hypothesis, we cloned and analysed the expression patterns of the homologues of NK cluster genes Msx, NK4, NK3, Lbx, Tlx, NK1 and NK5 in the marine annelid Platynereis dumerilii, a representative of trochozoans, the third great branch of bilaterian animals alongside deuterostomes and ecdysozoans. We found that most of these genes are involved, as they are in the fly, in the specification of distinct mesodermal derivatives, notably subsets of muscle precursors. The expression of the homologue of NK4/tinman in the pulsatile dorsal vessel of Platynereis strongly supports the hypothesis that the vertebrate heart derived from a dorsal vessel relocated to a ventral position by D/V axis inversion in a chordate ancestor. Additionally and more surprisingly, NK4, Lbx, Msx, Tlx and NK1 orthologues are expressed in complementary sets of stripes in the ectoderm and/or mesoderm of forming segments, suggesting an involvement in the segment formation process. A potentially ancient role of the NK cluster genes in segment formation, unsuspected from vertebrate and fruitfly studies so far, now deserves to be investigated in other bilaterian species, especially non-insect arthropods and onychophorans.
Applying Multivariate Adaptive Splines to Identify Genes With Expressions Varying After Diagnosis in Microarray Experiments.

PubMed

Duan, Fenghai; Xu, Ye

2017-01-01

To analyze a microarray experiment to identify the genes with expressions varying after the diagnosis of breast cancer. A total of 44 928 probe sets in an Affymetrix microarray data publicly available on Gene Expression Omnibus from 249 patients with breast cancer were analyzed by the nonparametric multivariate adaptive splines. Then, the identified genes with turning points were grouped by K-means clustering, and their network relationship was subsequently analyzed by the Ingenuity Pathway Analysis. In total, 1640 probe sets (genes) were reliably identified to have turning points along with the age at diagnosis in their expression profiling, of which 927 expressed lower after turning points and 713 expressed higher after the turning points. K-means clustered them into 3 groups with turning points centering at 54, 62.5, and 72, respectively. The pathway analysis showed that the identified genes were actively involved in various cancer-related functions or networks. In this article, we applied the nonparametric multivariate adaptive splines method to a publicly available gene expression data and successfully identified genes with expressions varying before and after breast cancer diagnosis.

An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

PubMed

Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E

2017-04-12

Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn ). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (x3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs) (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
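The refinement step described in this abstract can be pictured as a k-means-style loop that starts from an existing module partition and moves each gene to the module it fits best. The Python sketch below is an illustration under stated assumptions and does not reproduce the km2gcn package: the module centroid is taken to be the mean expression profile of its genes, a simplification, and fit is measured by Pearson correlation. All gene names, module names, and values are invented.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def refine_modules(expr, modules, max_iter=20):
    """Reassign each gene to the module whose centroid it correlates with best."""
    modules = dict(modules)
    for _ in range(max_iter):
        names = sorted(set(modules.values()))
        # Assumed centroid: per-sample mean over the module's current genes.
        centroids = {
            m: [mean(col) for col in zip(*(expr[g] for g in expr if modules[g] == m))]
            for m in names
        }
        updated = {
            g: max(names, key=lambda m: pearson(expr[g], centroids[m]))
            for g in expr
        }
        if updated == modules:  # no gene moved: partition is stable
            break
        modules = updated
    return modules

# Hypothetical expression of five genes across four samples.
expr = {
    "gA": [1.0, 2.0, 3.0, 4.0],
    "gB": [2.0, 4.0, 6.0, 8.0],
    "gC": [4.0, 3.0, 2.0, 1.0],
    "gD": [8.0, 6.0, 4.0, 2.0],
    "gE": [1.0, 2.0, 3.0, 5.0],  # rises like gA and gB but starts in M2
}
initial = {"gA": "M1", "gB": "M1", "gC": "M2", "gD": "M2", "gE": "M2"}
refined = refine_modules(expr, initial)
print(refined)
```

In this toy case the only change is that gE moves from M2 to M1, which corresponds to the "few or zero misplaced genes" property the abstract reports after refinement.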
Assembly and features of secondary metabolite biosynthetic gene clusters in Streptomyces ansochromogenes.

PubMed

Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong

2013-07-01

A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in either or all of the three different media tested, whereas the other 15 gene clusters were silent in all three different media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.
Microarray characterization of gene expression changes in blood during acute ethanol exposure

PubMed Central

2013-01-01

Background As part of the civil aviation safety program to define the adverse effects of ethanol on flying performance, we performed a DNA microarray analysis of human whole blood samples from a five-time point study of subjects administered ethanol orally, followed by breathalyzer analysis, to monitor blood alcohol concentration (BAC) to discover significant gene expression changes in response to the ethanol exposure. Methods Subjects were administered either orange juice or orange juice with ethanol. Blood samples were taken based on BAC and total RNA was isolated from PaxGene™ blood tubes. The amplified cDNA was used in microarray and quantitative real-time polymerase chain reaction (RT-qPCR) analyses to evaluate differential gene expression. Microarray data was analyzed in a pipeline fashion to summarize and normalize and the results evaluated for relative expression across time points with multiple methods. Candidate genes showing distinctive expression patterns in response to ethanol were clustered by pattern and further analyzed for related function, pathway membership and common transcription factor binding within and across clusters. RT-qPCR was used with representative genes to confirm relative transcript levels across time to those detected in microarrays. Results Microarray analysis of samples representing 0%, 0.04%, 0.08%, return to 0.04%, and 0.02% wt/vol BAC showed that changes in gene expression could be detected across the time course. The expression changes were verified by qRT-PCR. The candidate genes of interest (GOI) identified from the microarray analysis and clustered by expression pattern across the five BAC points showed seven coordinately expressed groups. Analysis showed function-based networks, shared transcription factor binding sites and signaling pathways for members of the clusters. These include hematological functions, innate immunity and inflammation functions, metabolic functions expected of ethanol metabolism, and pancreatic and hepatic function. Five of the seven clusters showed links to the p38 MAPK pathway. Conclusions The results of this study provide a first look at changing gene expression patterns in human blood during an acute rise in blood ethanol concentration and its depletion because of metabolism and excretion, and demonstrate that it is possible to detect changes in gene expression using total RNA isolated from whole blood. The analysis approach for this study serves as a workflow to investigate the biology linked to expression changes across a time course and from these changes, to identify target genes that could serve as biomarkers linked to pilot performance. PMID:23883607
Biased immunoglobulin light chain gene usage in the shark

PubMed Central

Iacoangeli, Anna; Lui, Anita; Naik, Ushma; Ohta, Yuko; Flajnik, Martin; Hsu, Ellen

2015-01-01

This study of a large family of kappa light (L) chain clusters in nurse shark completes the characterization of its classical immunoglobulin (Ig) gene content (two heavy chain classes, mu and omega, and four L chain isotypes, kappa, lambda, sigma, and sigma-2). The shark kappa clusters are minigenes consisting of a simple VL-JL-CL array, where V to J recombination occurs over a ~500 bp interval, and functional clusters are widely separated by at least 100 kb. Six out of ca. 39 kappa clusters are pre-rearranged in the germline (GL-joined). Unlike the complex gene organization and multistep assembly process of Ig in mammals, each shark Ig rearrangement, somatic or in the germline, appears to be an independent event localized to the minigene. This study examined the expression of functional, non-productive, and sterile transcripts of the kappa clusters compared to the other three L chain isotypes. Kappa cluster usage was investigated in young sharks, and a skewed pattern of split gene expression was observed, one similar in functional and non-productive rearrangements. These results show that the individual activation of the spatially distant kappa clusters is non-random. Although both split and GL-joined kappa genes are expressed, the latter are prominent in young animals and wane with age. We speculate that, in the shark, the differential activation of the multiple isotypes can be advantageously used in receptor editing. PMID:26342033
Biased Immunoglobulin Light Chain Gene Usage in the Shark.

PubMed

Iacoangeli, Anna; Lui, Anita; Naik, Ushma; Ohta, Yuko; Flajnik, Martin; Hsu, Ellen

2015-10-15

This study of a large family of κ L chain clusters in nurse shark completes the characterization of its classical Ig gene content (two H chain isotypes, μ and ω, and four L chain isotypes, κ, λ, σ, and σ-2). The shark κ clusters are minigenes consisting of a simple VL-JL-CL array, where V to J recombination occurs over an ~500-bp interval, and functional clusters are widely separated by at least 100 kb. Six out of ~39 κ clusters are prerearranged in the germline (germline joined). Unlike the complex gene organization and multistep assembly process of Ig in mammals, each shark Ig rearrangement, somatic or in the germline, appears to be an independent event localized to the minigene. This study examined the expression of functional, nonproductive, and sterile transcripts of the κ clusters compared with the other three L chain isotypes. κ cluster usage was investigated in young sharks, and a skewed pattern of split gene expression was observed, one similar in functional and nonproductive rearrangements. These results show that the individual activation of the spatially distant κ clusters is nonrandom. Although both split and germline-joined κ genes are expressed, the latter are prominent in young animals and wane with age. We speculate that, in the shark, the differential activation of the multiple isotypes can be advantageously used in receptor editing. Copyright © 2015 by The American Association of Immunologists, Inc.
Molecular analysis of SCARECROW genes expressed in white lupin cluster roots

PubMed Central

Sbabou, Laila; Bucciarelli, Bruna; Miller, Susan; Liu, Junqi; Berhada, Fatiha; Filali-Maltouf, Abdelkarim; Allan, Deborah; Vance, Carroll

2010-01-01

The Scarecrow (SCR) transcription factor plays a crucial role in root cell radial patterning and is required for maintenance of the quiescent centre and differentiation of the endodermis. In response to phosphorus (P) deficiency, white lupin (Lupinus albus L.) root surface area increases some 50-fold to 70-fold due to the development of cluster (proteoid) roots. Previously it was reported that SCR-like expressed sequence tags (ESTs) were expressed during early cluster root development. Here the cloning of two white lupin SCR genes, LaSCR1 and LaSCR2, is reported. The predicted amino acid sequences of both LaSCR gene products are highly similar to AtSCR and contain C-terminal conserved GRAS family domains. LaSCR1 and LaSCR2 transcript accumulation localized to the endodermis of both normal and cluster roots as shown by in situ hybridization and gene promoter::reporter staining. Transcript analysis as evaluated by quantitative real-time-PCR (qRT-PCR) and RNA gel hybridization indicated that the two LaSCR genes are expressed predominantly in roots. Expression of LaSCR genes was not directly responsive to the P status of the plant but was a function of cluster root development. Suppression of LaSCR1 in transformed roots of lupin and Medicago via RNAi (RNA interference) delivered through Agrobacterium rhizogenes resulted in decreased root numbers, reflecting the potential role of LaSCR1 in maintaining root growth in these species. The results suggest that the functional orthologues of AtSCR have been characterized. PMID:20167612
An improved Pearson's correlation proximity-based hierarchical clustering for mining biological association between genes.

PubMed

Booma, P M; Prabhakaran, S; Dhanalakshmi, R

2014-01-01

Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process was not efficiently addressed. To monitor a higher rate of expression levels between genes, a hierarchical clustering model was proposed, where the biological association between genes is measured simultaneously using a proximity measure of improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify association between the clusters. Moreover, GL-PCPHC applies a pattern-growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.
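Hierarchical clustering with a Pearson-correlation proximity and average linkage, the baseline this abstract builds on, can be sketched in a few lines. The Python example below uses the standard Pearson coefficient with distance 1 - r. It does not implement the "improved" correlation or the Seed Augment algorithm from the paper, and the gene names and profiles are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with distance 1 - r and average linkage."""
    clusters = [[name] for name in profiles]

    def dist(a, b):
        return 1.0 - pearson(profiles[a], profiles[b])

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean pairwise distance between two clusters.
                d = mean(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical expression profiles across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 9.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [9.0, 6.0, 5.0, 2.0],
}
result = average_linkage(profiles, n_clusters=2)
print(result)  # [['geneA', 'geneB'], ['geneC', 'geneD']]
```

Because the distance depends on correlation rather than magnitude, geneA and geneB group together despite their different scales, which is the usual motivation for correlation-based proximity in expression data.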
An Improved Pearson's Correlation Proximity-Based Hierarchical Clustering for Mining Biological Association between Genes

PubMed Central

Booma, P. M.; Prabhakaran, S.; Dhanalakshmi, R.

2014-01-01

Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process was not efficiently addressed. To monitor a higher rate of expression levels between genes, a hierarchical clustering model was proposed, where the biological association between genes is measured simultaneously using a proximity measure of improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify association between the clusters. Moreover, GL-PCPHC applies a pattern-growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality. PMID:25136661
Gene Discovery in Bladder Cancer Progression using cDNA Microarrays

PubMed Central

Sanchez-Carbayo, Marta; Socci, Nicholas D.; Lozano, Juan Jose; Li, Wentian; Charytonowicz, Elizabeth; Belbin, Thomas J.; Prystowsky, Michael B.; Ortiz, Angel R.; Childs, Geoffrey; Cordon-Cardo, Carlos

2003-01-01

To identify gene expression changes along progression of bladder cancer, we compared the expression profiles of early-stage and advanced bladder tumors using cDNA microarrays containing 17,842 known genes and expressed sequence tags. The application of bootstrapping techniques to hierarchical clustering segregated early-stage and invasive transitional carcinomas into two main clusters. Multidimensional analysis confirmed these clusters and more importantly, it separated carcinoma in situ from papillary superficial lesions and subgroups within early-stage and invasive tumors displaying different overall survival. Additionally, it recognized early-stage tumors showing gene profiles similar to invasive disease. Different techniques including standard t-test, single-gene logistic regression, and support vector machine algorithms were applied to identify relevant genes involved in bladder cancer progression. Cytokeratin 20, neuropilin-2, p21, and p33ING1 were selected among the top ranked molecular targets differentially expressed and validated by immunohistochemistry using tissue microarrays (n = 173). Their expression patterns were significantly associated with pathological stage, tumor grade, and altered retinoblastoma (RB) expression. Moreover, p33ING1 expression levels were significantly associated with overall survival. Analysis of the annotation of the most significant genes revealed the relevance of critical genes and pathways during bladder cancer progression, including the overexpression of oncogenic genes such as DEK in superficial tumors or immune response genes such as Cd86 antigen in invasive disease. Gene profiling successfully classified bladder tumors based on their progression and clinical outcome. The present study has identified molecular biomarkers of potential clinical significance and critical molecular targets associated with bladder cancer progression. PMID:12875971
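One of the gene-ranking techniques named in the record above is a standard t-test. A minimal sketch of ranking genes by the absolute pooled-variance two-sample t statistic might look as follows; the gene labels and expression values are hypothetical toy data, and the study's actual preprocessing, thresholds, and multiple-testing handling are not described in the abstract and are not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group1, group2):
    """Two-sample t statistic with pooled variance (equal-variance assumption)."""
    n1, n2 = len(group1), len(group2)
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled * (1 / n1 + 1 / n2))


# Hypothetical expression values for three genes in two tumor groups.
early = {
    "gene1": [5.0, 5.2, 4.9, 5.1],
    "gene2": [2.0, 2.1, 1.9, 2.0],
    "gene3": [3.0, 3.5, 2.5, 3.1],
}
invasive = {
    "gene1": [5.1, 5.0, 5.2, 4.9],
    "gene2": [4.0, 4.2, 3.9, 4.1],
    "gene3": [3.2, 2.8, 3.4, 3.0],
}

# Rank genes from most to least differentially expressed by |t|.
ranked = sorted(early, key=lambda g: abs(student_t(early[g], invasive[g])), reverse=True)
print(ranked)
```

In this toy example gene2 ranks first because its group means differ by about two units with very little within-group spread.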
Platypus globin genes and flanking loci suggest a new insertional model for beta-globin evolution in birds and mammals.

PubMed

Patel, Vidushi S; Cooper, Steven J B; Deakin, Janine E; Fulton, Bob; Graves, Tina; Warren, Wesley C; Wilson, Richard K; Graves, Jennifer A M

2008-07-25

Vertebrate alpha (α)- and beta (β)-globin gene families exemplify the way in which genomes evolve to produce functional complexity. From tandem duplication of a single globin locus, the alpha- and beta-globin clusters expanded, and then were separated onto different chromosomes. The previous finding of a fossil beta-globin gene (omega) in the marsupial alpha-cluster, however, suggested that duplication of the alpha-beta cluster onto two chromosomes, followed by lineage-specific gene loss and duplication, produced paralogous alpha- and beta-globin clusters in birds and mammals. Here we analyse genomic data from an egg-laying monotreme mammal, the platypus (Ornithorhynchus anatinus), to explore haemoglobin evolution at the stem of the mammalian radiation. The platypus alpha-globin cluster (chromosome 21) contains embryonic and adult alpha-globin genes, a beta-like omega-globin gene, and the GBY globin gene with homology to cytoglobin, arranged as 5'-zeta-zeta'-alphaD-alpha3-alpha2-alpha1-omega-GBY-3'. The platypus beta-globin cluster (chromosome 2) contains single embryonic and adult globin genes arranged as 5'-epsilon-beta-3'. Surprisingly, all of these globin genes were expressed in some adult tissues. Comparison of flanking sequences revealed that all jawed vertebrate alpha-globin clusters are flanked by MPG-C16orf35 and LUC7L, whereas all bird and mammal beta-globin clusters are embedded in olfactory genes. Thus, the mammalian alpha- and beta-globin clusters are orthologous to the bird alpha- and beta-globin clusters respectively. We propose that alpha- and beta-globin clusters evolved from an ancient MPG-C16orf35-alpha-beta-GBY-LUC7L arrangement 410 million years ago.
A copy of the original beta (represented by omega in marsupials and monotremes) was inserted into an array of olfactory genes before the amniote radiation (>315 million years ago), then duplicated and diverged to form orthologous clusters of beta-globin genes with different expression profiles in different lineages.
Cloning and heterologous expression of the entire gene clusters for PD 116740 from Streptomyces strain WP 4669 and tetrangulol and tetrangomycin from Streptomyces rimosus NRRL 3016.

PubMed Central

Hong, S T; Carney, J R; Gould, S J

1997-01-01

The genes for the complete pathways for two polycyclic aromatic polyketides of the angucyclinone class have been cloned and heterologously expressed. Genomic DNAs of Streptomyces rimosus NRRL 3016 and Streptomyces strain WP 4669 were partially digested with MboI, and libraries (ca. 40-kb fragments) in Escherichia coli XL1-Blue MR were prepared with the cosmid vector pOJ446. Hybridization with the actI probe from the actinorhodin polyketide synthase genes identified two clusters of polyketide genes from each organism. After transfer of the four clusters to Streptomyces lividans TK24, expression of one cluster from each organism was established through the identification of pathway-specific products by high-performance liquid chromatography with photodiode array detection. Peaks were identified from the S. rimosus cluster (pksRIM-1) for tetrangulol, tetrangomycin, and fridamycin E. Peaks were identified from the WP 4669 cluster (pksWP-2) for tetrangulol, 19-hydroxytetrangulol, 8-O-methyltetrangulol, 19-hydroxy-8-O-methyltetrangulol, and PD 116740. Structures were confirmed by 1H nuclear magnetic resonance spectroscopy and high-resolution mass spectrometry. PMID:8990300
Cloning and heterologous expression of the entire gene clusters for PD 116740 from Streptomyces strain WP 4669 and tetrangulol and tetrangomycin from Streptomyces rimosus NRRL 3016.

PubMed

Hong, S T; Carney, J R; Gould, S J

1997-01-01

The genes for the complete pathways for two polycyclic aromatic polyketides of the angucyclinone class have been cloned and heterologously expressed. Genomic DNAs of Streptomyces rimosus NRRL 3016 and Streptomyces strain WP 4669 were partially digested with MboI, and libraries (ca. 40-kb fragments) in Escherichia coli XL1-Blue MR were prepared with the cosmid vector pOJ446. Hybridization with the actI probe from the actinorhodin polyketide synthase genes identified two clusters of polyketide genes from each organism. After transfer of the four clusters to Streptomyces lividans TK24, expression of one cluster from each organism was established through the identification of pathway-specific products by high-performance liquid chromatography with photodiode array detection. Peaks were identified from the S. rimosus cluster (pksRIM-1) for tetrangulol, tetrangomycin, and fridamycin E. Peaks were identified from the WP 4669 cluster (pksWP-2) for tetrangulol, 19-hydroxytetrangulol, 8-O-methyltetrangulol, 19-hydroxy-8-O-methyltetrangulol, and PD 116740. Structures were confirmed by 1H nuclear magnetic resonance spectroscopy and high-resolution mass spectrometry.
MO-DE-207B-03: Improved Cancer Classification Using Patient-Specific Biological Pathway Information Via Gene Expression Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Young, M; Craft, D

Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchical clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various ‘signatures’ for conventional and functional cancer classification. Results: PICS successfully regularized large dimension gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically-significant survival groups. For a suboptimally-debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and pantothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods.
Conclusion: This work validates the PICS method to improve cancer classification using biological pathways. Patients are classified with greater specificity and physiological relevance as compared to current gene-specific approaches. Focus now moves to utilizing PICS for pan-cancer patient-specific treatment response prediction.
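The first step described in the record above, collapsing a gene-by-patient expression matrix m into a pathway-by-patient matrix p, can be sketched as follows. This is an illustration under stated assumptions only: the per-pathway mean used here is a simple stand-in and is not one of the PCA, NTC, or GED scores named in the abstract, and the gene names, pathway names, and values are hypothetical.

```python
from statistics import mean


def pathway_scores(expression, pathways):
    """Collapse gene-level values into one score per pathway and patient.

    expression: gene name -> list of per-patient expression values
    pathways:   pathway name -> list of member gene names
    The score is the mean over measured member genes (an assumed placeholder).
    """
    n_patients = len(next(iter(expression.values())))
    scores = {}
    for name, genes in pathways.items():
        present = [g for g in genes if g in expression]
        if not present:
            continue  # skip pathways with no measured genes
        scores[name] = [mean(expression[g][i] for g in present)
                        for i in range(n_patients)]
    return scores


# Hypothetical toy cohort: three genes measured in two patients.
expression = {"g1": [1.0, 2.0], "g2": [3.0, 4.0], "g3": [10.0, 0.0]}
pathways = {"P1": ["g1", "g2"], "P2": ["g3", "g_missing"], "P3": ["g_unmeasured"]}

p = pathway_scores(expression, pathways)
print(p)
```

The resulting rows of p could then be passed to any k-means or hierarchical clustering routine, which is the second step the abstract describes.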
Intact cluster and chordate-like expression of ParaHox genes in a sea star

PubMed Central

2013-01-01

Background The ParaHox genes are thought to be major players in patterning the gut of several bilaterian taxa. Though this is a fundamental role that these transcription factors play, their activities are not limited to the endoderm and extend to both ectodermal and mesodermal tissues. Three genes compose the ParaHox group: Gsx, Xlox and Cdx. In some taxa (mostly chordates but to some degree also in protostomes) the three genes are arranged into a genomic cluster, in a similar fashion to what has been shown for the better-known Hox genes. Sea urchins possess the full complement of ParaHox genes but they are all dispersed throughout the genome, an arrangement that, perhaps, represented the primitive condition for all echinoderms. In order to understand the evolutionary history of this group of genes we cloned and characterized all ParaHox genes, studied their expression patterns and identified their genomic loci in a member of an earlier branching group of echinoderms, the asteroid Patiria miniata. Results We identified the three ParaHox orthologs in the genome of P. miniata. While one of them, PmGsx is provided as maternal message, with no zygotic activation afterwards, the other two, PmLox and PmCdx are expressed during embryogenesis, within restricted domains of both endoderm and ectoderm. Screening of a Patiria bacterial artificial chromosome (BAC) library led to the identification of a clone containing the three genes. The transcriptional directions of PmGsx and PmLox are opposed to that of the PmCdx gene within the cluster. Conclusions The identification of P. miniata ParaHox genes has revealed the fact that these genes are clustered in the genome, in contrast to what has been reported for echinoids. Since the presence of an intact cluster, or at least a partial cluster, has been reported in chordates and polychaetes respectively, it becomes clear that within echinoderms, sea urchins have modified the original bilaterian arrangement. 
Moreover, the sea star ParaHox domains of expression show chordate-like features not found in the sea urchin, confirming that the dynamics of gene expression for the respective genes and their putative regulatory interactions have clearly changed over evolutionary time within the echinoid lineage. PMID:23803323
Chassis optimization as a cornerstone for the application of synthetic biology based strategies in microbial secondary metabolism.

PubMed

Beites, Tiago; Mendes, Marta V

2015-01-01

The increased number of bacterial genome sequencing projects has generated over the last years a large reservoir of genomic information. In silico analysis of this genomic data has renewed the interest in bacterial bioprospecting for bioactive compounds by unveiling novel biosynthetic gene clusters of unknown or uncharacterized metabolites. However, only a small fraction of those metabolites is produced under laboratory-controlled conditions; the remaining clusters represent a pool of novel metabolites that are waiting to be "awakened". Activation of the biosynthetic gene clusters that present reduced or no expression (known as cryptic or silent clusters) by heterologous expression has emerged as a strategy for the identification and production of novel bioactive molecules. Synthetic biology, with engineering principles at its core, provides an excellent framework for the development of efficient heterologous systems for the expression of biosynthetic gene clusters. However, a common problem in its application is the host-interference problem, i.e., the unpredictable interactions between the device and the host that can hamper the desired output. Although an effort has been made to develop orthogonal devices, the most proficient way to overcome the host-interference problem is through genome simplification. In this review we present an overview of the strategies and tools used in the development of hosts/chassis for the heterologous expression of specialized-metabolite biosynthetic gene clusters. Finally, we introduce the concept of specialized host as the next step of development of expression hosts.
Identification of an Imprinted Gene Cluster in the X-Inactivation Center

PubMed Central

Kobayashi, Shin; Totoki, Yasushi; Soma, Miki; Matsumoto, Kazuya; Fujihara, Yoshitaka; Toyoda, Atsushi; Sakaki, Yoshiyuki; Okabe, Masaru; Ishino, Fumitoshi

2013-01-01

Mammalian development is strongly influenced by the epigenetic phenomenon called genomic imprinting, in which either the paternal or the maternal allele of imprinted genes is expressed. Paternally expressed Xist, an imprinted gene, has been considered as a single cis-acting factor to inactivate the paternally inherited X chromosome (Xp) in preimplantation mouse embryos. This means that X-chromosome inactivation also entails gene imprinting at a very early developmental stage. However, the precise mechanism of imprinted X-chromosome inactivation remains unknown and there is little information about imprinted genes on X chromosomes. In this study, we examined whether there are other imprinted genes than Xist expressed from the inactive paternal X chromosome and expressed in female embryos at the preimplantation stage. We focused on small RNAs and compared their expression patterns between sexes by tagging the female X chromosome with green fluorescent protein. As a result, we identified two micro (mi)RNAs–miR-374-5p and miR-421-3p–mapped adjacent to Xist that were predominantly expressed in female blastocysts. Allelic expression analysis revealed that these miRNAs were indeed imprinted and expressed from the Xp. Further analysis of the imprinting status of adjacent locus led to the discovery of a large cluster of imprinted genes expressed from the Xp: Jpx, Ftx and Zcchc13. To our knowledge, this is the first identified cluster of imprinted genes in the cis-acting regulatory region termed the X-inactivation center. This finding may help in understanding the molecular mechanisms regulating imprinted X-chromosome inactivation during early mammalian development. PMID:23940725
Identification of an imprinted gene cluster in the X-inactivation center.

PubMed

Kobayashi, Shin; Totoki, Yasushi; Soma, Miki; Matsumoto, Kazuya; Fujihara, Yoshitaka; Toyoda, Atsushi; Sakaki, Yoshiyuki; Okabe, Masaru; Ishino, Fumitoshi

2013-01-01

Mammalian development is strongly influenced by the epigenetic phenomenon called genomic imprinting, in which either the paternal or the maternal allele of imprinted genes is expressed. Paternally expressed Xist, an imprinted gene, has been considered as a single cis-acting factor to inactivate the paternally inherited X chromosome (Xp) in preimplantation mouse embryos. This means that X-chromosome inactivation also entails gene imprinting at a very early developmental stage. However, the precise mechanism of imprinted X-chromosome inactivation remains unknown and there is little information about imprinted genes on X chromosomes. In this study, we examined whether there are imprinted genes other than Xist expressed from the inactive paternal X chromosome and expressed in female embryos at the preimplantation stage. We focused on small RNAs and compared their expression patterns between sexes by tagging the female X chromosome with green fluorescent protein. As a result, we identified two micro (mi)RNAs, miR-374-5p and miR-421-3p, mapped adjacent to Xist that were predominantly expressed in female blastocysts. Allelic expression analysis revealed that these miRNAs were indeed imprinted and expressed from the Xp. Further analysis of the imprinting status of adjacent locus led to the discovery of a large cluster of imprinted genes expressed from the Xp: Jpx, Ftx and Zcchc13. To our knowledge, this is the first identified cluster of imprinted genes in the cis-acting regulatory region termed the X-inactivation center. This finding may help in understanding the molecular mechanisms regulating imprinted X-chromosome inactivation during early mammalian development.
Mapping in an apple (Malus x domestica) F1 segregating population based on physical clustering of differentially expressed genes.

PubMed

Jensen, Philip J; Fazio, Gennaro; Altman, Naomi; Praul, Craig; McNellis, Timothy W

2014-04-04

Apple tree breeding is slow and difficult due to long generation times, self-incompatibility, and complex genetics. The identification of molecular markers linked to traits of interest is a way to expedite the breeding process. In the present study, we aimed to identify genes whose steady-state transcript abundance was associated with inheritance of specific traits segregating in an apple (Malus × domestica) rootstock F1 breeding population, including resistance to powdery mildew (Podosphaera leucotricha) disease and woolly apple aphid (Eriosoma lanigerum). Transcription profiling was performed for 48 individual F1 apple trees from a cross of two highly heterozygous parents, using RNA isolated from healthy, actively-growing shoot tips and a custom apple DNA oligonucleotide microarray representing 26,000 unique transcripts. Genome-wide expression profiles were not clear indicators of powdery mildew or woolly apple aphid resistance phenotype. However, standard differential gene expression analysis between phenotypic groups of trees revealed relatively small sets of genes with trait-associated expression levels. For example, thirty genes were identified that were differentially expressed between trees resistant and susceptible to powdery mildew. Interestingly, the genes encoding twenty-four of these transcripts were physically clustered on chromosome 12. Similarly, seven genes were identified that were differentially expressed between trees resistant and susceptible to woolly apple aphid, and the genes encoding five of these transcripts were also clustered, this time on chromosome 17. In each case, the gene clusters were in the vicinity of previously identified major quantitative trait loci for the corresponding trait. Similar results were obtained for a series of molecular traits. Several of the differentially expressed genes were used to develop DNA polymorphism markers linked to powdery mildew disease and woolly apple aphid resistance. 
Gene expression profiling and trait-associated transcript analysis using an apple F1 population readily identified genes physically linked to powdery mildew disease resistance and woolly apple aphid resistance loci. This result was especially useful in apple, where extreme levels of heterozygosity make the development of reliable DNA markers quite difficult. The results suggest that this approach could prove effective in crops with complicated genetics, or for which few genomic information resources are available.
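The observation in the record above, that most differentially expressed genes fell on a single chromosome, can be checked with a simple tally. The sketch below is illustrative only: the transcript identifiers and chromosome assignments are hypothetical toy data, and a real analysis would also compare the counts against the background gene density of each chromosome, which this sketch does not do.

```python
from collections import Counter


def chromosome_enrichment(de_genes, gene_to_chrom):
    """Return (chromosome, count, fraction) for the chromosome carrying
    the largest number of differentially expressed (DE) genes."""
    counts = Counter(gene_to_chrom[g] for g in de_genes if g in gene_to_chrom)
    chrom, n = counts.most_common(1)[0]
    return chrom, n, n / sum(counts.values())


# Hypothetical toy annotation: transcript -> chromosome number.
gene_to_chrom = {"t1": 12, "t2": 12, "t3": 12, "t4": 5, "t5": 17}
de_genes = ["t1", "t2", "t3", "t4"]

result = chromosome_enrichment(de_genes, gene_to_chrom)
print(result)
```

Here three of the four toy DE transcripts map to chromosome 12, so the function reports that chromosome with a fraction of 0.75.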
Mining microbial metatranscriptomes for expression of antibiotic resistance genes under natural conditions

PubMed Central

Versluis, Dennis; D’Andrea, Marco Maria; Ramiro Garcia, Javier; Leimena, Milkha M.; Hugenholtz, Floor; Zhang, Jing; Öztürk, Başak; Nylund, Lotta; Sipkema, Detmer; Schaik, Willem van; de Vos, Willem M.; Kleerebezem, Michiel; Smidt, Hauke; Passel, Mark W.J. van

2015-01-01

Antibiotic resistance genes are found in a broad range of ecological niches associated with complex microbiota. Here we investigated if resistance genes are not only present, but also transcribed under natural conditions. Furthermore, we examined the potential for antibiotic production by assessing the expression of associated secondary metabolite biosynthesis gene clusters. Metatranscriptome datasets from intestinal microbiota of four human adults, one human infant, 15 mice and six pigs, of which only the latter have received antibiotics prior to the study, as well as from sea bacterioplankton, a marine sponge, forest soil and sub-seafloor sediment, were investigated. We found that resistance genes are expressed in all studied ecological niches, albeit with niche-specific differences in relative expression levels and diversity of transcripts. For example, in mice and human infant microbiota predominantly tetracycline resistance genes were expressed while in human adult microbiota the spectrum of expressed genes was more diverse, and also included β-lactam, aminoglycoside and macrolide resistance genes. Resistance gene expression could result from the presence of natural antibiotics in the environment, although we could not link it to expression of corresponding secondary metabolites biosynthesis clusters. Alternatively, resistance gene expression could be constitutive, or these genes serve alternative roles besides antibiotic resistance. PMID:26153129
Non-Small-Cell Lung Cancer Molecular Signatures Recapitulate Lung Developmental Pathways

PubMed Central

Borczuk, Alain C.; Gorenstein, Lyall; Walter, Kristin L.; Assaad, Adel A.; Wang, Liqun; Powell, Charles A.

2003-01-01

Current paradigms hold that lung carcinomas arise from pluripotent stem cells capable of differentiation into one or several histological types. These paradigms suggest lung tumor cell ontogeny is determined by consequences of gene expression that recapitulate events important in embryonic lung development. Using oligonucleotide microarrays, we acquired gene profiles from 32 microdissected non-small-cell lung tumors. We determined the 100 top-ranked marker genes for adenocarcinoma, squamous cell, large cell, and carcinoid using nearest neighbor analysis. Results were validated by immunostaining for 11 selected proteins using a tissue microarray representing 80 tumors. Gene expression data of lung development were accessed from a publicly available dataset generated with the murine Mu11k genome microarray. Self-organized mapping identified two temporally distinct clusters of murine orthologues. Supervised clustering of lung development data showed large-cell carcinoma gene orthologues were in a cluster expressed in pseudoglandular and canalicular stages whereas adenocarcinoma homologues were predominantly in a cluster expressed later in the terminal sac and alveolar stages of murine lung development. Representative large-cell genes (E2F3, MYBL2, HDAC2, CDK4, PCNA) are expressed in the nucleus and are associated with cell cycle and proliferation. In contrast, adenocarcinoma genes are associated with lung-specific transcription pathways (SFTPB, TTF-1), cell adhesion, and signal transduction. In sum, the gene profiles of non-small-cell lung tumor histologies suggest mechanisms relevant to ontogeny and clinical course. Adenocarcinoma genes are associated with differentiation and glandular formation whereas large-cell genes are associated with proliferation and differentiation arrest.
The identification of developmentally regulated pathways active in tumorigenesis provides insights into lung carcinogenesis and suggests early steps may differ according to the eventual tumor morphology. PMID:14578194

A mesh generation and machine learning framework for Drosophila gene expression pattern image analysis

PubMed Central

2013-01-01

Background Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by the unique combination of expressed gene products as a result of spatiotemporal gene regulation. Currently, a fundamental challenge in regulatory biology is to elucidate the gene expression controls that generate the complex body plans during development. Recent advances in high-throughput biotechnologies have generated spatiotemporal expression patterns for thousands of genes in the model organism fruit fly Drosophila melanogaster. Existing qualitative methods enhanced by a quantitative analysis based on computational tools we present in this paper would provide promising ways for addressing key scientific questions. Results We develop a set of computational methods and open source tools for identifying co-expressed embryonic domains and the associated genes simultaneously. To map the expression patterns of many genes into the same coordinate space and account for the embryonic shape variations, we develop a mesh generation method to deform a meshed generic ellipse to each individual embryo. We then develop a co-clustering formulation to cluster the genes and the mesh elements, thereby identifying co-expressed embryonic domains and the associated genes simultaneously. Experimental results indicate that the gene and mesh co-clusters can be correlated to key developmental events during the stages of embryogenesis we study. The open source software tool has been made available at http://compbio.cs.odu.edu/fly/. Conclusions Our mesh generation and machine learning methods and tools improve upon the flexibility, ease-of-use and accuracy of existing methods. PMID:24373308
The expression of native and cultured human retinal pigment epithelial cells grown in different culture conditions.

PubMed

Tian, J; Ishibashi, K; Honda, S; Boylan, S A; Hjelmeland, L M; Handa, J T

2005-11-01

To determine the transcriptional proximity of retinal pigment epithelium (RPE) cells grown under different culture conditions and native RPE. ARPE-19 cells were grown under five conditions in 10% CO2: "subconfluent" in DMEM/F12+10% FBS, "confluent" in serum and serum withdrawn, and "differentiated" for 2.5 months in serum and serum withdrawn medium. Native RPE was laser microdissected. Total RNA was extracted, reverse transcribed, and radiolabelled probes were hybridised to an array containing 5,353 genes. Arrays were evaluated by hierarchical cluster analysis and significance analysis of microarrays. 78% of genes were expressed by native RPE while 45.3-47.7% were expressed by ARPE-19 cells, depending on culture condition. While the most abundant genes were expressed by native and cultured cells, significant differences in low abundance genes were seen. Hierarchical cluster analysis showed that confluent and differentiated, serum withdrawn cultures clustered closest to native RPE, and that serum segregated cultured cells from native RPE. The number of differentially expressed genes and their function, and profile of expressed and unexpressed genes, demonstrate differences between native and cultured cells. While ARPE-19 cells have significant value for studying RPE behaviour, investigators must be aware of how culture conditions can influence the mRNA phenotype of the cell.
Computational genetic neuroanatomy of the developing mouse brain: dimensionality reduction, visualization, and clustering

PubMed Central

2013-01-01

Background The structured organization of cells in the brain plays a key role in its functional efficiency. This delicate organization is the consequence of unique molecular identity of each cell gradually established by precise spatiotemporal gene expression control during development. Currently, studies on the molecular-structural association are beginning to reveal how the spatiotemporal gene expression patterns are related to cellular differentiation and structural development. Results In this article, we aim at a global, data-driven study of the relationship between gene expressions and neuroanatomy in the developing mouse brain. To enable visual explorations of the high-dimensional data, we map the in situ hybridization gene expression data to a two-dimensional space by preserving both the global and the local structures. Our results show that the developing brain anatomy is largely preserved in the reduced gene expression space. To provide a quantitative analysis, we cluster the reduced data into groups and measure the consistency with neuroanatomy at multiple levels. Our results show that the clusters in the low-dimensional space are more consistent with neuroanatomy than those in the original space. Conclusions Gene expression patterns and developing brain anatomy are closely related. Dimensionality reduction and visual exploration facilitate the study of this relationship. PMID:23845024
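The record above measures how consistent clusters are with neuroanatomy but the abstract does not name the metric. As one possible illustration, the sketch below computes cluster purity, which is an assumed choice and not necessarily the measure used in the study; the cluster assignments and region labels are hypothetical toy data.

```python
from collections import Counter, defaultdict


def purity(cluster_labels, anatomy_labels):
    """Fraction of samples whose anatomical label matches the majority
    anatomical label of the cluster they were assigned to."""
    groups = defaultdict(list)
    for cluster, region in zip(cluster_labels, anatomy_labels):
        groups[cluster].append(region)
    majority = sum(Counter(members).most_common(1)[0][1]
                   for members in groups.values())
    return majority / len(cluster_labels)


# Hypothetical toy data: six voxels, two clusters, two anatomical regions.
cluster_labels = [0, 0, 0, 1, 1, 1]
anatomy_labels = ["ctx", "ctx", "hip", "hip", "hip", "hip"]

score = purity(cluster_labels, anatomy_labels)
print(score)
```

Computing the same score for clusters obtained in the original space and in the reduced space would give one simple way to compare the two, in the spirit of the comparison the abstract reports.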
Penicillin production in industrial strain Penicillium chrysogenum P2niaD18 is not dependent on the copy number of biosynthesis genes.

PubMed

Ziemons, Sandra; Koutsantas, Katerina; Becker, Kordula; Dahlmann, Tim; Kück, Ulrich

2017-02-16

Multi-copy gene integration into microbial genomes is a conventional tool for obtaining improved gene expression. For Penicillium chrysogenum, the fungal producer of the beta-lactam antibiotic penicillin, many production strains carry multiple copies of the penicillin biosynthesis gene cluster. This discovery led to the generally accepted view that high penicillin titers are the result of multiple copies of penicillin genes. Here we investigated strain P2niaD18, a production line that carries only two copies of the penicillin gene cluster. We performed pulsed-field gel electrophoresis (PFGE), quantitative RT-PCR (qRT-PCR), and penicillin bioassays to investigate production, deletion and overexpression strains generated in the P. chrysogenum P2niaD18 background, in order to determine the copy number of the penicillin biosynthesis gene cluster, and study the expression of one penicillin biosynthesis gene, and the penicillin titer. Analysis of production and recombinant strains showed that the enhanced penicillin titer did not depend on the copy number of the penicillin gene cluster. Our assumption was strengthened by results with a penicillin null strain lacking pcbC encoding isopenicillin N synthase. Reintroduction of one or two copies of the cluster into the pcbC deletion strain restored high transcriptional expression of the pcbC gene, but recombinant strains showed no significantly different penicillin titer compared to parental strains. Here we present a molecular genetic analysis of production and recombinant strains in the P2niaD18 background carrying different copy numbers of the penicillin biosynthesis gene cluster. Our analysis shows that the enhanced penicillin titer does not strictly depend on the copy number of the cluster. Based on these overall findings, we hypothesize that instead, complex regulatory mechanisms are prominently implicated in increased penicillin biosynthesis in production strains.
Contribution of the Pmra Promoter to Expression of Genes in the Escherichia coli mra Cluster of Cell Envelope Biosynthesis and Cell Division Genes

PubMed Central

Mengin-Lecreulx, Dominique; Ayala, Juan; Bouhss, Ahmed; van Heijenoort, Jean; Parquet, Claudine; Hara, Hiroshi

1998-01-01

Recently, a promoter for the essential gene ftsI, which encodes penicillin-binding protein 3 of Escherichia coli, was precisely localized 1.9 kb upstream from this gene, at the beginning of the mra cluster of cell division and cell envelope biosynthesis genes (H. Hara, S. Yasuda, K. Horiuchi, and J. T. Park, J. Bacteriol. 179:5802–5811, 1997). Disruption of this promoter (Pmra) on the chromosome and its replacement by the lac promoter (Pmra::Plac) led to isopropyl-β-d-thiogalactopyranoside (IPTG)-dependent cells that lysed in the absence of inducer, a defect which was complemented only when the whole region from Pmra to ftsW, the fifth gene downstream from ftsI, was provided in trans on a plasmid. In the present work, the levels of various proteins involved in peptidoglycan synthesis and cell division were precisely determined in cells in which Pmra::Plac promoter expression was repressed or fully induced. It was confirmed that the Pmra promoter is required for expression of the first nine genes of the mra cluster: mraZ (orfC), mraW (orfB), ftsL (mraR), ftsI, murE, murF, mraY, murD, and ftsW. Interestingly, three- to sixfold-decreased levels of MurG and MurC enzymes were observed in uninduced Pmra::Plac cells. This was correlated with an accumulation of the nucleotide precursors UDP–N-acetylglucosamine and UDP–N-acetylmuramic acid, substrates of these enzymes, and with a depletion of the pool of UDP–N-acetylmuramyl pentapeptide, resulting in decreased cell wall peptidoglycan synthesis. Moreover, the expression of ftsZ, the penultimate gene from this cluster, was significantly reduced when Pmra expression was repressed. It was concluded that the transcription of the genes located downstream from ftsW in the mra cluster, from murG to ftsZ, is also mainly (but not exclusively) dependent on the Pmra promoter. PMID:9721276
A formal concept analysis approach to consensus clustering of multi-experiment expression data

PubMed Central

2014-01-01

Background Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results We propose a novel generic consensus clustering technique that applies a Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group, resulting in one clustering solution per group. These solutions are pooled together and further analysed by employing FCA, which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach, two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from a multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although the two algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological signals. Conclusions The proposed FCA-enhanced consensus clustering technique is a general approach to the combination of clustering algorithms with FCA for deriving clustering solutions from multiple gene expression matrices. The experimental results presented herein demonstrate that it is a robust data integration technique able to produce a good-quality clustering solution that is representative of the whole set of expression matrices. PMID:24885407
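As an illustration of the FCA step only, the sketch below enumerates the formal concepts of a tiny gene-by-cluster-label context. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `formal_concepts`, the gene names g1 to g3, and the labels A1, A2, B1, B2 (standing for clusters from two hypothetical experiment groups) are all invented for this example.

```python
def formal_concepts(context):
    """Enumerate all formal concepts of a small binary context.

    context: dict mapping an object (gene) to a frozenset of attributes
    (here, hypothetical cluster labels assigned by each experiment group).
    Returns a list of (extent, intent) pairs.
    """
    all_attrs = frozenset().union(*context.values())
    # Every intent is an intersection of object attribute sets
    # (the full attribute set is the intersection of no objects).
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {attrs & intent for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is the set of objects that carry every attribute in the intent.
        extent = frozenset(g for g, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    return concepts


# Hypothetical toy context: labels A* come from one group of experiments, B* from another.
context = {
    "g1": frozenset({"A1", "B1"}),
    "g2": frozenset({"A1", "B2"}),
    "g3": frozenset({"A2", "B2"}),
}
concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Concepts whose intent contains one label from every group correspond to genes that are co-clustered in all groups; concepts with smaller intents show where the per-group solutions agree only partially.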
Finding genes discriminating smokers from non-smokers by applying a growing self-organizing clustering method to large airway epithelium cell microarray data.

PubMed

Shahdoust, Maryam; Hajizadeh, Ebrahim; Mozdarani, Hossein; Chehrei, Ali

2013-01-01

Cigarette smoking is the major risk factor for development of lung cancer. Identification of effects of tobacco on airway gene expression may provide insight into the causes. This research aimed to compare gene expression of large airway epithelium cells in normal smokers (n=13) and non-smokers (n=9) in order to find genes which discriminate the two groups and assess cigarette smoking effects on large airway epithelium cells. Genes discriminating smokers from non-smokers were identified by applying a neural network clustering method, growing self-organizing maps (GSOM), to microarray data according to class discrimination scores. An index was computed for each gene from the difference between its mean expression in the two groups. This clustering approach made it possible to compare thousands of genes simultaneously. The applied approach compared the mean expression of 7,129 genes in smokers and non-smokers simultaneously and classified the genes of large airway epithelium cells that were differentially expressed in smokers compared with non-smokers. Seven genes were identified with the largest expression differences between smokers and non-smokers: NQO1, H19, ALDH3A1, AKR1C1, ABHD2, GPX2 and ADH7. Most (NQO1, ALDH3A1, AKR1C1, H19 and GPX2) are known to be clinically notable in lung cancer studies. Furthermore, discriminant analysis showed that these genes could classify samples as smokers or non-smokers with 100% accuracy. In the resulting GSOM map, other nodes with high average discrimination scores included genes with alterations strongly related to lung cancer, such as AKR1C3, CYP1B1, UCHL1 and AKR1B10. Clustering that compares the expression of thousands of genes at the same time revealed alterations in normal smokers. Most of the identified genes were strongly relevant to lung cancer in the existing literature. The genes may be utilized to identify smokers with increased risk for lung cancer. A larger-sample study is now recommended to determine relations between the genes ABHD2 and ADH7 and smoking.
The structure of a gene co-expression network reveals biological functions underlying eQTLs.

PubMed

Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

2013-01-01

What are the commonalities between genes whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found to be correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.
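To make the idea of a partial-correlation edge concrete, the sketch below computes a first-order partial correlation between two expression profiles while controlling for a third. This is a simplification assumed for illustration: co-expression networks of this kind usually condition on all other genes at once, and the abstract does not state how the partial correlations were estimated. The function names and the toy profiles `x`, `y`, `z` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy profiles: z drives both x and y, which otherwise carry unrelated noise.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 4, 3, 6, 5, 8, 7]
y = [2, 3, 2, 3, 6, 7, 6, 7]

r_marginal = pearson(x, y)
r_partial = partial_corr(x, y, z)
print(round(r_marginal, 3), round(r_partial, 3))
```

Here the marginal correlation between `x` and `y` is high only because both follow `z`; once `z` is controlled for, the association largely disappears, so a partial-correlation network would not draw a direct edge between them.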
Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

PubMed Central

Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

2016-01-01

ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. 
The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including streptothricins, borrelidin, two novel lipopeptides, and one unknown antibiotic from Streptomyces rochei Sal35. The transfer, expression, and screening of the library were all performed in a high-throughput way, so that this approach is scalable and adaptable to industrial automation for next-generation antibiotic discovery. PMID:27451447
Development of a gene cloning system in a fast-growing and moderately thermophilic Streptomyces species and heterologous expression of Streptomyces antibiotic biosynthetic gene clusters

PubMed Central

2011-01-01

Background Streptomyces species are a major source of antibiotics. They usually grow slowly at their optimal temperature, and fermentation of industrial strains on a large scale often takes a long time, consuming more energy and materials than some other bacterial industrial strains (e.g., E. coli and Bacillus). Most thermophilic Streptomyces species grow fast, but no gene cloning systems have been developed in such strains. Results We report here the isolation of 41 fast-growing (about twice the rate of S. coelicolor), moderately thermophilic (growing at both 30°C and 50°C) Streptomyces strains, detection of one linear and three circular plasmids in them, and sequencing of a 6996-bp plasmid, pTSC1, from one of them. pTSC1-derived pCWH1 could replicate in both thermophilic and mesophilic Streptomyces strains. Conversely, several Streptomyces replicons function in thermophilic Streptomyces species. By examining ten well-sporulating strains, we found two promising cloning hosts, 2C and 4F. A gene cloning system was established by using the two strains. The actinorhodin and anthramycin biosynthetic gene clusters from mesophilic S. coelicolor A3(2) and thermophilic S. refuineus were heterologously expressed in one of the hosts. Conclusions We have developed a gene cloning and expression system in a fast-growing and moderately thermophilic Streptomyces species. Although just a few plasmids and one antibiotic biosynthetic gene cluster from mesophilic Streptomyces were successfully expressed in thermophilic Streptomyces species, we expect that by utilizing thermophilic Streptomyces-specific promoters, more genes, and especially antibiotic gene clusters, of mesophilic Streptomyces should be heterologously expressed. PMID:22032628
Clustered Xenopus keratin genes: A genomic, transcriptomic, and proteomic analysis.

PubMed

Suzuki, Ken-Ichi T; Suzuki, Miyuki; Shigeta, Mitsuki; Fortriede, Joshua D; Takahashi, Shuji; Mawaribuchi, Shuuji; Yamamoto, Takashi; Taira, Masanori; Fukui, Akimasa

2017-06-15

Keratin genes belong to the intermediate filament superfamily and their expression is altered following morphological and physiological changes in vertebrate epithelial cells. Keratin genes are divided into two groups, type I and II, and are clustered on vertebrate genomes, including those of Xenopus species. Various keratin genes have been identified and characterized by their unique expression patterns throughout ontogeny in Xenopus laevis; however, compilation of previously reported and newly identified keratin genes in two Xenopus species is required for our further understanding of keratin gene evolution, not only in amphibians but also in all terrestrial vertebrates. In this study, 120 putative type I and II keratin genes in total were identified based on the genome data from two Xenopus species. We revealed that most of these genes are highly clustered on two homeologous chromosomes, XLA9_10 and XLA2 in X. laevis, and XTR10 and XTR2 in X. tropicalis, which are orthologous to those of human, showing conserved synteny among tetrapods. RNA-Seq data from various embryonic stages and adult tissues highlighted the unique expression profiles of orthologous and homeologous keratin genes in developmental stage- and tissue-specific manners. Moreover, we identified dozens of epidermal keratin proteins from the whole embryo, larval skin, tail, and adult skin using shotgun proteomics. In light of our results, we discuss the radiation, diversification, and unique expression of the clustered keratin genes, which are closely related to epidermal development and terrestrial adaptation during amphibian evolution, including Xenopus speciation. Copyright © 2016 Elsevier Inc. All rights reserved.
Discovery of Gene Cluster for Mycosporine-Like Amino Acid Biosynthesis from Actinomycetales Microorganisms and Production of a Novel Mycosporine-Like Amino Acid by Heterologous Expression

PubMed Central

Miyamoto, Kiyoko T.; Komatsu, Mamoru

2014-01-01

Mycosporines and mycosporine-like amino acids (MAAs), including shinorine (mycosporine-glycine-serine) and porphyra-334 (mycosporine-glycine-threonine), are UV-absorbing compounds produced by cyanobacteria, fungi, and marine micro- and macroalgae. These MAAs have the ability to protect these organisms from damage by environmental UV radiation. Although no reports have described the production of MAAs and the corresponding genes involved in MAA biosynthesis from Gram-positive bacteria to date, genome mining of the Gram-positive bacterial database revealed that two microorganisms belonging to the order Actinomycetales, Actinosynnema mirum DSM 43827 and Pseudonocardia sp. strain P1, possess a gene cluster homologous to the biosynthetic gene clusters identified from cyanobacteria. When the two strains were grown in liquid culture, Pseudonocardia sp. accumulated a very small amount of MAA-like compound in a medium-dependent manner, whereas A. mirum did not produce MAAs under any culture conditions, indicating that the biosynthetic gene cluster of A. mirum was in a cryptic state in this microorganism. In order to characterize these biosynthetic gene clusters, each biosynthetic gene cluster was heterologously expressed in an engineered host, Streptomyces avermitilis SUKA22. Since the resultant transformants carrying the entire biosynthetic gene cluster controlled by an alternative promoter produced mainly shinorine, this is the first confirmation of a biosynthetic gene cluster for MAA from Gram-positive bacteria. Furthermore, S. avermitilis SUKA22 transformants carrying the biosynthetic gene cluster for MAA of A. mirum accumulated not only shinorine and porphyra-334 but also a novel MAA. Structure elucidation revealed that the novel MAA is mycosporine-glycine-alanine, which substitutes l-alanine for the l-serine of shinorine. PMID:24907338
Discovery of gene cluster for mycosporine-like amino acid biosynthesis from Actinomycetales microorganisms and production of a novel mycosporine-like amino acid by heterologous expression.

PubMed

Miyamoto, Kiyoko T; Komatsu, Mamoru; Ikeda, Haruo

2014-08-01

Mycosporines and mycosporine-like amino acids (MAAs), including shinorine (mycosporine-glycine-serine) and porphyra-334 (mycosporine-glycine-threonine), are UV-absorbing compounds produced by cyanobacteria, fungi, and marine micro- and macroalgae. These MAAs have the ability to protect these organisms from damage by environmental UV radiation. Although no reports have described the production of MAAs and the corresponding genes involved in MAA biosynthesis from Gram-positive bacteria to date, genome mining of the Gram-positive bacterial database revealed that two microorganisms belonging to the order Actinomycetales, Actinosynnema mirum DSM 43827 and Pseudonocardia sp. strain P1, possess a gene cluster homologous to the biosynthetic gene clusters identified from cyanobacteria. When the two strains were grown in liquid culture, Pseudonocardia sp. accumulated a very small amount of MAA-like compound in a medium-dependent manner, whereas A. mirum did not produce MAAs under any culture conditions, indicating that the biosynthetic gene cluster of A. mirum was in a cryptic state in this microorganism. In order to characterize these biosynthetic gene clusters, each biosynthetic gene cluster was heterologously expressed in an engineered host, Streptomyces avermitilis SUKA22. Since the resultant transformants carrying the entire biosynthetic gene cluster controlled by an alternative promoter produced mainly shinorine, this is the first confirmation of a biosynthetic gene cluster for MAA from Gram-positive bacteria. Furthermore, S. avermitilis SUKA22 transformants carrying the biosynthetic gene cluster for MAA of A. mirum accumulated not only shinorine and porphyra-334 but also a novel MAA. Structure elucidation revealed that the novel MAA is mycosporine-glycine-alanine, which substitutes l-alanine for the l-serine of shinorine. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes.

PubMed

Liu, Ying; Navathe, Shamkant B; Pivoshenko, Alex; Dasigi, Venu G; Dingledine, Ray; Ciliax, Brian J

2006-01-01

One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score weighted keywords. The optimised algorithms should be useful for partitioning genes from microarray lists into functionally discrete clusters.
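The TFIDF weighting compared above can be sketched in a few lines. This is a minimal sketch assuming the textbook definition (term frequency times the log of inverse document frequency), with each gene's associated text treated as one document; the exact variant used in the study is not given in the abstract, and the gene names and keyword lists are invented for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Compute TF-IDF keyword weights.

    docs: dict mapping a gene to the list of keyword tokens extracted from
    its associated abstracts. Returns dict gene -> {keyword: weight}.
    """
    n_docs = len(docs)
    # Document frequency: number of genes whose text mentions the keyword.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    weights = {}
    for gene, tokens in docs.items():
        tf = Counter(tokens)
        total = len(tokens)
        weights[gene] = {
            kw: (count / total) * math.log(n_docs / df[kw])
            for kw, count in tf.items()
        }
    return weights


# Hypothetical keyword lists for three genes.
docs = {
    "geneA": ["kinase", "receptor", "kinase", "protein"],
    "geneB": ["receptor", "synapse", "protein"],
    "geneC": ["ribosome", "protein", "translation"],
}
weights = tfidf_weights(docs)
print(weights["geneA"])
```

A keyword that appears for every gene ("protein" here) receives weight zero, while a keyword that is frequent for one gene and absent elsewhere ("kinase") receives the highest weight, which is what makes the weighted lists useful for separating genes into functional clusters.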
Key role of LaeA and velvet complex proteins on expression of β-lactam and PR-toxin genes in Penicillium chrysogenum: cross-talk regulation of secondary metabolite pathways.

PubMed

Martín, Juan F

2017-05-01

Penicillium chrysogenum is an excellent model fungus to study the molecular mechanisms of control of expression of secondary metabolite genes. A key global regulator of the biosynthesis of secondary metabolites is the LaeA protein that interacts with other components of the velvet complex (VelA, VelB, VelC, VosA). These components interact with LaeA and regulate expression of penicillin and PR-toxin biosynthetic genes in P. chrysogenum. Both LaeA and VelA are positive regulators of the penicillin and PR-toxin biosynthesis, whereas VelB acts as antagonist of the effect of LaeA and VelA. Silencing or deletion of the laeA gene has a strong negative effect on penicillin biosynthesis and overexpression of laeA increases penicillin production. Expression of the laeA gene is enhanced by the P. chrysogenum autoinducers 1,3 diaminopropane and spermidine. The PR-toxin gene cluster is very poorly expressed in P. chrysogenum under penicillin-production conditions (i.e. it is a near-silent gene cluster). Interestingly, the downregulation of expression of the PR-toxin gene cluster in the high producing strain P. chrysogenum DS17690 was associated with mutations in both the laeA and velA genes. Analysis of the laeA and velA encoding genes in this high penicillin producing strain revealed that both laeA and velA acquired important mutations during the strain improvement programs thus altering the ratio of different secondary metabolites (e.g. pigments, PR-toxin) synthesized in the high penicillin producing mutants when compared to the parental wild type strain. Cross-talk of different secondary metabolite pathways has also been found in various Penicillium spp.: P. chrysogenum mutants lacking the penicillin gene cluster produce increasing amounts of PR-toxin, and mutants of P. roqueforti silenced in the PR-toxin genes produce large amounts of mycophenolic acid. 
The LaeA-velvet complex mediated regulation and the pathway cross-talk phenomenon has great relevance for improving the production of novel secondary metabolites, particularly of those secondary metabolites which are produced in trace amounts encoded by silent or near-silent gene clusters.
Analytical workflow profiling gene expression in murine macrophages

PubMed Central

Nixon, Scott E.; González-Peña, Dianelys; Lawson, Marcus A.; McCusker, Robert H.; Hernandez, Alvaro G.; O’Connor, Jason C.; Dantzer, Robert; Kelley, Keith W.

2015-01-01

Comprehensive and simultaneous analysis of all genes in a biological sample is a capability of RNA-Seq technology. Analysis of the entire transcriptome benefits from summarization of genes at the functional level. To examine a cellular response not previously explored with RNA-Seq, peritoneal macrophages from mice under two conditions (control and immunologically challenged) were analyzed for gene expression differences. Transcript quantification modeled the RNA-Seq read distribution and uncertainty (using a Beta Negative Binomial distribution), and transcripts were then tested for differential expression (False Discovery Rate-adjusted p-value < 0.05). Enrichment of functional categories utilized the list of differentially expressed genes. A total of 2079 differentially expressed transcripts representing 1884 genes were detected. The 92 enriched categories from Gene Ontology Biological Processes and Molecular Functions and from KEGG pathways were grouped into 6 clusters. Clusters included defense and inflammatory response (Enrichment Score = 11.24) and ribosomal activity (Enrichment Score = 17.89). Our work provides a context to the fine detail of individual gene expression differences in murine peritoneal macrophages during immunological challenge with high throughput RNA-Seq. PMID:25708305
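The False Discovery Rate threshold mentioned above can be illustrated with the Benjamini-Hochberg adjustment. This is an assumption made for the sketch: the abstract does not name the FDR procedure that was used, and the p-values below are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five transcripts.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(pvals)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this toy input, two transcripts with raw p-values below 0.05 (0.039 and 0.041) no longer pass after adjustment, which shows why the adjusted threshold is stricter than testing each transcript on its own.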
Gene expression profiling reveals distinct molecular signatures associated with the rupture of intracranial aneurysm.

PubMed

Nakaoka, Hirofumi; Tajima, Atsushi; Yoneyama, Taku; Hosomichi, Kazuyoshi; Kasuya, Hidetoshi; Mizutani, Tohru; Inoue, Ituro

2014-08-01

The rupture of intracranial aneurysm (IA) causes subarachnoid hemorrhage associated with high morbidity and mortality. We compared gene expression profiles in aneurysmal domes between unruptured IAs and ruptured IAs (RIAs) to elucidate biological mechanisms predisposing to the rupture of IA. We determined gene expression levels of 8 RIAs, 5 unruptured IAs, and 10 superficial temporal arteries with Agilent microarrays. To explore biological heterogeneity of IAs, we classified the samples into subgroups showing similar gene expression patterns, using clustering methods. The clustering analysis identified 4 groups: superficial temporal arteries and unruptured IAs were aggregated into their own clusters, whereas RIAs segregated into 2 distinct subgroups (early and late RIAs). Comparing gene expression levels between early RIAs and unruptured IAs, we identified 430 upregulated and 617 downregulated genes in early RIAs. The upregulated genes were associated with inflammatory and immune responses and phagocytosis, including S100/calgranulin genes (S100A8, S100A9, and S100A12). The downregulated genes suggest mechanical weakness of aneurysm walls. The expression of the Krüppel-like family of transcription factors (KLF2, KLF12, and KLF15), which are anti-inflammatory regulators, and of CDKN2A, which is located on chromosome 9p21, the most consistently replicated locus in genome-wide association studies of IA, was also downregulated. We demonstrate that gene expression patterns of RIAs differed according to the age of patients. The results suggest that macrophage-mediated inflammation is a key biological pathway for IA rupture. The identified genes can be good candidates for molecular markers of rupture-prone IAs and therapeutic targets. © 2014 American Heart Association, Inc.
Lung cancer signature biomarkers: tissue specific semantic similarity based clustering of digital differential display (DDD) data.

PubMed

Srivastava, Mousami; Khurana, Pankaj; Sugadev, Ragumani

2012-11-02

The tissue-specific Unigene Sets derived from more than one million expressed sequence tags (ESTs) in the NCBI GenBank database offer a platform for identifying significantly and differentially expressed tissue-specific genes by in-silico methods. Digital differential display (DDD) rapidly creates transcription profiles based on EST comparisons and numerically calculates, as a fraction of the pool of ESTs, the relative sequence abundance of known and novel genes. However, the process of identifying the most likely tissue for a specific disease in which to search for candidate genes from the pool of differentially expressed genes remains difficult. Therefore, we have used the 'Gene Ontology semantic similarity score' to measure the GO similarity between gene products of lung tissue-specific candidate genes from control (normal) and disease (cancer) sets. Hierarchical clustering of this semantic similarity score matrix is represented in the form of a dendrogram. The dendrogram cluster stability was assessed by multiple bootstrapping. Multiple bootstrapping also computes a p-value for each cluster and corrects the bias of the bootstrap probability. Subsequent hierarchical clustering by the multiple bootstrapping method (α = 0.95) identified seven clusters. The comparative, as well as subtractive, approach revealed a set of 38 biomarkers comprising four distinct lung cancer signature biomarker clusters (panel 1-4). Further gene enrichment analysis of the four panels revealed that each panel represents a distinct set of biomarkers: lung cancer linked metastasis diagnostic biomarkers (panel 1), chemotherapy/drug resistance biomarkers (panel 2), hypoxia regulated biomarkers (panel 3) and lung extracellular matrix biomarkers (panel 4). Expression analysis reveals that hypoxia induced lung cancer related biomarkers (panel 3), HIF and its modulating proteins (TGM2, CSNK1A1, CTNNA1, NAMPT/Visfatin, TNFRSF1A, ETS1, SRC-1, FN1, APLP2, DMBT1/SAG, AIB1 and AZIN1) are significantly down regulated. All down regulated genes in this panel were highly up regulated in most other types of cancers. These panels of proteins may represent signature biomarkers for lung cancer and will aid in lung cancer diagnosis and disease monitoring as well as in the prediction of responses to therapeutics.
Differences in Flower Transcriptome between Grapevine Clones Are Related to Their Cluster Compactness, Fruitfulness, and Berry Size

PubMed Central

Grimplet, Jérôme; Tello, Javier; Laguna, Natalia; Ibáñez, Javier

2017-01-01

Grapevine cluster compactness has a clear impact on fruit quality and health status, as clusters with greater compactness are more susceptible to pests and diseases and ripen more asynchronously. Different parameters related to inflorescence and cluster architecture (length, width, branching, etc.), fruitfulness (number of berries, number of seeds) and berry size (length, width) contribute to the final level of compactness. From a collection of 501 clones of cultivar Garnacha Tinta, two compact and two loose clones with stable differences for cluster compactness-related traits were selected and phenotyped. Key organs and developmental stages were selected for sampling and transcriptomic analyses. Comparison of global gene expression patterns in flowers at the end of bloom allowed identification of potential gene networks with a role in determining the final berry number, berry size and ultimately cluster compactness. A large portion of the differentially expressed genes were found in networks related to cell division (carbohydrates uptake, cell wall metabolism, cell cycle, nucleic acids metabolism, cell division, DNA repair). Their greater expression level in flowers of compact clones indicated that the number of berries and the berry size at ripening appear related to the rate of cell replication in flowers during the early growth stages after pollination. In addition, fluctuations in auxin and gibberellin signaling and transport related gene expression support that they play a central role in fruit set and impact berry number and size. Other hormones, such as ethylene and jasmonate may differentially regulate indirect effects, such as defense mechanisms activation or polyphenols production. This is the first transcriptomic based analysis focused on the discovery of the underlying gene networks involved in grapevine traits of grapevine cluster compactness, berry number and berry size. PMID:28496449
Genomic and expression analysis of the vanG-like gene cluster of Clostridium difficile.

PubMed

Peltier, Johann; Courtin, Pascal; El Meouche, Imane; Catel-Ferreira, Manuella; Chapot-Chartier, Marie-Pierre; Lemée, Ludovic; Pons, Jean-Louis

2013-07-01

Primary antibiotic treatment of Clostridium difficile intestinal diseases requires metronidazole or vancomycin therapy. A cluster of genes homologous to enterococcal glycopeptide resistance vanG genes was found in the genome of C. difficile 630, although this strain remains sensitive to vancomycin. This vanG-like gene cluster was found to consist of five ORFs: the regulatory region consisting of vanR and vanS and the effector region consisting of vanG, vanXY and vanT. We found that 57 out of 83 C. difficile strains, representative of the main lineages of the species, harbour this vanG-like cluster. The cluster is expressed as an operon and, when present, is found at the same genomic location in all strains. The vanG, vanXY and vanT homologues in C. difficile 630 are co-transcribed and expressed at a low level throughout the growth phases in the absence of vancomycin. Conversely, the expression of these genes is strongly induced in the presence of subinhibitory concentrations of vancomycin, indicating that the vanG-like operon is functional at the transcriptional level in C. difficile. Hydrophilic interaction liquid chromatography (HILIC-HPLC) and MS analysis of cytoplasmic peptidoglycan precursors of C. difficile 630 grown without vancomycin revealed the exclusive presence of a UDP-MurNAc-pentapeptide with an alanine at the C terminus. UDP-MurNAc-pentapeptide [d-Ala] was also the only peptidoglycan precursor detected in C. difficile grown in the presence of vancomycin, corroborating the lack of vancomycin resistance. Peptidoglycan structures of a vanG-like mutant strain and of a strain lacking the vanG-like cluster did not differ from those of the C. difficile 630 strain, indicating that the vanG-like cluster also has no impact on cell-wall composition.

Differences in Flower Transcriptome between Grapevine Clones Are Related to Their Cluster Compactness, Fruitfulness, and Berry Size.

PubMed

Grimplet, Jérôme; Tello, Javier; Laguna, Natalia; Ibáñez, Javier

2017-01-01

Grapevine cluster compactness has a clear impact on fruit quality and health status, as clusters with greater compactness are more susceptible to pests and diseases and ripen more asynchronously. Different parameters related to inflorescence and cluster architecture (length, width, branching, etc.), fruitfulness (number of berries, number of seeds) and berry size (length, width) contribute to the final level of compactness. From a collection of 501 clones of cultivar Garnacha Tinta, two compact and two loose clones with stable differences for cluster compactness-related traits were selected and phenotyped. Key organs and developmental stages were selected for sampling and transcriptomic analyses. Comparison of global gene expression patterns in flowers at the end of bloom allowed identification of potential gene networks with a role in determining the final berry number, berry size and ultimately cluster compactness. A large portion of the differentially expressed genes were found in networks related to cell division (carbohydrates uptake, cell wall metabolism, cell cycle, nucleic acids metabolism, cell division, DNA repair). Their greater expression level in flowers of compact clones indicated that the number of berries and the berry size at ripening appear related to the rate of cell replication in flowers during the early growth stages after pollination. In addition, fluctuations in auxin and gibberellin signaling and transport related gene expression support that they play a central role in fruit set and impact berry number and size. Other hormones, such as ethylene and jasmonate may differentially regulate indirect effects, such as defense mechanisms activation or polyphenols production. This is the first transcriptomic based analysis focused on the discovery of the underlying gene networks involved in grapevine traits of grapevine cluster compactness, berry number and berry size.
Heterogeneous gene expression signatures correspond to distinct lung pathologies and biomarkers of disease severity in idiopathic pulmonary fibrosis.

PubMed

DePianto, Daryle J; Chandriani, Sanjay; Abbas, Alexander R; Jia, Guiquan; N'Diaye, Elsa N; Caplazi, Patrick; Kauder, Steven E; Biswas, Sabyasachi; Karnik, Satyajit K; Ha, Connie; Modrusan, Zora; Matthay, Michael A; Kukreja, Jasleen; Collard, Harold R; Egen, Jackson G; Wolters, Paul J; Arron, Joseph R

2015-01-01

There is microscopic spatial and temporal heterogeneity of pathological changes in idiopathic pulmonary fibrosis (IPF) lung tissue, which may relate to heterogeneity in pathophysiological mediators of disease and clinical progression. We assessed relationships between gene expression patterns, pathological features, and systemic biomarkers to identify biomarkers that reflect the aggregate disease burden in patients with IPF. Gene expression microarrays (N=40 IPF; 8 controls) and immunohistochemical analyses (N=22 IPF; 8 controls) were performed on lung biopsies. Clinical characterisation and blood biomarker levels of MMP3 and CXCL13 were assessed in a separate cohort of patients with IPF (N=80). 2940 genes were significantly differentially expressed between IPF and control samples (|fold change| >1.5, p<0.05). Two clusters of co-regulated genes related to bronchiolar epithelium or lymphoid aggregates exhibited substantial heterogeneity within the IPF population. Gene expression in bronchiolar and lymphoid clusters corresponded to the extent of bronchiolisation and lymphoid aggregates determined by immunohistochemistry in adjacent tissue sections. Elevated serum levels of MMP3, encoded in the bronchiolar cluster, and CXCL13, encoded in the lymphoid cluster, corresponded to disease severity and shortened survival time (p<10^-7 for MMP3 and p<10^-5 for CXCL13; Cox proportional hazards model). Microscopic pathological heterogeneity in IPF lung tissue corresponds to specific gene expression patterns related to bronchiolisation and lymphoid aggregates. MMP3 and CXCL13 are systemic biomarkers that reflect the aggregate burden of these pathological features across total lung tissue. These biomarkers may have clinical utility as prognostic and/or surrogate biomarkers of disease activity in interventional studies in IPF. Published by the BMJ Publishing Group Limited.
Cancer-cell intrinsic gene expression signatures overcome intratumoural heterogeneity bias in colorectal cancer patient classification

PubMed Central

Dunne, Philip D.; Alderdice, Matthew; O'Reilly, Paul G.; Roddy, Aideen C.; McCorry, Amy M. B.; Richman, Susan; Maughan, Tim; McDade, Simon S.; Johnston, Patrick G.; Longley, Daniel B.; Kay, Elaine; McArt, Darragh G.; Lawler, Mark

2017-01-01

Stromal-derived intratumoural heterogeneity (ITH) has been shown to undermine molecular stratification of patients into appropriate prognostic/predictive subgroups. Here, using several clinically relevant colorectal cancer (CRC) gene expression signatures, we assessed the susceptibility of these signatures to the confounding effects of ITH using gene expression microarray data obtained from multiple tumour regions of a cohort of 24 patients, including central tumour, the tumour invasive front and lymph node metastasis. Sample clustering alongside correlative assessment revealed variation in the ability of each signature to cluster samples according to patient-of-origin rather than region-of-origin within the multi-region dataset. Signatures focused on cancer-cell intrinsic gene expression were found to produce more clinically useful, patient-centred classifiers, as exemplified by the CRC intrinsic signature (CRIS), which robustly clustered samples by patient-of-origin rather than region-of-origin. These findings highlight the potential of cancer-cell intrinsic signatures to reliably stratify CRC patients by minimising the confounding effects of stromal-derived ITH. PMID:28561046
Gene signatures and expression of miRNAs associated with efficacy of panitumumab in a head and neck cancer phase II trial.

PubMed

Siano, Marco; Espeli, Vittoria; Mach, Nicolas; Bossi, Paolo; Licitra, Lisa; Ghielmini, Michele; Frattini, Milo; Canevari, Silvana; De Cecco, Loris

2018-07-01

Platinum-based chemotherapy plus the anti-EGFR monoclonal antibody (mAb) cetuximab is used to treat recurrent/metastatic (RM) head-neck squamous cell carcinoma (HNSCC). Recently, we defined Cluster3 gene-expression signature as a potential predictor of favorable progression-free survival (PFS) in cetuximab-treated RM-HNSCC patients and predictor of partial metabolic FDG-PET response in an afatinib window-of-opportunity trial. Another anti-EGFR-mAb (panitumumab) was used as the treatment agent in RM-HNSCC patients in the phase II PANI01 trial. PANI01 tumor samples were analyzed using functional genomics to explore response predictors to anti-EGFR therapy. Whole-gene expression and real-time PCR analyses were applied to pre-treatment samples from 25 PANI01 patients. Three gene signatures (Cluster3 score, RAS onco-signature, microenvironment score) and seven selected miRNAs were separately analyzed for association with panitumumab efficacy. Cluster3 expression levels had a profile with a significant bimodal separation of samples (P = 3.08E-13). Higher RAS activation, microenvironment score, and miRNA expression were associated with low-Cluster3 patients. The same biomarkers were separately associated with PFS. Patients with high-Cluster3 had significantly longer PFS than patients with low-Cluster3 (median PFS: 174 versus 51 days; log-rank P = 0.0021). ROC analysis demonstrated accuracy in predicting PFS (AUC = 0.877). Despite differences in clinical settings and anti-EGFR inhibitors used for treatment, response prediction by the Cluster3 signature and selected miRNAs was essentially the same. Translation into a useful clinical assay requires validation in a broader setting. Copyright © 2018 Elsevier Ltd. All rights reserved.
Genome-wide association study identifies the SERPINB gene cluster as a susceptibility locus for food allergy.

PubMed

Marenholz, Ingo; Grosche, Sarah; Kalb, Birgit; Rüschendorf, Franz; Blümchen, Katharina; Schlags, Rupert; Harandi, Neda; Price, Mareike; Hansen, Gesine; Seidenberg, Jürgen; Röblitz, Holger; Yürek, Songül; Tschirner, Sebastian; Hong, Xiumei; Wang, Xiaobin; Homuth, Georg; Schmidt, Carsten O; Nöthen, Markus M; Hübner, Norbert; Niggemann, Bodo; Beyer, Kirsten; Lee, Young-Ae

2017-10-20

Genetic factors and mechanisms underlying food allergy are largely unknown. Due to heterogeneity of symptoms a reliable diagnosis is often difficult to make. Here, we report a genome-wide association study on food allergy diagnosed by oral food challenge in 497 cases and 2387 controls. We identify five loci at genome-wide significance, the clade B serpin (SERPINB) gene cluster at 18q21.3, the cytokine gene cluster at 5q31.1, the filaggrin gene, the C11orf30/LRRC32 locus, and the human leukocyte antigen (HLA) region. Stratifying the results for the causative food demonstrates that association of the HLA locus is peanut allergy-specific whereas the other four loci increase the risk for any food allergy. Variants in the SERPINB gene cluster are associated with SERPINB10 expression in leukocytes. Moreover, SERPINB genes are highly expressed in the esophagus. All identified loci are involved in immunological regulation or epithelial barrier function, emphasizing the role of both mechanisms in food allergy.
Evaluation of gene expression profiles and pathways underlying postnatal development in mouse sclera.

PubMed

Lim, Wan'E; Kwan, Jia Lin; Goh, Liang Kee; Beuerman, Roger W; Barathi, Veluchamy A

2012-01-01

The aim of this study was to identify the genes and pathways underlying the growth of the mouse sclera during postnatal development. Total RNA was isolated from each of 30 single mouse sclera (n=30, 6 sclera each from 1-, 2-, 3-, 6-, and 8-week-old mice) and reverse-transcribed into cDNA using a T7-N6 primer. The resulting cDNA was fragmented, labeled with biotin, and hybridized to a Mouse Gene 1.0 ST Array. ANOVA analysis was then performed using Partek Genomic Suite 6.5 beta and differentially expressed transcript clusters were filtered based on a selection criterion of ≥2 relative fold change at a false discovery rate of ≤5%. Genes identified as involved in the main biologic processes during postnatal scleral development were further confirmed using qPCR. A possible pathway that contributes to the postnatal development of the sclera was investigated using Ingenuity Pathway Analysis software. The hierarchical clustering of all time points showed that they did not cluster according to age. The highest number of differentially expressed transcript clusters was found when week 1 and week 2 old scleral tissues were compared. The peroxisome proliferator-activated receptor gamma coactivator 1-alpha (Ppargc1a) gene was found to be involved in the networks generated using Ingenuity Pathway Studio (IPA) from the differentially expressed transcript cluster lists of week 2 versus 1, week 3 versus 2, week 6 versus 3, and week 8 versus 6. The gene expression of Ppargc1a varied during scleral growth from week 1 to 2, week 2 to 3, week 3 to 6, and week 6 to 8 and was found to interact with a different set of genes at different scleral growth stages. Therefore, this indicated that Ppargc1a might play a role in scleral growth during postnatal weeks 1 to 8. Gene expression of eye diseases should be studied as early as postnatal weeks 1-2 to ensure that any changes in gene expression pattern during disease development are detected. In addition, we propose that Ppargc1a might play a role in regulating postnatal scleral development by interacting with a different set of genes at different scleral growth stages.
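The selection criterion quoted in the abstract above (at least a 2-fold relative change at a false discovery rate of at most 5%) can be illustrated with a minimal Python sketch. The abstract does not say which FDR procedure Partek Genomic Suite applied; the Benjamini-Hochberg adjustment used here is an assumption, and the function names, the signed fold-change convention and the toy values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_transcripts(records, min_fold=2.0, max_fdr=0.05):
    """Keep names with |fold change| >= min_fold and adjusted p <= max_fdr.

    records: list of (name, fold_change, p_value). fold_change is assumed
    to be signed, so -2.0 means a two-fold decrease (illustrative convention).
    """
    qvalues = benjamini_hochberg([p for _, _, p in records])
    return [name for (name, fc, _), q in zip(records, qvalues)
            if abs(fc) >= min_fold and q <= max_fdr]


# Toy values for illustration only.
toy = [("t1", 3.1, 0.001), ("t2", -2.4, 0.010),
       ("t3", 1.2, 0.002), ("t4", 2.5, 0.200)]
selected = filter_transcripts(toy)
```

In this toy input, t3 fails the fold-change threshold and t4 fails the FDR threshold, so only t1 and t2 are retained.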
Evaluation of gene expression profiles and pathways underlying postnatal development in mouse sclera

PubMed Central

Lim, Wan’E.; Kwan, Jia Lin; Goh, Liang Kee; Beuerman, Roger W.

2012-01-01

Purpose The aim of this study was to identify the genes and pathways underlying the growth of the mouse sclera during postnatal development. Methods Total RNA was isolated from each of 30 single mouse sclera (n=30, 6 sclera each from 1-, 2-, 3-, 6-, and 8-week-old mice) and reverse-transcribed into cDNA using a T7-N6 primer. The resulting cDNA was fragmented, labeled with biotin, and hybridized to a Mouse Gene 1.0 ST Array. ANOVA analysis was then performed using Partek Genomic Suite 6.5 beta and differentially expressed transcript clusters were filtered based on a selection criterion of ≥2 relative fold change at a false discovery rate of ≤5%. Genes identified as involved in the main biologic processes during postnatal scleral development were further confirmed using qPCR. A possible pathway that contributes to the postnatal development of the sclera was investigated using Ingenuity Pathway Analysis software. Results The hierarchical clustering of all time points showed that they did not cluster according to age. The highest number of differentially expressed transcript clusters was found when week 1 and week 2 old scleral tissues were compared. The peroxisome proliferator-activated receptor gamma coactivator 1-alpha (Ppargc1a) gene was found to be involved in the networks generated using Ingenuity Pathway Studio (IPA) from the differentially expressed transcript cluster lists of week 2 versus 1, week 3 versus 2, week 6 versus 3, and week 8 versus 6. The gene expression of Ppargc1a varied during scleral growth from week 1 to 2, week 2 to 3, week 3 to 6, and week 6 to 8 and was found to interact with a different set of genes at different scleral growth stages. Therefore, this indicated that Ppargc1a might play a role in scleral growth during postnatal weeks 1 to 8. Conclusions Gene expression of eye diseases should be studied as early as postnatal weeks 1–2 to ensure that any changes in gene expression pattern during disease development are detected. In addition, we propose that Ppargc1a might play a role in regulating postnatal scleral development by interacting with a different set of genes at different scleral growth stages. PMID:22736935
Formation of Nitrogenase NifDK Tetramers in the Mitochondria of Saccharomyces cerevisiae

PubMed Central

2017-01-01

Transferring the prokaryotic enzyme nitrogenase into a eukaryotic host with the final aim of developing N2 fixing cereal crops would revolutionize agricultural systems worldwide. Targeting it to mitochondria has potential advantages because of the organelle’s high O2 consumption and the presence of bacterial-type iron–sulfur cluster biosynthetic machinery. In this study, we constructed 96 strains of Saccharomyces cerevisiae in which transcriptional units comprising nine Azotobacter vinelandii nif genes (nifHDKUSMBEN) were integrated into the genome. Two combinatorial libraries of nif gene clusters were constructed: a library of mitochondrial leading sequences consisting of 24 clusters within four subsets of nif gene expression strength, and an expression library of 72 clusters with fixed mitochondrial leading sequences and nif expression levels assigned according to factorial design. In total, 29 promoters and 18 terminators were combined to adjust nif gene expression levels. Expression and mitochondrial targeting was confirmed at the protein level as immunoblot analysis showed that Nif proteins could be efficiently accumulated in mitochondria. NifDK tetramer formation, an essential step of nitrogenase assembly, was experimentally proven both in cell-free extracts and in purified NifDK preparations. This work represents a first step toward obtaining functional nitrogenase in the mitochondria of a eukaryotic cell. PMID:28221768
Genes Related to Antiviral Activity, Cell Migration, and Lysis Are Differentially Expressed in CD4+ T Cells in Human T Cell Leukemia Virus Type 1-Associated Myelopathy/Tropical Spastic Paraparesis Patients

PubMed Central

Pinto, Mariana Tomazini; Malta, Tathiane Maistro; Rodrigues, Evandra Strazza; Pinheiro, Daniel Guariz; Panepucci, Rodrigo Alexandre; Malmegrim de Farias, Kelen Cristina Ribeiro; Sousa, Alessandra De Paula; Takayanagui, Osvaldo Massaiti; Tanaka, Yuetsu; Covas, Dimas Tadeu

2014-01-01

Human T cell leukemia virus type 1 (HTLV-1) preferentially infects CD4+ T cells and these cells play a central role in HTLV-1 infection. In this study, we investigated the global gene expression profile of circulating CD4+ T cells from the distinct clinical status of HTLV-1-infected individuals in regard to TAX expression levels. CD4+ T cells were isolated from asymptomatic HTLV-1 carrier (HAC) and HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) patients in order to identify genes involved in HAM/TSP development using a microarray technique. Hierarchical clustering analysis showed that healthy control (CT) and HTLV-1-infected samples clustered separately. We also observed that the HAC and HAM/TSP groups clustered separately regardless of TAX expression. The gene expression profile of CD4+ T cells was compared among the CT, HAC, and HAM/TSP groups. The paxillin (Pxn), chemokine (C-X-C motif) receptor 4 (Cxcr4), interleukin 27 (IL27), and granzyme A (Gzma) genes were differentially expressed between the HAC and HAM/TSP groups, regardless of TAX expression. The perforin 1 (Prf1) and forkhead box P3 (Foxp3) genes were increased in the HAM/TSP group and presented a positive correlation to the expression of TAX and the proviral load (PVL). The frequency of CD4+FOXP3+ regulatory T cells (Treg) was higher in HTLV-1-infected individuals. Foxp3 gene expression was positively correlated with cell lysis-related genes (Gzma, Gzmb, and Prf1). These findings suggest that CD4+ T cell activity is distinct between the HAC and HAM/TSP groups. PMID:24041428
Clustering gene expression regulators: new approach to disease subtyping.

PubMed

Pyatnitskiy, Mikhail; Mazo, Ilya; Shkrob, Maria; Schwartz, Elena; Kotelnikova, Ekaterina

2014-01-01

One of the main challenges in modern medicine is to stratify different patient groups in terms of underlying disease molecular mechanisms as to develop more personalized approach to therapy. Here we propose novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on Sub-Network Enrichment Analysis algorithm (SNEA) which identifies gene subnetworks with significant concordant changes in expression between two conditions. Subnetwork consists of central regulator and downstream genes connected by relations extracted from global literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores which are used for final patients grouping. We show that our approach performs well compared to other related methods and at the same time provides researchers with complementary level of understanding of pathway-level biology behind a disease by identification of significant expression regulators. We have observed the reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms), that was not revealed using standard expression profile clustering. For another experiment we were able to suggest the clusters of regulators, responsible for colorectal carcinoma vs adenoma discrimination and identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. Proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. Obtained clusters of regulators make possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for individual patient.
Clustering Gene Expression Regulators: New Approach to Disease Subtyping

PubMed Central

Pyatnitskiy, Mikhail; Mazo, Ilya; Shkrob, Maria; Schwartz, Elena; Kotelnikova, Ekaterina

2014-01-01

One of the main challenges in modern medicine is to stratify different patient groups in terms of underlying disease molecular mechanisms as to develop more personalized approach to therapy. Here we propose novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on Sub-Network Enrichment Analysis algorithm (SNEA) which identifies gene subnetworks with significant concordant changes in expression between two conditions. Subnetwork consists of central regulator and downstream genes connected by relations extracted from global literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores which are used for final patients grouping. We show that our approach performs well compared to other related methods and at the same time provides researchers with complementary level of understanding of pathway-level biology behind a disease by identification of significant expression regulators. We have observed the reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms), that was not revealed using standard expression profile clustering. For another experiment we were able to suggest the clusters of regulators, responsible for colorectal carcinoma vs adenoma discrimination and identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. Proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. Obtained clusters of regulators make possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for individual patient. PMID:24416320
Identification of a new gene regulatory circuit involving B cell receptor activated signaling using a combined analysis of experimental, clinical and global gene expression data

PubMed Central

Schrader, Alexandra; Meyer, Katharina; Walther, Neele; Stolz, Ailine; Feist, Maren; Hand, Elisabeth; von Bonin, Frederike; Evers, Maurits; Kohler, Christian; Shirneshan, Katayoon; Vockerodt, Martina; Klapper, Wolfram; Szczepanowski, Monika; Murray, Paul G.; Bastians, Holger; Trümper, Lorenz; Spang, Rainer; Kube, Dieter

2016-01-01

To discover new regulatory pathways in B lymphoma cells, we performed a combined analysis of experimental, clinical and global gene expression data. We identified a specific cluster of genes that was coherently expressed in primary lymphoma samples and suppressed by activation of the B cell receptor (BCR) through αIgM treatment of lymphoma cells in vitro. This gene cluster, which we called BCR.1, includes numerous cell cycle regulators. A reduced expression of BCR.1 genes after BCR activation was observed in different cell lines and also in CD10+ germinal center B cells. We found that BCR activation led to a delayed entry to and progression of mitosis and defects in metaphase. Cytogenetic changes were detected upon long-term αIgM treatment. Furthermore, an inverse correlation of BCR.1 genes with c-Myc co-regulated genes in distinct groups of lymphoma patients was observed. Finally, we showed that the BCR.1 index discriminates activated B cell-like and germinal centre B cell-like diffuse large B cell lymphoma supporting the functional relevance of this new regulatory circuit and the power of guided clustering for biomarker discovery. PMID:27166259
Aberrant c-erbB2 expression in cell clusters overlying focally disrupted breast myoepithelial cell layers: a trigger or sign for emergence of more aggressive cell clones?

PubMed Central

Zhang, Xichen; Hashemi, Shahreyar Shar; Yousefi, Morvarid; Ni, Jinsong; Wang, Qiuyue; Gao, Ling; Gong, Pengtao; Gao, Chunling; Sheng, Joy; Mason, Jeffrey; Man, Yan-gao

2008-01-01

Our recent studies revealed that cell clusters overlying focal myoepithelial cell layer disruption (FMCLD) had a significantly higher frequency of genetic instabilities and expression of invasion-related genes than their adjacent counterparts within the same duct. Our current study attempted to assess whether these cell clusters would also have elevated c-erbB2 expression. Human breast tumors (n=50) with a high frequency of FMCLD were analyzed with double immunohistochemistry, real-time RT-PCR, and chromogenic in situ hybridization for c-erbB2 protein and gene expression. Of 448 FMCLD detected, 404 (90.2%) were associated with cell clusters that had intense c-erbB2 immunoreactivities primarily in their cytoplasm, in contrast to their adjacent counterparts within the same duct, which had no or barely detectable c-erbB2 expression. These c-erbB2 positive cells were arranged as tongue-like projections, “puncturing” into the stroma, and about 20% of them were in direct continuity with tube-like structures that resembled blood vessels. Aberrant c-erbB2 expression was also seen in clusters of architecturally normal-appearing ducts that had distinct cytological abnormalities in both ME and epithelial cells, but not in their clear-cut normal counterparts. Molecular assays detected markedly higher c-erbB2 mRNA and gene amplification in cell clusters associated with FMCLD than in those associated with non-disrupted ME cell layers. Our findings suggest that cell clusters overlying FMCLD may represent the precursors of pending invasive lesions, and that aberrant c-erbB2 expression may trigger or signify the emergence of biologically more aggressive cell clones. PMID:18726004
[Regulation of the β-globin gene family expression, useful in the search for new therapeutic targets for hemoglobinopathies].

PubMed

Scheps, Karen G; Varela, Viviana

Different hemoglobin isoforms are expressed during the embryonic, fetal and postnatal stages. They are formed by combination of polypeptide chains synthesized from the α- and β-globin gene clusters. Based on the fact that the presence of high hemoglobin F levels is beneficial in both sickle cell disease and severe thalassemic syndromes, a revision of the regulation of the β-globin cluster expression is proposed, especially regarding the genes encoding the γ-globin chains (HBG1 and HBG2). In this review we describe the current knowledge about transcription factors and epigenetic regulators involved in the switches of the β-globin cluster. It is expected that the consolidation of knowledge in this field will allow finding new therapeutic targets for the treatment of hemoglobinopathies.
Hox expression in the direct-type developing sand dollar Peronella japonica.

PubMed

Tsuchimoto, Jun; Yamaguchi, Masaaki

2014-08-01

Echinoderms are a curious group of deuterostomes that forms a clade with hemichordates but has a pentameral body plan. Hox complex plays a pivotal role in axial patterning in bilaterians and often occurs in a cluster on the chromosome. In contrast to hemichordates with an organized Hox cluster, the sea urchin Strongylocentrotus purpuratus has a Hox cluster with an atypical organization. However, the current data on hox expression in sea urchin rudiments are fragmentary. We report a comprehensive examination of hox expression in a sand dollar echinoid. Nine hox genes are expressed in the adult rudiment, which are classified into two groups, but hox11/13b belongs to both: one with linear expression in the coelomic mesoderm and another with radial expression around the adult mouth. The linear genes may endow the coelom/mesentery with axial information to direct postmetamorphic transformation of the digestive tract, whereas the radial genes developmentally correlate with the morphological novelties of echinoderms and/or sea urchins. Recruitment of the radial genes except hox11/13b appears to be accompanied by the loss of ancestral/axial roles. This in toto co-option of the hox genes provides insight into the molecular mechanisms underlying the evolution of echinoderms from a bilateral ancestor. © 2014 Wiley Periodicals, Inc.
Clustering of change patterns using Fourier coefficients.

PubMed

Kim, Jaehee; Kim, Haseong

2008-01-15

To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a time period because biologically related gene groups can share the same change patterns. Many clustering algorithms have been proposed to group observation data. However, because of the complexity of the underlying functions there have not been many studies on grouping data based on change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. The sample Fourier coefficients not only provide information about the underlying functions, but also reduce the dimension. In addition, as their limiting distribution is a multivariate normal, a model-based clustering method incorporating statistical properties would be appropriate. This work is aimed at discovering gene groups with similar change patterns that share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. The model-based method is advantageous over other methods in our proposed model because the sample Fourier coefficients asymptotically follow the multivariate normal distribution. Change patterns are automatically estimated with the Fourier representation in our model. Our model was tested in simulations and on real gene data sets. The simulation results showed that the model-based clustering method with the sample Fourier coefficients has a lower clustering error rate than K-means clustering. Even when the number of repeated time points was small, the same results were obtained. We also applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns. The R program is available upon request.
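As a rough illustration of why derivative Fourier coefficients capture change patterns rather than expression levels, the following Python sketch computes sample Fourier coefficients of a time course and converts them into the coefficients of its derivative. It is not the authors' R program: the function names, the equally spaced sampling over one period, and the toy profiles are all assumptions made for this example, and the model-based clustering step itself is omitted.

```python
import math


def fourier_coefficients(values, n_harmonics):
    """Sample Fourier coefficients (a_k, b_k) for k = 1..n_harmonics,
    assuming values are observed at equally spaced times t_j = j / n."""
    n = len(values)
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2.0 / n * sum(y * math.cos(2 * math.pi * k * j / n)
                            for j, y in enumerate(values))
        b_k = 2.0 / n * sum(y * math.sin(2 * math.pi * k * j / n)
                            for j, y in enumerate(values))
        coeffs.append((a_k, b_k))
    return coeffs


def derivative_coefficients(values, n_harmonics):
    """Fourier coefficients of the derivative. Differentiating
    a_k cos(2 pi k t) + b_k sin(2 pi k t) gives
    (2 pi k b_k) cos(2 pi k t) - (2 pi k a_k) sin(2 pi k t)."""
    features = []
    pairs = fourier_coefficients(values, n_harmonics)
    for k, (a_k, b_k) in enumerate(pairs, start=1):
        features.extend([2 * math.pi * k * b_k, -2 * math.pi * k * a_k])
    return features


def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Toy profiles (hypothetical): a and b share a change pattern but differ in
# baseline level; c changes with a different pattern.
n = 12
times = [j / n for j in range(n)]
gene_a = [1.0 + math.sin(2 * math.pi * t) for t in times]
gene_b = [5.0 + math.sin(2 * math.pi * t) for t in times]
gene_c = [1.0 + math.cos(4 * math.pi * t) for t in times]

feat_a, feat_b, feat_c = (derivative_coefficients(g, 2)
                          for g in (gene_a, gene_b, gene_c))
```

Because the constant baseline drops out on differentiation, feat_a and feat_b coincide up to rounding while feat_c lies far from both. Feature vectors of this kind could then be passed to a model-based clustering method such as a Gaussian mixture, which is the role the abstract assigns to the derivative coefficients.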
HOX genes in human lung: altered expression in primary pulmonary hypertension and emphysema.

PubMed

Golpon, H A; Geraci, M W; Moore, M D; Miller, H L; Miller, G J; Tuder, R M; Voelkel, N F

2001-03-01

HOX genes belong to the large family of homeodomain genes that function as transcription factors. Animal studies indicate that they play an essential role in lung development. We investigated the expression pattern of HOX genes in human lung tissue by using microarray and degenerate reverse transcriptase-polymerase chain reaction survey techniques. HOX genes predominantly from the 3' end of clusters A and B were expressed in normal human adult lung and among them HOXA5 was the most abundant, followed by HOXB2 and HOXB6. In fetal (12 weeks old) and diseased lung specimens (emphysema, primary pulmonary hypertension) additional HOX genes from clusters C and D were expressed. Using in situ hybridization, transcripts for HOXA5 were predominantly found in alveolar septal and epithelial cells, both in normal and diseased lungs. A 2.5-fold increase in HOXA5 mRNA expression was demonstrated by quantitative reverse transcriptase-polymerase chain reaction in primary pulmonary hypertension lung specimens when compared to normal lung tissue. In conclusion, we demonstrate that HOX genes are selectively expressed in the human lung. Differences in the pattern of HOX gene expression exist among fetal, adult, and diseased lung specimens. The altered pattern of HOX gene expression may contribute to the development of pulmonary diseases.
Platypus globin genes and flanking loci suggest a new insertional model for beta-globin evolution in birds and mammals

PubMed Central

Patel, Vidushi S; Cooper, Steven JB; Deakin, Janine E; Fulton, Bob; Graves, Tina; Warren, Wesley C; Wilson, Richard K; Graves, Jennifer AM

2008-01-01

Background Vertebrate alpha (α)- and beta (β)-globin gene families exemplify the way in which genomes evolve to produce functional complexity. From tandem duplication of a single globin locus, the α- and β-globin clusters expanded, and then were separated onto different chromosomes. The previous finding of a fossil β-globin gene (ω) in the marsupial α-cluster, however, suggested that duplication of the α-β cluster onto two chromosomes, followed by lineage-specific gene loss and duplication, produced paralogous α- and β-globin clusters in birds and mammals. Here we analyse genomic data from an egg-laying monotreme mammal, the platypus (Ornithorhynchus anatinus), to explore haemoglobin evolution at the stem of the mammalian radiation. Results The platypus α-globin cluster (chromosome 21) contains embryonic and adult α-globin genes, a β-like ω-globin gene, and the GBY globin gene with homology to cytoglobin, arranged as 5'-ζ-ζ'-αD-α3-α2-α1-ω-GBY-3'. The platypus β-globin cluster (chromosome 2) contains single embryonic and adult globin genes arranged as 5'-ε-β-3'. Surprisingly, all of these globin genes were expressed in some adult tissues. Comparison of flanking sequences revealed that all jawed vertebrate α-globin clusters are flanked by MPG-C16orf35 and LUC7L, whereas all bird and mammal β-globin clusters are embedded in olfactory genes. Thus, the mammalian α- and β-globin clusters are orthologous to the bird α- and β-globin clusters respectively. Conclusion We propose that α- and β-globin clusters evolved from an ancient MPG-C16orf35-α-β-GBY-LUC7L arrangement 410 million years ago. A copy of the original β (represented by ω in marsupials and monotremes) was inserted into an array of olfactory genes before the amniote radiation (>315 million years ago), then duplicated and diverged to form orthologous clusters of β-globin genes with different expression profiles in different lineages. PMID:18657265
Identification of Common Differentially Expressed Genes in Urinary Bladder Cancer

PubMed Central

Zaravinos, Apostolos; Lambrou, George I.; Boulalas, Ioannis; Delakas, Dimitris; Spandidos, Demetrios A.

2011-01-01

Background Current diagnosis and treatment of urinary bladder cancer (BC) has shown great progress with the utilization of microarrays. Purpose Our goal was to identify common differentially expressed (DE) genes among clinically relevant subclasses of BC using microarrays. Methodology/Principal Findings BC samples and controls, both experimental and publicly available datasets, were analyzed by whole genome microarrays. We grouped the samples according to their histology and defined the DE genes in each sample individually, as well as in each tumor group. A dual analysis strategy was followed. First, experimental samples were analyzed and conclusions were formulated; and second, experimental sets were combined with publicly available microarray datasets and were further analyzed in search of common DE genes. The experimental dataset identified 831 genes that were DE in all tumor samples simultaneously. Moreover, 33 genes were up-regulated and 85 genes were down-regulated in all 10 BC samples compared to the 5 normal tissues simultaneously. Hierarchical clustering partitioned tumor groups in accordance with their histology. K-means clustering of all genes and all samples, as well as clustering of tumor groups, presented 49 clusters. K-means clustering of common DE genes in all samples revealed 24 clusters. Genes manifested various differential patterns of expression, based on PCA. YY1 and NFκB were among the most common transcription factors that regulated the expression of the identified DE genes. Chromosome 1 contained 32 DE genes, followed by chromosomes 2 and 11, which contained 25 and 23 DE genes, respectively. Chromosome 21 had the fewest DE genes. 
GO analysis revealed the prevalence of transport and binding genes in the common down-regulated DE genes; the prevalence of RNA metabolism and processing genes in the up-regulated DE genes; as well as the prevalence of genes responsible for cell communication and signal transduction in the DE genes that were down-regulated in T1-Grade III tumors and up-regulated in T2/T3-Grade III tumors. Combining samples from all microarray platforms revealed 17 common DE genes (BMP4, CRYGD, DBH, GJB1, KRT83, MPZ, NHLH1, TACR3, ACTC1, MFAP4, SPARCL1, TAGLN, TPM2, CDC20, LHCGR, TM9SF1 and HCCS), 4 of which participate in numerous pathways. Conclusions/Significance The identification of the common DE genes among BC samples of different histology can provide further insight into the discovery of new putative markers. PMID:21483740
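The search for common DE genes across datasets amounts to an intersection of per-dataset calls. The sketch below is only an illustration under stated assumptions: each dataset has already been reduced to a mapping from gene to direction ('up' or 'down'), the gene names are placeholders rather than results from the study, and `common_de_genes` is a hypothetical helper.

```python
def common_de_genes(calls_per_dataset):
    """Return genes called DE in every dataset with the same direction.

    calls_per_dataset: list of dicts mapping gene symbol -> 'up' or 'down'.
    """
    if not calls_per_dataset:
        return {}
    first, rest = calls_per_dataset[0], calls_per_dataset[1:]
    return {gene: direction
            for gene, direction in first.items()
            if all(calls.get(gene) == direction for calls in rest)}

# Placeholder calls from three hypothetical platforms.
platform_a = {"GENE_A": "up", "GENE_B": "down", "GENE_C": "down"}
platform_b = {"GENE_A": "up", "GENE_B": "up", "GENE_C": "down"}
platform_c = {"GENE_A": "up", "GENE_C": "down", "GENE_D": "up"}
shared = common_de_genes([platform_a, platform_b, platform_c])
```

Requiring the same direction in every dataset is a design choice of this sketch; GENE_B is excluded because its direction disagrees between platforms.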
Multiconstrained gene clustering based on generalized projections

PubMed Central

2010-01-01

Background Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints into an optimal clustering solution remains an unsolved problem. Results We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different natures without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence in more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions The POCS-based MGC method can successfully combine multiple constraints of different natures for gene clustering. Also, the proposed GLL is an effective performance measure for soft clustering solutions. PMID:20356386
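The core POCS iteration, repeatedly projecting a candidate solution onto each constraint set, can be sketched generically. The two sets used here (a soft membership vector that sums to one, and entries bounded in [0, 1]) are illustrative assumptions, not the constraint sets of the MGC method, and all function names are hypothetical.

```python
def project_sum_to_one(x):
    """Euclidean projection onto the hyperplane {x : sum(x) = 1}."""
    shift = (1.0 - sum(x)) / len(x)
    return [v + shift for v in x]

def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n."""
    return [min(max(v, lo), hi) for v in x]

def pocs(x, projectors, n_iter=200):
    """Cycle through the projectors; for convex sets with a non-empty
    intersection the iterates approach a point satisfying all constraints."""
    for _ in range(n_iter):
        for project in projectors:
            x = project(x)
    return x

# Start from an infeasible soft-membership vector for one gene.
membership = pocs([2.0, -1.0, 0.5], [project_sum_to_one, project_box])
```

Each constraint keeps its own projector, which is why further constraints can be added to the list without reformulating the existing ones.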

Outcome-Driven Cluster Analysis with Application to Microarray Data.

PubMed

Hsu, Jessie J; Finkelstein, Dianne M; Schoenfeld, David A

2015-01-01

One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
Prokaryotic Gene Clusters: A Rich Toolbox for Synthetic Biology

PubMed Central

Fischbach, Michael; Voigt, Christopher A.

2014-01-01

Bacteria construct elaborate nanostructures, obtain nutrients and energy from diverse sources, synthesize complex molecules, and implement signal processing to react to their environment. These complex phenotypes require the coordinated action of multiple genes, which are often encoded in a contiguous region of the genome, referred to as a gene cluster. Gene clusters sometimes contain all of the genes necessary and sufficient for a particular function. As an evolutionary mechanism, gene clusters facilitate the horizontal transfer of the complete function between species. Here, we review recent work on a number of clusters whose functions are relevant to biotechnology. Engineering these clusters has been hindered by their regulatory complexity, the need to balance the expression of many genes, and a lack of tools to design and manipulate DNA at this scale. Advances in synthetic biology will enable the large-scale bottom-up engineering of the clusters to optimize their functions, wake up cryptic clusters, or to transfer them between organisms. Understanding and manipulating gene clusters will move towards an era of genome engineering, where multiple functions can be “mixed-and-matched” to create a designer organism. PMID:21154668
A mixture model-based approach to the clustering of microarray expression data.

PubMed

McLachlan, G J; Bean, R W; Peel, D

2002-03-01

This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

PubMed Central

2010-01-01

Background Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. 
Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082
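The adjusted Rand index used for evaluation compares a clustering with known classes while correcting for chance agreement. The following standard-library sketch follows the usual pair-counting definition; it is an illustration, not code from the study, and the labels are toy values.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from the contingency table of two labelings."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treat as perfect agreement
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / total_pairs
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings put all items in one cluster
    return (sum_cells - expected) / (maximum - expected)

# Same partition with swapped label names, and a partition that cuts across classes.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A value of 1 means the partitions agree up to relabeling, values near 0 correspond to chance-level agreement, and negative values indicate agreement below chance.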
Pichia stipitis genomics, transcriptomics, and gene clusters

Treesearch

Thomas W. Jeffries; Jennifer R. Headman Van Vleet

2009-01-01

Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...
Mapping in an apple (Malus x domestica) F1 segregating population based on physical clustering of differentially expressed genes

PubMed Central

2014-01-01

Background Apple tree breeding is slow and difficult due to long generation times, self-incompatibility, and complex genetics. The identification of molecular markers linked to traits of interest is a way to expedite the breeding process. In the present study, we aimed to identify genes whose steady-state transcript abundance was associated with inheritance of specific traits segregating in an apple (Malus × domestica) rootstock F1 breeding population, including resistance to powdery mildew (Podosphaera leucotricha) disease and woolly apple aphid (Eriosoma lanigerum). Results Transcription profiling was performed for 48 individual F1 apple trees from a cross of two highly heterozygous parents, using RNA isolated from healthy, actively-growing shoot tips and a custom apple DNA oligonucleotide microarray representing 26,000 unique transcripts. Genome-wide expression profiles were not clear indicators of powdery mildew or woolly apple aphid resistance phenotype. However, standard differential gene expression analysis between phenotypic groups of trees revealed relatively small sets of genes with trait-associated expression levels. For example, thirty genes were identified that were differentially expressed between trees resistant and susceptible to powdery mildew. Interestingly, the genes encoding twenty-four of these transcripts were physically clustered on chromosome 12. Similarly, seven genes were identified that were differentially expressed between trees resistant and susceptible to woolly apple aphid, and the genes encoding five of these transcripts were also clustered, this time on chromosome 17. In each case, the gene clusters were in the vicinity of previously identified major quantitative trait loci for the corresponding trait. Similar results were obtained for a series of molecular traits. Several of the differentially expressed genes were used to develop DNA polymorphism markers linked to powdery mildew disease and woolly apple aphid resistance. 
Conclusions Gene expression profiling and trait-associated transcript analysis using an apple F1 population readily identified genes physically linked to powdery mildew disease resistance and woolly apple aphid resistance loci. This result was especially useful in apple, where extreme levels of heterozygosity make the development of reliable DNA markers quite difficult. The results suggest that this approach could prove effective in crops with complicated genetics, or for which few genomic information resources are available. PMID:24708064
Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis

NASA Astrophysics Data System (ADS)

Li, Yongxin; Li, Zhongrui; Yamanaka, Kazuya; Xu, Ying; Zhang, Weipeng; Vlamakis, Hera; Kolter, Roberto; Moore, Bradley S.; Qian, Pei-Yuan

2015-03-01

Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning "plug-and-play" approach with surfactin, we genetically interrogated amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct cloning based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.
Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis.

PubMed

Li, Yongxin; Li, Zhongrui; Yamanaka, Kazuya; Xu, Ying; Zhang, Weipeng; Vlamakis, Hera; Kolter, Roberto; Moore, Bradley S; Qian, Pei-Yuan

2015-03-24

Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning "plug-and-play" approach with surfactin, we genetically interrogated amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct cloning based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.
Impact of missing data imputation methods on gene expression clustering and classification.

PubMed

de Souto, Marcilio C P; Jaskowiak, Pablo A; Costa, Ivan G

2015-02-26

Several missing value imputation methods for gene expression data have been proposed in the literature. In the past few years, researchers have been putting a great deal of effort into presenting systematic evaluations of the different imputation algorithms. Initially, most algorithms were assessed with an emphasis on the accuracy of the imputation, using metrics such as the root mean squared error. However, it has become clear that the success of the estimation of the expression value should be evaluated in more practical terms as well. One can consider, for example, the ability of the method to preserve the significant genes in the dataset, or its discriminative/predictive power for classification/clustering purposes. We performed a broad analysis of the impact of five well-known missing value imputation methods on three clustering and four classification methods, in the context of 12 cancer gene expression datasets. We employed a statistical framework, for the first time in this field, to assess whether different imputation methods improve the performance of the clustering/classification methods. Our results suggest that the imputation methods evaluated have a minor impact on the classification and downstream clustering analyses. Simple methods such as replacing the missing values by mean or the median values performed as well as more complex strategies. The datasets analyzed in this study are available at http://costalab.org/Imputation/ .
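The simple mean-replacement strategy and the root mean squared error metric mentioned above can be sketched as follows. The abstract does not say whether the mean is taken per gene or per sample; this sketch assumes a per-gene (row) mean, marks missing values with `None`, and uses toy numbers and hypothetical function names.

```python
import math

def impute_row_mean(matrix):
    """Replace None in each gene row by the mean of that row's observed values."""
    imputed = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        fill = sum(observed) / len(observed) if observed else 0.0
        imputed.append([fill if v is None else v for v in row])
    return imputed

def rmse(true_values, estimates):
    """Root mean squared error between held-out true values and their estimates."""
    squared = [(t - e) ** 2 for t, e in zip(true_values, estimates)]
    return math.sqrt(sum(squared) / len(squared))

# Toy expression matrix (genes x samples) with two masked entries.
masked = [[1.0, None, 3.0],
          [2.0, 2.0, None]]
filled = impute_row_mean(masked)
# Toy "true" values for the masked entries, in row order.
error = rmse([2.5, 2.0], [filled[0][1], filled[1][2]])
```

As the abstract stresses, a low imputation error is only one criterion; the effect on downstream clustering or classification has to be assessed separately.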
Transcriptomic analysis of neuregulin-1 regulated genes following ischemic stroke by computational identification of promoter binding sites: A role for the ETS-1 transcription factor.

PubMed

Surles-Zeigler, Monique C; Li, Yonggang; Distel, Timothy J; Omotayo, Hakeem; Ge, Shaokui; Ford, Byron D

2018-01-01

Ischemic stroke is a major cause of mortality in the United States. We previously showed that neuregulin-1 (NRG1) was neuroprotective in rat models of ischemic stroke. We used gene expression profiling to understand the early cellular and molecular mechanisms of NRG1's effects after the induction of ischemia. Ischemic stroke was induced by middle cerebral artery occlusion (MCAO). Rats were allocated to 3 groups: (1) control, (2) MCAO and (3) MCAO + NRG1. Cortical brain tissues were collected three hours following MCAO and NRG1 treatment and subjected to microarray analysis. Data and statistical analyses were performed using R/Bioconductor platform alongside Genesis, Ingenuity Pathway Analysis and Enrichr software packages. There were 2693 genes differentially regulated following ischemia and NRG1 treatment. These genes were organized by expression patterns into clusters using a K-means clustering algorithm. We further analyzed genes in clusters where ischemia altered gene expression, which was reversed by NRG1 (clusters 4 and 10). NRG1, IRS1, OPA3, and POU6F1 were central linking (node) genes in cluster 4. Conserved Transcription Factor Binding Site Finder (CONFAC) identified ETS-1 as a potential transcriptional regulator of NRG1 suppressed genes following ischemia. A transcription factor activity array showed that ETS-1 activity was increased 2-fold, 3 hours following ischemia and this activity was attenuated by NRG1. These findings reveal key early transcriptional mechanisms associated with neuroprotection by NRG1 in the ischemic penumbra.
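Grouping genes by expression pattern with K-means can be illustrated with a small sketch of Lloyd's algorithm. The profiles below are invented toy values over three conditions (control, MCAO, MCAO + NRG1), the initial centroids are supplied by the caller to keep the run deterministic, and nothing here reproduces the software or settings used in the study.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, initial_centroids, n_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    recompute centroids as cluster means, stop when labels no longer change."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy log-ratio profiles: (control, MCAO, MCAO + NRG1).
profiles = [[0.0, 2.0, 0.0],   # raised by ischemia, reversed by treatment
            [0.0, 2.2, 0.1],
            [0.0, 0.0, 0.0],   # unchanged
            [0.1, -0.1, 0.0]]
labels, centroids = kmeans(profiles, [profiles[0], profiles[2]])
```

In practice the number of clusters and the initialization both influence the result, so several random restarts are commonly compared.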
New natural products isolated from Metarhizium robertsii ARSEF 23 by chemical screening and identification of the gene cluster through engineered biosynthesis in Aspergillus nidulans A1145.

PubMed

Kato, Hiroki; Tsunematsu, Yuta; Yamamoto, Tsuyoshi; Namiki, Takuya; Kishimoto, Shinji; Noguchi, Hiroshi; Watanabe, Kenji

2016-07-01

To rapidly identify novel natural products and their associated biosynthetic genes from underutilized and genetically difficult-to-manipulate microbes, we developed a method that uses (1) chemical screening to isolate novel microbial secondary metabolites, (2) bioinformatic analyses to identify a potential biosynthetic gene cluster and (3) heterologous expression of the genes in a convenient host to confirm the identity of the gene cluster and the proposed biosynthetic mechanism. The chemical screen was achieved by searching known natural product databases with data from liquid chromatographic and high-resolution mass spectrometric analyses collected on the extract from a target microbe culture. Using this method, we were able to isolate two new meroterpenes, subglutinols C (1) and D (2), from an entomopathogenic filamentous fungus Metarhizium robertsii ARSEF 23. Bioinformatics analysis of the genome allowed us to identify a gene cluster likely to be responsible for the formation of subglutinols. Heterologous expression of three genes from the gene cluster encoding a polyketide synthase, a prenyltransferase and a geranylgeranyl pyrophosphate synthase in Aspergillus nidulans A1145 afforded an α-pyrone-fused uncyclized diterpene, the expected intermediate of the subglutinol biosynthesis, thereby confirming the gene cluster to be responsible for the subglutinol biosynthesis. These results indicate the usefulness of our methodology in isolating new natural products and identifying their associated biosynthetic gene cluster from microbes that are not amenable to genetic manipulation. Our method should facilitate the natural product discovery efforts by expediting the identification of new secondary metabolites and their associated biosynthetic genes from a wider source of microbes.
HOX Genes in Human Lung

PubMed Central

Golpon, Heiko A.; Geraci, Mark W.; Moore, Mark D.; Miller, Heidi L.; Miller, Gary J.; Tuder, Rubin M.; Voelkel, Norbert F.

2001-01-01

HOX genes belong to the large family of homeodomain genes that function as transcription factors. Animal studies indicate that they play an essential role in lung development. We investigated the expression pattern of HOX genes in human lung tissue by using microarray and degenerate reverse transcriptase-polymerase chain reaction survey techniques. HOX genes predominantly from the 3′ end of clusters A and B were expressed in normal human adult lung and among them HOXA5 was the most abundant, followed by HOXB2 and HOXB6. In fetal (12 weeks old) and diseased lung specimens (emphysema, primary pulmonary hypertension) additional HOX genes from clusters C and D were expressed. Using in situ hybridization, transcripts for HOXA5 were predominantly found in alveolar septal and epithelial cells, both in normal and diseased lungs. A 2.5-fold increase in HOXA5 mRNA expression was demonstrated by quantitative reverse transcriptase-polymerase chain reaction in primary pulmonary hypertension lung specimens when compared to normal lung tissue. In conclusion, we demonstrate that HOX genes are selectively expressed in the human lung. Differences in the pattern of HOX gene expression exist among fetal, adult, and diseased lung specimens. The altered pattern of HOX gene expression may contribute to the development of pulmonary diseases. PMID:11238043
MeSH key terms for validation and annotation of gene expression clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rechtsteiner, A.; Rocha, L. M.

2004-01-01

Integration of different sources of information is a great challenge for the analysis of gene expression data, and for the field of Functional Genomics in general. As the availability of numerical data from high-throughput methods increases, so does the need for technologies that assist in the validation and evaluation of the biological significance of results extracted from these data. In mRNA assaying with microarrays, for example, numerical analysis often attempts to identify clusters of co-expressed genes. The important task of finding the biological significance of the results and validating them has so far mostly fallen to the biological expert, who had to perform this task manually. One of the most promising avenues to develop automated and integrative technology for such tasks lies in the application of modern Information Retrieval (IR) and Knowledge Management (KM) algorithms to databases with biomedical publications and data. Examples of databases available for the field are bibliographic databases containing scientific publications (e.g. MEDLINE/PUBMED), databases containing sequence data (e.g. GenBank) and databases of semantic annotations (e.g. the Gene Ontology Consortium and Medical Subject Headings (MeSH)). We present here an approach that uses the MeSH terms and their concept hierarchies to validate and obtain functional information for gene expression clusters. The controlled and hierarchical MeSH vocabulary is used by the National Library of Medicine (NLM) to index all the articles cited in MEDLINE. Such indexing with a controlled vocabulary eliminates some of the ambiguity due to polysemy (terms that have multiple meanings) and synonymy (multiple terms that have similar meanings) that would be encountered if terms were extracted directly from the articles, due to differing article contexts or author preferences and background. 
Further, the hierarchical organization of the MeSH terms can illustrate the conceptual/functional relationships of genes associated with MeSH terms. MeSH terms can be associated with genes through co-occurrence of these in MEDLINE citations, i.e. the genes occur in titles or abstracts and the MeSH terms are assigned by experts. To identify MeSH terms associated with a group of genes we used the tool MESHGENE developed at the Information Dynamics Lab at HP Labs (http://www-idl.hpl.hp.com/meshgene/). When presented with a list of human genes, MESHGENE uses some sophisticated techniques to search for these gene symbols in the titles and abstracts of all MEDLINE citations. MeSH terms and the number of co-occurrences can be retrieved. Gene symbols that are aliases of each other are pooled from several databases. This addresses the problem of synonymy, the fact that several symbols can refer to the same gene. MESHGENE employs some sophisticated algorithms that disregard symbols that are likely to be acronyms for concepts other than a gene. This addresses the problem of polysemy, i.e. possible multiple meanings of a gene symbol. We applied our approach to gene expression data from herpes virus infected human fibroblast cells. The data contains 12 time-points, between 1/2 hr and 48 hrs after infection. Singular Value Decomposition (SVD) was used to identify the dominant modes of expression. 75% of the variance in the expression data was captured by the first two modes, the first exhibiting a monotonically increasing expression pattern and the second a more transient pattern. Projection of the gene expression vectors onto these first two modes identified 3 statistically significant clusters of co-expressed genes. 500 genes from cluster 1 and 300 genes from clusters 2 and 3 each were uploaded to MESHGENE and the MeSH terms and co-occurrence values were retrieved. MeSH terms were also obtained for 5 groups of randomly selected genes with similar numbers of genes. 
The log was taken of the co-occurrence values and for each MeSH term these log co-occurrence values were summed for each group over the genes in that group. A matrix with 8 columns for the 8 groups of genes and with 14,000 rows with the MeSH terms was obtained. To analyze this association matrix we used a Latent Semantic Analysis (LSA) approach. We applied SVD to this gene-group vs. MeSH term association matrix. The first 2 modes, which capture most of the variation (and therefore usually most of the information) in the association matrix, were highly associated with MeSH terms that occurred uniquely or disproportionately in the 3 gene clusters. MeSH terms highly associated with the 5 groups of randomly selected genes were associated with the lower modes. These modes seem to just capture 'noise' in the association matrix. This result by itself is of great interest for gene expression analysis. We were able to show that the 3 clusters of genes not only separated in 'expression space' but also in the MeSH term space with which they are associated through the literature.
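The construction of the gene-group vs. MeSH term association matrix (log of each co-occurrence count, summed per term over the genes in a group) can be sketched as below. The base of the logarithm and the treatment of absent gene-term pairs are not stated in the abstract; this sketch assumes the natural log and skips pairs with no co-occurrence. All gene names, term names and counts are placeholders, and `association_matrix` is a hypothetical helper, not part of MESHGENE.

```python
import math

def association_matrix(groups, cooccurrence):
    """Build rows of summed log co-occurrence values.

    groups: dict mapping group name -> list of gene symbols.
    cooccurrence: dict mapping (gene, mesh_term) -> positive count.
    Returns (group_names, rows), where rows maps each MeSH term to a list
    with one summed log value per group, in the order of group_names.
    """
    group_names = list(groups)
    terms = sorted({term for (_, term) in cooccurrence})
    rows = {}
    for term in terms:
        rows[term] = [
            sum(math.log(cooccurrence[(gene, term)])
                for gene in groups[name]
                if cooccurrence.get((gene, term), 0) > 0)  # assumption: skip absent pairs
            for name in group_names
        ]
    return group_names, rows

# Placeholder groups and counts.
groups = {"cluster_1": ["g1", "g2"], "random_1": ["g3"]}
counts = {("g1", "Term X"): 10, ("g2", "Term X"): 100, ("g3", "Term X"): 1,
          ("g3", "Term Y"): 20}
names, rows = association_matrix(groups, counts)
```

The resulting matrix (terms as rows, groups as columns) is the kind of input to which an SVD would then be applied in a Latent Semantic Analysis.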
Evolution of Daily Gene Co-expression Patterns from Algae to Plants

PubMed Central

de los Reyes, Pedro; Romero-Campero, Francisco J.; Ruiz, M. Teresa; Romero, José M.; Valverde, Federico

2017-01-01

Daily rhythms play a key role in transcriptome regulation in plants and microalgae orchestrating responses that, among other processes, anticipate light transitions that are essential for their metabolism and development. The recent accumulation of genome-wide transcriptomic data generated under alternating light:dark periods from plants and microalgae has made possible integrative and comparative analysis that could help shed light on the evolution of daily rhythms in the green lineage. In this work, RNA-seq and microarray data generated over 24 h periods in different light regimes from the eudicot Arabidopsis thaliana and the microalgae Chlamydomonas reinhardtii and Ostreococcus tauri have been integrated and analyzed using gene co-expression networks. This analysis revealed a reduction in the size of the daily rhythmic transcriptome from around 90% in Ostreococcus, being heavily influenced by light transitions, to around 40% in Arabidopsis, where a certain independence from light transitions can be observed. A novel Multiple Bidirectional Best Hit (MBBH) algorithm was applied to associate single genes with a family of potential orthologues from evolutionary distant species. Gene duplication, amplification and divergence of rhythmic expression profiles seem to have played a central role in the evolution of gene families in the green lineage such as Pseudo Response Regulators (PRRs), CONSTANS-Likes (COLs), and DNA-binding with One Finger (DOFs). Gene clustering and functional enrichment have been used to identify groups of genes with similar rhythmic gene expression patterns. The comparison of gene clusters between species based on potential orthologous relationships has unveiled a low to moderate level of conservation of daily rhythmic expression patterns. However, a strikingly high conservation was found for the gene clusters exhibiting their highest and/or lowest expression value during the light transitions. PMID:28751903
Identification of a cluster IV pleiotropic drug resistance transporter gene expressed in the style of Nicotiana plumbaginifolia.

PubMed

Trombik, Tomasz; Jasinski, Michal; Crouzet, Jérome; Boutry, Marc

2008-01-01

ATP-binding cassette transporters of the pleiotropic drug resistance (PDR) subfamily are composed of five clusters. We have cloned a gene, NpPDR2, belonging to the still uncharacterized cluster IV from Nicotiana plumbaginifolia. NpPDR2 transcripts were found in the roots and mature flowers. In the latter, NpPDR2 expression was restricted to the style and only after pollination. A 1.5-kb genomic sequence containing the putative NpPDR2 transcription promoter was fused to the beta-glucuronidase reporter gene. The GUS expression pattern confirmed the RT-PCR results that NpPDR2 was expressed in roots and the flower style and showed that it was localized around the conductive tissues. Unlike other PDR genes, NpPDR2 expression was not induced in leaf tissues by any of the hormones typically involved in biotic and abiotic stress response. Moreover, unlike NpPDR1 known to be involved in biotic stress response, NpPDR2 expression was not induced in the style upon Botrytis cinerea infection. In N. plumbaginifolia plants in which NpPDR2 expression was prevented by RNA interference, no unusual phenotype was observed, including at the flowering stage, which suggests that NpPDR2 is not essential in the reproductive process under the tested conditions.
Allopatric integrations selectively change host transcriptomes, leading to varied expression efficiencies of exotic genes in Myxococcus xanthus.

PubMed

Zhu, Li-Ping; Yue, Xin-Jing; Han, Kui; Li, Zhi-Feng; Zheng, Lian-Shuai; Yi, Xiu-Nan; Wang, Hai-Long; Zhang, You-Ming; Li, Yue-Zhong

2015-07-22

Exotic genes, especially clustered multiple-genes for a complex pathway, are normally integrated into chromosome for heterologous expression. The influences of insertion sites on heterologous expression and allotropic expressions of exotic genes on host remain mostly unclear. We compared the integration and expression efficiencies of single and multiple exotic genes that were inserted into Myxococcus xanthus genome by transposition and attB-site-directed recombination. While the site-directed integration had a rather stable chloramphenicol acetyl transferase (CAT) activity, the transposition produced varied CAT enzyme activities. We attempted to integrate the 56-kb gene cluster for the biosynthesis of antitumor polyketides epothilones into M. xanthus genome by site-direction but failed, which was determined to be due to the insertion size limitation at the attB site. The transposition technique produced many recombinants with varied production capabilities of epothilones, which, however, were not paralleled to the transcriptional characteristics of the local sites where the genes were integrated. Comparative transcriptomics analysis demonstrated that the allopatric integrations caused selective changes of host transcriptomes, leading to varied expressions of epothilone genes in different mutants. With the increase of insertion fragment size, transposition is a more practicable integration method for the expression of exotic genes. Allopatric integrations selectively change host transcriptomes, which lead to varied expression efficiencies of exotic genes.
Arrangement of the Clostridium baratii F7 Toxin Gene Cluster with Identification of a σ Factor That Recognizes the Botulinum Toxin Gene Cluster Promoters

DOE PAGES

Dover, Nir; Barash, Jason R.; Burke, Julianne N.; ...

2014-05-22

Botulinum neurotoxin (BoNT) is the most poisonous substance known and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. In this paper, we sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. Finally, this TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii.
Motif-independent prediction of a secondary metabolism gene cluster using comparative genomics: application to sequenced genomes of Aspergillus and ten other filamentous fungal species.

PubMed

Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki

2014-08-01

Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Gene Expression Profiles in Paired Gingival Biopsies from Periodontitis-Affected and Healthy Tissues Revealed by Massively Parallel Sequencing

PubMed Central

Båge, Tove; Lagervall, Maria; Jansson, Leif; Lundeberg, Joakim; Yucel-Lindberg, Tülay

2012-01-01

Periodontitis is a chronic inflammatory disease affecting the soft tissue and bone that surrounds the teeth. Despite extensive research, distinctive genes responsible for the disease have not been identified. The objective of this study was to elucidate transcriptome changes in periodontitis, by investigating gene expression profiles in gingival tissue obtained from periodontitis-affected and healthy gingiva from the same patient, using RNA-sequencing. Gingival biopsies were obtained from a disease-affected and a healthy site from each of 10 individuals diagnosed with periodontitis. Enrichment analysis performed among uniquely expressed genes for the periodontitis-affected and healthy tissues revealed several regulated pathways indicative of inflammation for the periodontitis-affected condition. Hierarchical clustering of the sequenced biopsies demonstrated clustering according to the degree of inflammation, as observed histologically in the biopsies, rather than clustering at the individual level. Among the top 50 upregulated genes in periodontitis-affected tissues, we investigated two genes which have not previously been demonstrated to be involved in periodontitis. These included interferon regulatory factor 4 and chemokine (C-C motif) ligand 18, which were also expressed at the protein level in gingival biopsies from patients with periodontitis. In conclusion, this study provides a first step towards a quantitative comprehensive insight into the transcriptome changes in periodontitis. We demonstrate for the first time site-specific local variation in gene expression profiles of periodontitis-affected and healthy tissues obtained from patients with periodontitis, using RNA-seq. Further, we have identified novel genes expressed in periodontitis tissues, which may constitute potential therapeutic targets for future treatment strategies of periodontitis. PMID:23029519
Conservation of regulatory sequences and gene expression patterns in the disintegrating Drosophila Hox gene complex

PubMed Central

Negre, Bárbara; Casillas, Sònia; Suzanne, Magali; Sánchez-Herrero, Ernesto; Akam, Michael; Nefedov, Michael; Barbadilla, Antonio; de Jong, Pieter; Ruiz, Alfredo

2005-01-01

Homeotic (Hox) genes are usually clustered and arranged in the same order as they are expressed along the anteroposterior body axis of metazoans. The mechanistic explanation for this colinearity has been elusive, and it may well be that a single and universal cause does not exist. The Hox-gene complex (HOM-C) has been rearranged differently in several Drosophila species, producing a striking diversity of Hox gene organizations. We investigated the genomic and functional consequences of the two HOM-C splits present in Drosophila buzzatii. Firstly, we sequenced two regions of the D. buzzatii genome, one containing the genes labial and abdominal A, and another one including proboscipedia, and compared their organization with that of D. melanogaster and D. pseudoobscura in order to map precisely the two splits. Then, a plethora of conserved noncoding sequences, which are putative enhancers, were identified around the three Hox genes closer to the splits. The position and order of these enhancers are conserved, with minor exceptions, between the three Drosophila species. Finally, we analyzed the expression patterns of the same three genes in embryos and imaginal discs of four Drosophila species with different Hox-gene organizations. The results show that their expression patterns are conserved despite the HOM-C splits. We conclude that, in Drosophila, Hox-gene clustering is not an absolute requirement for proper function. Rather, the organization of Hox genes is modular, and their clustering seems the result of phylogenetic inertia more than functional necessity. PMID:15867430

Identification of new participants in the rainbow trout (Oncorhynchus mykiss) oocyte maturation and ovulation processes using cDNA microarrays

PubMed Central

Bobe, Julien; Montfort, Jerôme; Nguyen, Thaovi; Fostier, Alexis

2006-01-01

Background The hormonal control of oocyte maturation and ovulation as well as the molecular mechanisms of nuclear maturation have been thoroughly studied in fish. In contrast, the other molecular events occurring in the ovary during post-vitellogenesis have received far less attention. Methods Nylon microarrays displaying 9152 rainbow trout cDNAs were hybridized using RNA samples originating from ovarian tissue collected during late vitellogenesis, post-vitellogenesis and oocyte maturation. Differentially expressed genes were identified using a statistical analysis. A supervised clustering analysis was performed using only differentially expressed genes in order to identify gene clusters exhibiting similar expression profiles. In addition, specific genes were selected and their preovulatory ovarian expression was analyzed using real-time PCR. Results From the statistical analysis, 310 differentially expressed genes were identified. Among those genes, 90 were up-regulated at the time of oocyte maturation while 220 exhibited an opposite pattern. After clustering analysis, 90 clones belonging to 3 gene clusters exhibiting the most remarkable expression patterns were kept for further analysis. Using real-time PCR analysis, we observed a strong up-regulation of ion and water transport genes such as aquaporin 4 (aqp4) and pendrin (slc26). In addition, a dramatic up-regulation of vasotocin (avt) gene was observed. Furthermore, angiotensin-converting-enzyme 2 (ace2), coagulation factor V (cf5), adam 22, and the chemokine cxcl14 genes exhibited a sharp up-regulation at the time of oocyte maturation. Finally, ovarian aromatase (cyp19a1) exhibited a dramatic down-regulation over the post-vitellogenic period while a down-regulation of Cytidine monophosphate-N-acetylneuraminic acid hydroxylase (cmah) was observed at the time of oocyte maturation. 
Conclusion We showed the over- or under-expression of more than 300 genes, most of them being previously unstudied or unknown in the fish preovulatory ovary. Our data confirmed the down-regulation of estrogen synthesis genes during the preovulatory period. In addition, the strong up-regulation of aqp4 and slc26 genes prior to ovulation suggests their participation in the oocyte hydration process occurring at that time. Furthermore, among the most up-regulated clones, several genes such as cxcl14, ace2, adam22, cf5 have pro-inflammatory, vasodilatory, proteolytic and coagulatory functions. The identity and expression patterns of those genes support the theory comparing ovulation to an inflammatory-like reaction. PMID:16872517
Biosynthesis of the acetyl‐CoA carboxylase‐inhibiting antibiotic, andrimid in Serratia is regulated by Hfq and the LysR‐type transcriptional regulator, AdmX

PubMed Central

Nogellova, Veronika; Morel, Bertrand; Krell, Tino

2016-01-01

Summary Infections due to multidrug‐resistant bacteria represent a major global health challenge. To combat this problem, new antibiotics are urgently needed and some plant‐associated bacteria are a promising source. The rhizobacterium Serratia plymuthica A153 produces several bioactive secondary metabolites, including the anti‐oomycete and antifungal haterumalide, oocydin A and the broad spectrum polyamine antibiotic, zeamine. In this study, we show that A153 produces a second broad spectrum antibiotic, andrimid. Using genome sequencing, comparative genomics and mutagenesis, we defined new genes involved in andrimid (adm) biosynthesis. Both the expression of the adm gene cluster and regulation of andrimid synthesis were investigated. The biosynthetic cluster is operonic and its expression is modulated by various environmental cues, including temperature and carbon source. Analysis of the genome context of the adm operon revealed a gene encoding a predicted LysR‐type regulator, AdmX, apparently unique to Serratia strains. Mutagenesis and gene expression assays demonstrated that AdmX is a transcriptional activator of the adm gene cluster. At the post‐transcriptional level, the expression of the adm cluster is positively regulated by the RNA chaperone, Hfq, in an RpoS‐independent manner. Our results highlight the complexity of andrimid biosynthesis – an antibiotic with potential clinical and agricultural utility. PMID:26914969
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering

NASA Technical Reports Server (NTRS)

Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland

2000-01-01

Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review datamining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Heterologous expression of oxytetracycline biosynthetic gene cluster in Streptomyces venezuelae WVR2006 to improve production level and to alter fermentation process.

PubMed

Yin, Shouliang; Li, Zilong; Wang, Xuefeng; Wang, Huizhuan; Jia, Xiaole; Ai, Guomin; Bai, Zishang; Shi, Mingxin; Yuan, Fang; Liu, Tiejun; Wang, Weishan; Yang, Keqian

2016-12-01

Heterologous expression is an important strategy to activate biosynthetic gene clusters of secondary metabolites. Here, it is employed to activate and manipulate the oxytetracycline (OTC) gene cluster and to alter OTC fermentation process. To achieve these goals, a fast-growing heterologous host Streptomyces venezuelae WVR2006 was rationally selected among several potential hosts. It shows rapid and dispersed growth and intrinsic high resistance to OTC. By manipulating the expression of two cluster-situated regulators (CSR) OtcR and OtrR and precursor supply, the OTC production level was significantly increased in this heterologous host from 75 to 431 mg/l only in 48 h, a level comparable to the native producer Streptomyces rimosus M4018 in 8 days. This work shows that S. venezuelae WVR2006 is a promising chassis for the production of secondary metabolites, and the engineered heterologous OTC producer has the potential to completely alter the fermentation process of OTC production.
Hox gene expression during postlarval development of the polychaete Alitta virens.

PubMed

Bakalenko, Nadezhda I; Novikova, Elena L; Nesterenko, Alexander Y; Kulakova, Milana A

2013-05-01

Hox genes are the family of transcription factors that play a key role in the patterning of the anterior-posterior axis of all bilaterian animals. These genes display clustered organization and colinear expression. Expression boundaries of individual Hox genes usually correspond with morphological boundaries of the body. Previously, we studied Hox gene expression during larval development of the polychaete Alitta virens (formerly Nereis virens) and discovered that Hox genes are expressed in nereid larva according to the spatial colinearity principle. Adult Alitta virens consist of multiple morphologically similar segments, which are formed sequentially in the growth zone. Since the worm grows for most of its life, postlarval segments constantly change their position along the anterior-posterior axis. We studied the expression dynamics of the Hox cluster during postlarval development of the nereid Alitta virens and found that 8 out of 11 Hox genes are transcribed as wide gene-specific gradients in the ventral nerve cord, ectoderm, and mesoderm. The expression domains constantly shift in accordance with the changing proportions of the growing worm, so expression domains of most Hox genes do not have stable anterior and/or posterior boundaries. In the course of our study, we revealed long antisense RNA (asRNA) for some Hox genes. Expression patterns of two of these genes were analyzed using whole-mount in-situ hybridization. This is the first discovery of antisense RNA for Hox genes in Lophotrochozoa. Hox gene expression in juvenile A. virens differs significantly from Hox gene expression patterns both in A. virens larva and in other Bilateria. We suppose that the postlarval function of the Hox genes in this polychaete is to establish and maintain positional coordinates in a constantly growing body, as opposed to creating morphological difference between segments.
Expression map of a complete set of gustatory receptor genes in chemosensory organs of Bombyx mori.

PubMed

Guo, Huizhen; Cheng, Tingcai; Chen, Zhiwei; Jiang, Liang; Guo, Youbing; Liu, Jianqiu; Li, Shenglong; Taniai, Kiyoko; Asaoka, Kiyoshi; Kadono-Okuda, Keiko; Arunkumar, Kallare P; Wu, Jiaqi; Kishino, Hirohisa; Zhang, Huijie; Seth, Rakesh K; Gopinathan, Karumathil P; Montagné, Nicolas; Jacquin-Joly, Emmanuelle; Goldsmith, Marian R; Xia, Qingyou; Mita, Kazuei

2017-03-01

Most lepidopteran species are herbivores, and interaction with host plants affects their gene expression and behavior as well as their genome evolution. Gustatory receptors (Grs) are expected to mediate host plant selection, feeding, oviposition and courtship behavior. However, due to their high diversity, sequence divergence and extremely low level of expression it has been difficult to identify precisely a complete set of Grs in Lepidoptera. By manual annotation and BAC sequencing, we improved annotation of 43 gene sequences compared with previously reported Grs in the most studied lepidopteran model, the silkworm, Bombyx mori, and identified 7 new tandem copies of BmGr30 on chromosome 7, bringing the total number of BmGrs to 76. Among these, we mapped 68 genes to chromosomes in a newly constructed chromosome distribution map and 8 genes to scaffolds; we also found new evidence for large clusters of BmGrs, especially from the bitter receptor family. RNA-seq analysis of diverse BmGr expression patterns in chemosensory organs of larvae and adults enabled us to draw a precise organ specific map of BmGr expression. Interestingly, most of the clustered genes were expressed in the same tissues and more than half of the genes were expressed in larval maxillae, larval thoracic legs and adult legs. For example, BmGr63 showed high expression levels in all organs in both larval and adult stages. By contrast, some genes showed expression limited to specific developmental stages or organs and tissues. BmGr19 was highly expressed in larval chemosensory organs (especially antennae and thoracic legs), the single exon genes BmGr53 and BmGr67 were expressed exclusively in larval tissues, the BmGr27-BmGr31 gene cluster on chr7 displayed a high expression level limited to adult legs and the candidate CO2 receptor BmGr2 was highly expressed in adult antennae, where few other Grs were expressed. Transcriptional analysis of the Grs in B. mori provides a valuable new reference for finding genes involved in plant-insect interactions in Lepidoptera and establishing correlations between these genes and vital insect behaviors like host plant selection and courtship for mating. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Rapid generation of recombinant Pseudomonas putida secondary metabolite producers using yTREX.

PubMed

Domröse, Andreas; Weihmann, Robin; Thies, Stephan; Jaeger, Karl-Erich; Drepper, Thomas; Loeschcke, Anita

2017-12-01

Microbial secondary metabolites represent a rich source of valuable compounds with a variety of applications in medicine or agriculture. Effective exploitation of this wealth of chemicals requires the functional expression of the respective biosynthetic genes in amenable heterologous hosts. We have previously established the TREX system which facilitates the transfer, integration and expression of biosynthetic gene clusters in various bacterial hosts. Here, we describe the yTREX system, a new tool adapted for one-step yeast recombinational cloning of gene clusters. We show that with yTREX, Pseudomonas putida secondary metabolite production strains can rapidly be constructed by random targeting of chromosomal promoters by Tn5 transposition. Feasibility of this approach was corroborated by prodigiosin production after yTREX cloning, transfer and expression of the respective biosynthesis genes from Serratia marcescens. Furthermore, the applicability of the system for effective pathway rerouting by gene cluster adaptation was demonstrated using the violacein biosynthesis gene cluster from Chromobacterium violaceum, producing pathway metabolites violacein, deoxyviolacein, prodeoxyviolacein, and deoxychromoviridans. Clones producing both prodigiosin and violaceins could be readily identified among clones obtained after random chromosomal integration by their strong color-phenotype. Finally, the addition of a promoter-less reporter gene enabled facile detection also of phenazine-producing clones after transfer of the respective phenazine-1-carboxylic acid biosynthesis genes from Pseudomonas aeruginosa. All compounds accumulated to substantial titers in the mg range. We thus corroborate here the suitability of P. putida for the biosynthesis of diverse natural products, and demonstrate that the yTREX system effectively enables the rapid generation of secondary metabolite producing bacteria by activation of heterologous gene clusters, applicable for natural compound discovery and combinatorial biosynthesis.
HOX gene expression in phenotypic and genotypic subgroups and low HOXA gene expression as an adverse prognostic factor in pediatric ALL.

PubMed

Starkova, Julia; Zamostna, Blanka; Mejstrikova, Ester; Krejci, Roman; Drabkin, Harry A; Trka, Jan

2010-12-01

HOX genes play an important role in both normal lymphopoiesis and leukemogenesis. However, HOX expression patterns in leukemia cells compared to normal lymphoid progenitors have not been systematically studied in acute lymphoblastic leukemia (ALL) subtypes. The RNA expression levels of HOXA, HOXB, and CDX1/2 genes were analyzed by qRT-PCR in a cohort of 61 diagnostic pediatric ALL samples and FACS-sorted subpopulations of normal lymphoid progenitors. The RNA expression of HOXA7-10, HOXA13, and HOXB2-4 genes was exclusively detected in leukemic cells and immature progenitors. The RNA expression of HOXB6 and CDX2 genes was exclusively detected in leukemic cells but not in B-lineage cells at any of the studied developmental stages. HOXA3-4, HOXA7, and HOXB3-4 genes were differentially expressed between BCP-ALL and T-ALL subgroups, and among genotypically defined MLL/AF4, TEL/AML1, BCR/ABL, hyperdiploid and normal karyotype subgroups. However, this differential expression did not define specific clusters in hierarchical cluster analysis. HOXA7 gene was low expressed at the RNA level in patients with hyperdiploid leukemia, whereas HOXB7 and CDX2 genes were low expressed in TEL/AML1-positive and BCR/ABL-positive cases, respectively. In contrast to previous findings in acute myeloid leukemia, high HOXA RNA expression was associated with an excellent prognosis in Cox's regression model (P = 0.03). In MLL/AF4-positive ALL, lower HOXA RNA expression correlated with the methylation status of their promoters. HOX gene RNA expression cannot discriminate leukemia subgroups or relative maturity of leukemic cells. However, HOXA RNA expression correlates with prognosis, and particular HOX genes are expressed in specific genotypically characterized subgroups.
Ancient origin of placental expression in the growth hormone genes of anthropoid primates

PubMed Central

Papper, Zack; Jameson, Natalie M.; Romero, Roberto; Weckle, Amy L.; Mittal, Pooja; Benirschke, Kurt; Santolaya-Forgas, Joaquin; Uddin, Monica; Haig, David; Goodman, Morris; Wildman, Derek E.

2009-01-01

In anthropoid primates, growth hormone (GH) genes have undergone at least 2 independent locus expansions, one in platyrrhines (New World monkeys) and another in catarrhines (Old World monkeys and apes). In catarrhines, the GH cluster has a pituitary-expressed gene called GH1; the remaining GH genes include placental GHs and placental lactogens. Here, we provide cDNA sequence evidence that the platyrrhine GH cluster also includes at least 3 placenta expressed genes and phylogenetic evidence that placenta expressed anthropoid GH genes have undergone strong adaptive evolution, whereas pituitary-expressed GH genes have faced strict functional constraint. Our phylogenetic evidence also points to lineage-specific gene gain and loss in early placental mammalian evolution, with at least three copies of the GH gene present at the time of the last common ancestor (LCA) of primates, rodents, and laurasiatherians. Anthropoid primates and laurasiatherians share gene descendants of one of these three copies, whereas rodents and strepsirrhine primates each maintain a separate copy. Eight of the amino-acid replacements that occurred on the lineage leading to the LCA of extant anthropoids have been implicated in GH signaling at the maternal-fetal interface. Thus, placental expression of GH may have preceded the separate series of GH gene duplications that occurred in catarrhines and platyrrhines (i.e., the roles played by placenta-expressed GHs in human pregnancy may have a longer evolutionary history than previously appreciated). PMID:19805162
Ancient origin of placental expression in the growth hormone genes of anthropoid primates.

PubMed

Papper, Zack; Jameson, Natalie M; Romero, Roberto; Weckle, Amy L; Mittal, Pooja; Benirschke, Kurt; Santolaya-Forgas, Joaquin; Uddin, Monica; Haig, David; Goodman, Morris; Wildman, Derek E

2009-10-06

In anthropoid primates, growth hormone (GH) genes have undergone at least 2 independent locus expansions, one in platyrrhines (New World monkeys) and another in catarrhines (Old World monkeys and apes). In catarrhines, the GH cluster has a pituitary-expressed gene called GH1; the remaining GH genes include placental GHs and placental lactogens. Here, we provide cDNA sequence evidence that the platyrrhine GH cluster also includes at least 3 placenta expressed genes and phylogenetic evidence that placenta expressed anthropoid GH genes have undergone strong adaptive evolution, whereas pituitary-expressed GH genes have faced strict functional constraint. Our phylogenetic evidence also points to lineage-specific gene gain and loss in early placental mammalian evolution, with at least three copies of the GH gene present at the time of the last common ancestor (LCA) of primates, rodents, and laurasiatherians. Anthropoid primates and laurasiatherians share gene descendants of one of these three copies, whereas rodents and strepsirrhine primates each maintain a separate copy. Eight of the amino-acid replacements that occurred on the lineage leading to the LCA of extant anthropoids have been implicated in GH signaling at the maternal-fetal interface. Thus, placental expression of GH may have preceded the separate series of GH gene duplications that occurred in catarrhines and platyrrhines (i.e., the roles played by placenta-expressed GHs in human pregnancy may have a longer evolutionary history than previously appreciated).
Identification of lethal cluster of genes in the yeast transcription network

NASA Astrophysics Data System (ADS)

Rho, K.; Jeong, H.; Kahng, B.

2006-05-01

Identification of essential or lethal genes is one of the ultimate goals of drug design. Here we introduce an in silico method to select the cluster with a high population of lethal genes, called the lethal cluster, from microarray data. We construct a gene transcription network based on the microarray expression level. Links are added one by one in descending order of the Pearson correlation coefficient between two genes. As the link density p increases, two meaningful link densities pm and ps are observed. At pm, which is smaller than the percolation threshold, the number of disconnected clusters is maximum, and the lethal genes are highly concentrated in a certain cluster that needs to be identified. Thus the deletion of all genes in that cluster could efficiently lead to an inviable mutant. This lethal cluster can be identified by an in silico method. As p increases further beyond the percolation threshold, power-law behavior in the degree distribution of a giant cluster appears at ps. We measure the degree of each gene at ps. With the information pertaining to the degree of each gene at ps, we return to the point pm and calculate the mean degree of the genes in each cluster. We find that the lethal cluster has the largest mean degree.
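The link-addition step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `pearson` and `cluster_counts`, the toy profiles, and the choice to count only clusters containing at least two genes are all assumptions made here for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_counts(profiles):
    """Add links in descending order of correlation and, after each link,
    record the number of clusters that contain at least two genes
    (assumption: isolated genes are not counted as clusters)."""
    genes = list(profiles)
    parent = {g: g for g in genes}
    size = {g: 1 for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    links = sorted(
        ((pearson(profiles[a], profiles[b]), a, b)
         for a, b in combinations(genes, 2)),
        reverse=True,
    )
    counts, n_clusters = [], 0
    for _, a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] == 1 and size[rb] == 1:
                n_clusters += 1  # two isolated genes form a new cluster
            elif size[ra] > 1 and size[rb] > 1:
                n_clusters -= 1  # two existing clusters merge
            # Otherwise an isolated gene joins a cluster: count unchanged.
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
        counts.append(n_clusters)
    return counts


# Hypothetical expression profiles for four genes (illustrative only).
toy = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.1],
    "g3": [4.0, 3.0, 2.0, 1.2],
    "g4": [8.0, 6.0, 4.0, 2.0],
}
counts = cluster_counts(toy)
print(counts)
```

The index of the largest entry in `counts`, divided by the number of possible links, would give a rough analogue of the link density pm at which the number of clusters peaks.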
Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

DOE PAGES

Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; ...

2016-11-24

Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.
Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.

Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.
The Association of Multiple Interacting Genes with Specific Phenotypes in Rice Using Gene Coexpression Networks

PubMed Central

Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex

2010-01-01

Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
The association of multiple interacting genes with specific phenotypes in rice using gene coexpression networks.

PubMed

Ficklin, Stephen P; Luo, Feng; Feltus, F Alex

2010-09-01

Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.
An Integrated workflow for phenazine biosynthetic gene cluster discovery and characterization

USDA-ARS's Scientific Manuscript database

Increasing availability of new genomes and putative biosynthetic gene clusters (BGCs) has extended the opportunity to access novel chemical diversity for agriculture, medicine, environmental and industrial purposes. However, functional characterization of BGCs through heterologous expression is limi...
Serial analysis of gene expression (SAGE) in normal human trabecular meshwork.

PubMed

Liu, Yutao; Munro, Drew; Layfield, David; Dellinger, Andrew; Walter, Jeffrey; Peterson, Katherine; Rickman, Catherine Bowes; Allingham, R Rand; Hauser, Michael A

2011-04-08

To identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.
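As a quick arithmetic check on the tag counts quoted in this abstract, the three per-library figures can be summed and expressed as shares of the total. The function name `tag_summary` is introduced here for illustration only; the three counts are taken from the abstract.

```python
def tag_summary(library_tags):
    """Return the total tag count and each library's share in percent."""
    total = sum(library_tags)
    shares = [round(100 * n / total, 1) for n in library_tags]
    return total, shares


# Per-library SAGE tag counts as reported in the abstract.
total, shares = tag_summary([96_842, 88_126, 113_866])
print(total)   # 298834, matching the reported total of 298,834 tags
print(shares)
```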
Identification and Analysis of the Biosynthetic Gene Cluster Encoding the Thiopeptide Antibiotic Cyclothiazomycin in Streptomyces hygroscopicus 10-22

PubMed Central

Wang, Jiang; Yu, Yi; Tang, Kexuan; Liu, Wen; He, Xinyi; Huang, Xi; Deng, Zixin

2010-01-01

Thiopeptide antibiotics are an important class of natural products resulting from posttranslational modifications of ribosomally synthesized peptides. Cyclothiazomycin is a typical thiopeptide antibiotic that has a unique bridged macrocyclic structure derived from an 18-amino-acid structural peptide. Here we report the cloning, sequencing, and heterologous expression of the cyclothiazomycin biosynthetic gene cluster from Streptomyces hygroscopicus 10-22. Remarkably, successful heterologous expression of a 22.7-kb gene cluster in Streptomyces lividans 1326 suggested that there is a minimum set of 15 open reading frames that includes all of the functional genes required for cyclothiazomycin production. Six of these genes, cltBCDEFG, flanking the structural gene cltA, were predicted to encode the enzymes required for the main framework of cyclothiazomycin, and two enzymes encoded by a putative operon, cltMN, were hypothesized to participate in the tailoring step to generate the tertiary thioether, leading to the final cyclization of the bridged macrocyclic structure. This rigorous bioinformatics analysis based on heterologous expression of cyclothiazomycin resulted in an ideal biosynthetic model for us to understand the biosynthesis of thiopeptides. PMID:20154110
CRAWview: for viewing splicing variation, gene families, and polymorphism in clusters of ESTs and full-length sequences.

PubMed

Chou, A; Burke, J

1999-05-01

DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. We present CRAWview, a Java based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color coded index. The reporting format of CRAWview gives a brief, high level summary report to display overlap and divergence within clusters of sequences as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).
Carbon-dependent control of electron transfer and central carbon pathway genes for methane biosynthesis in the Archaean, Methanosarcina acetivorans strain C2A

PubMed Central

2010-01-01

Background The archaeon, Methanosarcina acetivorans strain C2A forms methane, a potent greenhouse gas, from a variety of one-carbon substrates and acetate. Whereas the biochemical pathways leading to methane formation are well understood, little is known about the expression of many of the genes that encode proteins needed for carbon flow, electron transfer and/or energy conservation. Quantitative transcript analysis was performed on twenty gene clusters encompassing over one hundred genes in M. acetivorans that encode enzymes/proteins with known or potential roles in substrate conversion to methane. Results The expression of many seemingly "redundant" genes/gene clusters establishes substrate-dependent control of approximately seventy genes for methane production by the pathways for methanol and acetate utilization. These include genes for soluble-type and membrane-type heterodisulfide reductases (hdr), hydrogenases including genes for a vht-type F420 non-reducing hydrogenase, molybdenum-type (fmd) as well as tungsten-type (fwd) formylmethanofuran dehydrogenases, genes for rnf and mrp-type electron transfer complexes, for acetate uptake, plus multiple genes for aha- and atp-type ATP synthesis complexes. Analysis of promoters for seven gene clusters reveals UTR leaders of 51-137 nucleotides in length, raising the possibility of both transcriptional and translational levels of control. Conclusions The above findings establish the differential and coordinated expression of two major gene families in M. acetivorans in response to carbon/energy supply. Furthermore, the quantitative mRNA measurements demonstrate the dynamic range for modulating transcript abundance. Since many of these gene clusters in M. acetivorans are also present in other Methanosarcina species, including M. mazei and M. barkeri, these findings provide a basis for predicting related control in these environmentally significant methanogens. PMID:20178638

Two Gene Clusters Coordinate Galactose and Lactose Metabolism in Streptococcus gordonii

PubMed Central

Zeng, Lin; Martino, Nicole C.

2012-01-01

Streptococcus gordonii is an early colonizer of the human oral cavity and an abundant constituent of oral biofilms. Two tandemly arranged gene clusters, designated lac and gal, were identified in the S. gordonii DL1 genome, which encode genes of the tagatose pathway (lacABCD) and sugar phosphotransferase system (PTS) enzyme II permeases. Genes encoding a predicted phospho-β-galactosidase (LacG), a DeoR family transcriptional regulator (LacR), and a transcriptional antiterminator (LacT) were also present in the clusters. Growth and PTS assays supported that the permease designated EIILac transports lactose and galactose, whereas EIIGal transports galactose. The expression of the gene for EIIGal was markedly upregulated in cells growing on galactose. Using promoter-cat fusions, a role for LacR in the regulation of the expressions of both gene clusters was demonstrated, and the gal cluster was also shown to be sensitive to repression by CcpA. The deletion of lacT caused an inability to grow on lactose, apparently because of its role in the regulation of the expression of the genes for EIILac, but had little effect on galactose utilization. S. gordonii maintained a selective advantage over Streptococcus mutans in a mixed-species competition assay, associated with its possession of a high-affinity galactose PTS, although S. mutans could persist better at low pHs. Collectively, these results support the concept that the galactose and lactose systems of S. gordonii are subject to complex regulation and that a high-affinity galactose PTS may be advantageous when S. gordonii is competing against the caries pathogen S. mutans in oral biofilms. PMID:22660715
ATNT: an enhanced system for expression of polycistronic secondary metabolite gene clusters in Aspergillus niger.

PubMed

Geib, Elena; Brock, Matthias

2017-01-01

Fungi are treasure chests for yet unexplored natural products. However, exploitation of their real potential remains difficult as a significant proportion of biosynthetic gene clusters appears silent under standard laboratory conditions. Therefore, elucidation of novel products requires gene activation or heterologous expression. For heterologous gene expression, we previously developed an expression platform in Aspergillus niger that is based on the transcriptional regulator TerR and its target promoter PterA. In this study, we extended this system by regulating expression of terR by the doxycycline inducible Tet-on system. Reporter genes cloned under the control of the target promoter PterA remained silent in the absence of doxycycline, but were strongly expressed when doxycycline was added. Reporter quantification revealed that the coupled system results in about five times higher expression rates compared to gene expression under direct control of the Tet-on system. As production of secondary metabolites generally requires the expression of several biosynthetic genes, the suitability of the self-cleaving viral peptide sequence P2A was tested in this optimised expression system. P2A allowed polycistronic expression of genes required for Asp-melanin formation in combination with the gene coding for the red fluorescent protein tdTomato. Gene expression and Asp-melanin formation were prevented in the absence of doxycycline and strongly induced by addition of doxycycline. Fluorescence studies confirmed the correct subcellular localisation of the respective enzymes. This tightly regulated but strongly inducible expression system enables high level production of secondary metabolites, most likely even those with toxic potential. Furthermore, this system is compatible with polycistronic gene expression and, thus, suitable for the discovery of novel natural products.
Functional Analyses of NSF1 in Wine Yeast Using Interconnected Correlation Clustering and Molecular Analyses

PubMed Central

Bessonov, Kyrylo; Walkey, Christopher J.; Shelp, Barry J.; van Vuuren, Hennie J. J.; Chiu, David; van der Merwe, George

2013-01-01

Analyzing time-course expression data captured in microarray datasets is a complex undertaking as the vast and complex data space is represented by a relatively low number of samples as compared to thousands of available genes. Here, we developed the Interdependent Correlation Clustering (ICC) method to analyze relationships that exist among genes conditioned on the expression of a specific target gene in microarray data. Based on Correlation Clustering, the ICC method analyzes a large set of correlation values related to gene expression profiles extracted from given microarray datasets. ICC can be applied to any microarray dataset and any target gene. We applied this method to microarray data generated from wine fermentations and selected NSF1, which encodes a C2H2 zinc finger-type transcription factor, as the target gene. The validity of the method was verified by accurate identifications of the previously known functional roles of NSF1. In addition, we identified and verified potential new functions for this gene; specifically, NSF1 is a negative regulator for the expression of sulfur metabolism genes, the nuclear localization of Nsf1 protein (Nsf1p) is controlled in a sulfur-dependent manner, and the transcription of NSF1 is regulated by Met4p, an important transcriptional activator of sulfur metabolism genes. The inter-disciplinary approach adopted here highlighted the accuracy and relevancy of the ICC method in mining for novel gene functions using complex microarray datasets with a limited number of samples. PMID:24130853
Sertoli cell-specific ablation of miR-17-92 cluster significantly alters whole testis transcriptome without apparent phenotypic effects.

PubMed

Hurtado, Alicia; Real, Francisca M; Palomino, Rogelio; Carmona, Francisco David; Burgos, Miguel; Jiménez, Rafael; Barrionuevo, Francisco J

2018-01-01

MicroRNAs are frequently organized into polycistronic clusters whose transcription is controlled by a single promoter. The miR-17-92 cluster is expressed in most embryonic and postnatal organs. It is a potent oncogene associated with several types of cancer and it is involved in several important developmental processes. In the testis, expression of the miR-17-92 cluster in the germ cells is necessary to maintain normal spermatogenesis. This cluster is also expressed in Sertoli cells (the somatic cells of the seminiferous tubules), which require miRNAs for correct cell development and survival. To study the possible role of miR-17-92 in Sertoli cell development and function and to overcome the postnatal lethality of miR-17-92-/- mice, we conditionally deleted it in embryonic Sertoli cells shortly after the sex determination stage using an Amh-Cre allele. Mutant mice developed apparently normal testes and were fertile, but their testis transcriptomes contained hundreds of moderately deregulated genes, indicating that testis homeostasis is tightly controlled in mammals and that miR-17-92 expression in Sertoli cells contributes to maintaining normal gene expression levels, but is unnecessary for testis development and function. Our results show that significant deregulation of hundreds of genes might have no functional consequences.
Tumour-associated and non-tumour-associated microbiota in colorectal cancer

PubMed Central

Flemer, Burkhardt; Lynch, Denise B; Brown, Jillian M R; Jeffery, Ian B; Ryan, Feargal J; Claesson, Marcus J; O'Riordain, Micheal; Shanahan, Fergus; O'Toole, Paul W

2017-01-01

Objective A signature that unifies the colorectal cancer (CRC) microbiota across multiple studies has not been identified. In addition to methodological variance, heterogeneity may be caused by both microbial and host response differences, which was addressed in this study. Design We prospectively studied the colonic microbiota and the expression of specific host response genes using faecal and mucosal samples (‘ON’ and ‘OFF’ the tumour, proximal and distal) from 59 patients undergoing surgery for CRC, 21 individuals with polyps and 56 healthy controls. Microbiota composition was determined by 16S rRNA amplicon sequencing; expression of host genes involved in CRC progression and immune response was quantified by real-time quantitative PCR. Results The microbiota of patients with CRC differed from that of controls, but alterations were not restricted to the cancerous tissue. Differences between distal and proximal cancers were detected and faecal microbiota only partially reflected mucosal microbiota in CRC. Patients with CRC can be stratified based on higher level structures of mucosal-associated bacterial co-abundance groups (CAGs) that resemble the previously formulated concept of enterotypes. Of these, Bacteroidetes Cluster 1 and Firmicutes Cluster 1 were in decreased abundance in CRC mucosa, whereas Bacteroidetes Cluster 2, Firmicutes Cluster 2, Pathogen Cluster and Prevotella Cluster showed increased abundance in CRC mucosa. CRC-associated CAGs were differentially correlated with the expression of host immunoinflammatory response genes. Conclusions CRC-associated microbiota profiles differ from those in healthy subjects and are linked with distinct mucosal gene-expression profiles. Compositional alterations in the microbiota are not restricted to cancerous tissue and differ between distal and proximal cancers. PMID:26992426
Differential gene expression profiles of peripheral blood mononuclear cells in childhood asthma.

PubMed

Kong, Qian; Li, Wen-Jing; Huang, Hua-Rong; Zhong, Ying-Qiang; Fang, Jian-Pei

2015-05-01

Asthma is a common childhood disease with strong genetic components. This study compared whole-genome expression differences between asthmatic young children and healthy controls to identify gene signatures of childhood asthma. Total RNA extracted from peripheral blood mononuclear cells (PBMC) was subjected to microarray analysis. QRT-PCR was performed to verify the microarray results. Classification and functional characterization of differential genes were illustrated by hierarchical clustering and gene ontology analysis. Multiple logistic regression (MLR) analysis, receiver operating characteristic (ROC) curve analysis, and discriminatory power were used to screen for asthma-specific diagnostic markers. At thresholds of fold-change > 2 and p < 0.05, there were 758 named differential genes. The QRT-PCR results confirmed the array data. Hierarchical clustering divided 29 highly possible genes into seven categories, and the genes in the same cluster were likely to possess similar expression patterns or functions. Gene ontology analysis showed that differential genes were primarily enriched in immune response, response to stress or stimulus, and regulation of apoptosis in the biological process category. MLR and ROC curve analysis revealed that the combination of ADAM33, Smad7, and LIGHT possessed excellent discriminating power. The combination of ADAM33, Smad7, and LIGHT would be a reliable and useful childhood asthma model for prediction and diagnosis.
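The selection rule quoted in this abstract (fold-change > 2 and p < 0.05) can be illustrated with a short Python filter. This is a sketch under stated assumptions, not the study's pipeline: the function name `differential_genes`, the record layout, the symmetric treatment of up- and down-regulation, and the toy values are all hypothetical.

```python
def differential_genes(records, min_fold=2.0, max_p=0.05):
    """Return names of genes passing both thresholds.

    records: iterable of (name, case_mean, control_mean, p_value),
    where the means are positive, linear-scale expression values.
    """
    selected = []
    for name, case, control, p in records:
        # Assumption: a gene counts as differential whether it is
        # up- or down-regulated, so the larger ratio is used.
        fold = max(case / control, control / case)
        if fold > min_fold and p < max_p:
            selected.append(name)
    return selected


# Hypothetical measurements (gene names and values are illustrative).
toy = [
    ("geneA", 10.0, 4.0, 0.01),   # fold 2.5, significant
    ("geneB", 5.0, 4.0, 0.001),   # fold 1.25, too small
    ("geneC", 2.0, 9.0, 0.20),    # fold 4.5, not significant
    ("geneD", 3.0, 9.0, 0.03),    # fold 3.0 (down), significant
]
selected = differential_genes(toy)
print(selected)  # ['geneA', 'geneD']
```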
Familial aggregation analysis of gene expressions

PubMed Central

Rao, Shao-Qi; Xu, Liang-De; Zhang, Guang-Mei; Li, Xia; Li, Lin; Shen, Gong-Qing; Jiang, Yang; Yang, Yue-Ying; Gong, Bin-Sheng; Jiang, Wei; Zhang, Fan; Xiao, Yun; Wang, Qing K

2007-01-01

Traditional studies of familial aggregation are aimed at defining the genetic (and non-genetic) causes of a disease from physiological or clinical traits. However, there has been little attempt to use genome-wide gene expressions, the direct phenotypic measures of genes, as the traits to investigate several extended issues regarding the distributions of familially aggregated genes on chromosomes or in functions. In this study we conducted a genome-wide familial aggregation analysis by using the in vitro cell gene expressions of 3300 human autosome genes (Problem 1 data provided to Genetic Analysis Workshop 15) in order to answer three basic genetics questions. First, we investigated how gene expressions aggregate among different types (degrees) of relative pairs. Second, we conducted a bioinformatics analysis of highly familially aggregated genes to see how they are distributed on chromosomes. Third, we performed a gene ontology enrichment test of familially aggregated genes to find evidence to support their functional consensus. The results indicated that 1) gene expressions did aggregate in families, especially between sibs. Of 3300 human genes analyzed, there were a total of 1105 genes with one or more significant (empirical p < 0.05) familial correlation; 2) there were several genomic hot spots where highly familially aggregated genes (e.g., the chromosome 6 HLA genes cluster) were clustered; 3) as we expected, gene ontology enrichment tests revealed that the 1105 genes were aggregating not only in families but also in functional categories. PMID:18466548
Ancient Expansion of the Hox Cluster in Lepidoptera Generated Four Homeobox Genes Implicated in Extra-Embryonic Tissue Formation

PubMed Central

Taylor, William R.; Gibbs, Melanie; Breuker, Casper J.; Holland, Peter W. H.

2014-01-01

Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of the large clade Ditrysia; Shx are not found in a caddisfly and a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendant Lepidoptera except the silkmoth, which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks. PMID:25340822
Biosynthetic pathway for γ-cyclic sarcinaxanthin in Micrococcus luteus: heterologous expression and evidence for diverse and multiple catalytic functions of C(50) carotenoid cyclases.

PubMed

Netzer, Roman; Stafsnes, Marit H; Andreassen, Trygve; Goksøyr, Audun; Bruheim, Per; Brautaset, Trygve

2010-11-01

We report the cloning and characterization of the biosynthetic gene cluster (crtE, crtB, crtI, crtE2, crtYg, crtYh, and crtX) of the γ-cyclic C(50) carotenoid sarcinaxanthin in Micrococcus luteus NCTC2665. Expression of the complete and partial gene cluster in Escherichia coli hosts revealed that sarcinaxanthin biosynthesis from the precursor molecule farnesyl pyrophosphate (FPP) proceeds via C(40) lycopene, C(45) nonaflavuxanthin, C(50) flavuxanthin, and C(50) sarcinaxanthin. Glucosylation of sarcinaxanthin was accomplished by the crtX gene product. This is the first report describing the biosynthetic pathway of a γ-cyclic C(50) carotenoid. Expression of the corresponding genes from the marine M. luteus isolate Otnes7 in a lycopene-producing E. coli host resulted in the production of up to 2.5 mg/g cell dry weight sarcinaxanthin in shake flasks. In an attempt to experimentally understand the specific difference between the biosynthetic pathways of sarcinaxanthin and the structurally related ε-cyclic decaprenoxanthin, we constructed a hybrid gene cluster with the γ-cyclic C(50) carotenoid cyclase genes crtYg and crtYh from M. luteus replaced with the analogous ε-cyclic C(50) carotenoid cyclase genes crtYe and crtYf from the natural decaprenoxanthin producer Corynebacterium glutamicum. Surprisingly, expression of this hybrid gene cluster in an E. coli host resulted in accumulation of not only decaprenoxanthin, but also sarcinaxanthin and the asymmetric ε- and γ-cyclic C(50) carotenoid sarprenoxanthin, described for the first time in this work. Together, these data contributed to new insight into the diverse and multiple functions of bacterial C(50) carotenoid cyclases as key catalysts for the synthesis of structurally different carotenoids.
Activation of the alpha-globin gene expression correlates with dramatic upregulation of nearby non-globin genes and changes in local and large-scale chromatin spatial structure.

PubMed

Ulianov, Sergey V; Galitsyna, Aleksandra A; Flyamer, Ilya M; Golov, Arkadiy K; Khrameeva, Ekaterina E; Imakaev, Maxim V; Abdennur, Nezar A; Gelfand, Mikhail S; Gavrilov, Alexey A; Razin, Sergey V

2017-07-11

In homeotherms, the alpha-globin gene clusters are located within permanently open genome regions enriched in housekeeping genes. Terminal erythroid differentiation results in dramatic upregulation of alpha-globin genes making their expression comparable to the rRNA transcriptional output. Little is known about the influence of the erythroid-specific alpha-globin gene transcription outburst on adjacent, widely expressed genes and large-scale chromatin organization. Here, we have analyzed the total transcription output, the overall chromatin contact profile, and CTCF binding within the 2.7 Mb segment of chicken chromosome 14 harboring the alpha-globin gene cluster in cultured lymphoid cells and cultured erythroid cells before and after induction of terminal erythroid differentiation. We found that, similarly to mammalian genomes, the chicken genome is organized in TADs and compartments. Full activation of the alpha-globin gene transcription in differentiated erythroid cells is correlated with upregulation of several adjacent housekeeping genes and the emergence of abundant intergenic transcription. An extended chromosome region encompassing the alpha-globin cluster becomes significantly decompacted in differentiated erythroid cells, and depleted in CTCF binding and CTCF-anchored chromatin loops, while the sub-TAD harboring alpha-globin gene cluster and the upstream major regulatory element (MRE) becomes highly enriched with chromatin interactions as compared to lymphoid and proliferating erythroid cells. The alpha-globin gene domain and the neighboring loci reside within the A-like chromatin compartment in both lymphoid and erythroid cells and become further segregated from the upstream gene desert upon terminal erythroid differentiation. Our findings demonstrate that the effects of tissue-specific transcription activation are not restricted to the host genomic locus but affect the overall chromatin structure and transcriptional output of the encompassing topologically associating domain.
Online Analytical Processing (OLAP): A Fast and Effective Data Mining Tool for Gene Expression Databases

PubMed Central

2005-01-01

Gene expression databases contain a wealth of information, but current data mining tools are limited in their speed and effectiveness in extracting meaningful biological knowledge from them. Online analytical processing (OLAP) can be used as a supplement to cluster analysis for fast and effective data mining of gene expression databases. We used Analysis Services 2000, a product that ships with SQLServer2000, to construct an OLAP cube that was used to mine a time series experiment designed to identify genes associated with resistance of soybean to the soybean cyst nematode, a devastating pest of soybean. The data for these experiments is stored in the soybean genomics and microarray database (SGMD). A number of candidate resistance genes and pathways were found. Compared to traditional cluster analysis of gene expression data, OLAP was more effective and faster in finding biologically meaningful information. OLAP is available from a number of vendors and can work with any relational database management system through OLE DB. PMID:16046824
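The roll-up operation at the heart of an OLAP cube can be sketched in a few lines. The example below is only an illustration under stated assumptions: the fact table, the dimension names and the function `roll_up` are hypothetical, and it does not use Analysis Services, OLE DB or the SGMD schema.

```python
from collections import defaultdict

# Hypothetical fact table: (gene, time point in hours, genotype, expression).
FACTS = [
    ("geneA", 6, "resistant", 2.0),
    ("geneA", 6, "susceptible", 1.0),
    ("geneA", 12, "resistant", 4.0),
    ("geneA", 12, "susceptible", 1.0),
    ("geneB", 6, "resistant", 1.0),
    ("geneB", 6, "susceptible", 1.0),
    ("geneB", 12, "resistant", 1.0),
    ("geneB", 12, "susceptible", 3.0),
]
DIMENSIONS = ("gene", "time", "genotype")  # assumed dimension names


def roll_up(facts, keep):
    """Average the expression measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        sums[key] += row[3]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Collapse the time dimension: mean expression per gene and genotype.
by_gene_genotype = roll_up(FACTS, ("gene", "genotype"))
```

Keeping fewer dimensions in `keep` gives a coarser summary; keeping all three returns the original cells, which is the drill-down direction.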
Determining Physical Mechanisms of Gene Expression Regulation from Single Cell Gene Expression Data.

PubMed

Ezer, Daphne; Moignard, Victoria; Göttgens, Berthold; Adryan, Boris

2016-08-01

Many genes are expressed in bursts, which can contribute to cell-to-cell heterogeneity. It is now possible to measure this heterogeneity with high throughput single cell gene expression assays (single cell qPCR and RNA-seq). These experimental approaches generate gene expression distributions which can be used to estimate the kinetic parameters of gene expression bursting, namely the rate that genes turn on, the rate that genes turn off, and the rate of transcription. We construct a complete pipeline for the analysis of single cell qPCR data that uses the mathematics behind bursty expression to develop more accurate and robust algorithms for analyzing the origin of heterogeneity in experimental samples, specifically an algorithm for clustering cells by their bursting behavior (Simulated Annealing for Bursty Expression Clustering, SABEC) and a statistical tool for comparing the kinetic parameters of bursty expression across populations of cells (Estimation of Parameter changes in Kinetics, EPiK). We applied these methods to hematopoiesis, including a new single cell dataset in which transcription factors (TFs) involved in the earliest branchpoint of blood differentiation were individually up- and down-regulated. We could identify two unique sub-populations within a seemingly homogenous group of hematopoietic stem cells. In addition, we could predict regulatory mechanisms controlling the expression levels of eighteen key hematopoietic transcription factors throughout differentiation. Detailed information about gene regulatory mechanisms can therefore be obtained simply from high throughput single cell gene expression data, which should be widely applicable given the rapid expansion of single cell genomics.
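The three kinetic parameters named above (switch-on rate, switch-off rate, transcription rate) can be made concrete with a small stochastic simulation of a two-state promoter. This is a minimal sketch, not the SABEC or EPiK algorithms: the function name, the parameter names and the first-order mRNA decay rate `k_deg` are assumptions added so that the simulated counts reach a steady state.

```python
import random


def simulate_telegraph(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Gillespie simulation of a two-state gene; returns the time-averaged mRNA count."""
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, False, 0
    weighted_sum = 0.0
    while t < t_end:
        rates = [
            0.0 if gene_on else k_on,   # promoter switches on
            k_off if gene_on else 0.0,  # promoter switches off
            k_tx if gene_on else 0.0,   # transcription while the gene is on
            k_deg * mrna,               # mRNA decay (assumed, not in the abstract)
        ]
        total = sum(rates)
        # Waiting time to the next event, truncated at the end of the run.
        dt = min(rng.expovariate(total), t_end - t)
        weighted_sum += mrna * dt
        t += dt
        if t >= t_end:
            break
        pick = rng.random() * total
        if pick < rates[0]:
            gene_on = True
        elif pick < rates[0] + rates[1]:
            gene_on = False
        elif pick < rates[0] + rates[1] + rates[2]:
            mrna += 1
        else:
            mrna = max(mrna - 1, 0)
    return weighted_sum / t_end


mean_mrna = simulate_telegraph(k_on=1.0, k_off=1.0, k_tx=20.0, k_deg=1.0, t_end=2000.0, seed=1)
```

Under this model the expected mean count is (k_tx / k_deg) × k_on / (k_on + k_off), so the long-run average for the parameters above should lie near 10.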
Comparative RNA-sequencing of the acarbose producer Actinoplanes sp. SE50/110 cultivated in different growth media.

PubMed

Schwientek, Patrick; Wendler, Sergej; Neshat, Armin; Eirich, Christina; Rückert, Christian; Klein, Andreas; Wehmeier, Udo F; Kalinowski, Jörn; Stoye, Jens; Pühler, Alfred

2013-08-20

Actinoplanes sp. SE50/110 is known as the producer of the alpha-glucosidase inhibitor acarbose, a potent drug in the treatment of type-2 diabetes mellitus. We conducted the first whole transcriptome analysis of Actinoplanes sp. SE50/110, using RNA-sequencing technology for comparative gene expression studies between cells grown in maltose minimal medium, maltose minimal medium with trace elements, and glucose complex medium. We first studied the behavior of Actinoplanes sp. SE50/110 cultivations in these three media and found that the different media had a significant impact on growth rate and in particular on acarbose production. It was demonstrated that Actinoplanes sp. SE50/110 grew well in all three media, but acarbose biosynthesis was only observed in cultures grown in maltose minimal medium with and without trace elements. When comparing the expression profiles between the maltose minimal media with and without trace elements, only a few significantly differentially expressed genes were found, which mainly code for uptake systems of metal ions provided in the trace element solution. In contrast, the comparison of expression profiles from maltose minimal medium and glucose complex medium revealed a large number of differentially expressed genes, of which the most conspicuous genes account for iron storage and uptake. Furthermore, the acarbose gene cluster was found to be highly expressed in maltose-containing media and almost silent in the glucose-containing medium. In addition, a putative antibiotic biosynthesis gene cluster was found to be similarly expressed as the acarbose cluster. Copyright © 2012 Elsevier B.V. All rights reserved.
Developmental Progression in the Coral Acropora digitifera Is Controlled by Differential Expression of Distinct Regulatory Gene Networks

PubMed Central

Reyes-Bermudez, Alejandro; Villar-Briones, Alejandro; Ramirez-Portilla, Catalina; Hidaka, Michio; Mikheyev, Alexander S.

2016-01-01

Corals belong to the most basal class of the Phylum Cnidaria, which is considered the sister group of bilaterian animals, and thus have become an emerging model to study the evolution of developmental mechanisms. Although cell renewal, differentiation, and maintenance of pluripotency are cellular events shared by multicellular animals, the cellular basis of these fundamental biological processes is still poorly understood. To understand how changes in gene expression regulate morphogenetic transitions at the base of the eumetazoa, we performed quantitative RNA-seq analysis during Acropora digitifera’s development. We collected embryonic, larval, and adult samples to characterize stage-specific transcription profiles, as well as broad expression patterns. Transcription profiles reconstructed development revealing two main expression clusters. The first cluster grouped blastula and gastrula and the second grouped subsequent developmental time points. Consistently, we observed clear differences in gene expression between early and late developmental transitions, with higher numbers of differentially expressed genes and fold changes around gastrulation. Furthermore, we identified three coexpression clusters that represented discrete gene expression patterns. During early transitions, transcriptional networks seemed to regulate cellular fate and morphogenesis of the larval body. In late transitions, these networks seemed to play important roles preparing planulae for a switch in lifestyle and regulation of adult processes. Although developmental progression in A. digitifera is regulated to some extent by differential coexpression of well-defined gene networks, stage-specific transcription profiles appear to be independent entities. While negative regulation of transcription is predominant in early development, cell differentiation was upregulated in larval and adult stages. PMID:26941230
Analysis of gene expression levels in individual bacterial cells without image segmentation.

PubMed

Kwak, In Hae; Son, Minjun; Hagen, Stephen J

2012-05-11

Studies of stochasticity in gene expression typically make use of fluorescent protein reporters, which permit the measurement of expression levels within individual cells by fluorescence microscopy. Analysis of such microscopy images is almost invariably based on a segmentation algorithm, where the image of a cell or cluster is analyzed mathematically to delineate individual cell boundaries. However segmentation can be ineffective for studying bacterial cells or clusters, especially at lower magnification, where outlines of individual cells are poorly resolved. Here we demonstrate an alternative method for analyzing such images without segmentation. The method employs a comparison between the pixel brightness in phase contrast vs fluorescence microscopy images. By fitting the correlation between phase contrast and fluorescence intensity to a physical model, we obtain well-defined estimates for the different levels of gene expression that are present in the cell or cluster. The method reveals the boundaries of the individual cells, even if the source images lack the resolution to show these boundaries clearly. Copyright © 2012 Elsevier Inc. All rights reserved.
Global hypomethylation and promoter methylation in small intestinal neuroendocrine tumors: an in vivo and in vitro study.

PubMed

Fotouhi, Omid; Adel Fahmideh, Maral; Kjellman, Magnus; Sulaiman, Luqman; Höög, Anders; Zedenius, Jan; Hashemi, Jamileh; Larsson, Catharina

2014-07-01

Aberrant DNA methylation is a feature of human cancer affecting gene expression and tumor phenotype. Here, we quantified promoter methylation of candidate genes and global methylation in 44 small intestinal-neuroendocrine tumors (SI-NETs) from 33 patients by pyrosequencing. Findings were compared with gene expression, patient outcome and known tumor copy number alterations. Promoter methylation was observed for WIF1, RASSF1A, CTNNB1, CXCL14, NKX2-3, P16, LAMA1, and CDH1. By contrast APC, CDH3, HIC1, P14, SMAD2, and SMAD4 only had low levels of methylation. WIF1 methylation was significantly increased (P = 0.001) and WIF1 expression was reduced in SI-NETs vs. normal references (P = 0.003). WIF1, NKX2-3, and CXCL14 expression was reduced in metastases vs. primary tumors (P<0.02). Low expression of RASSF1A and P16 were associated with poor overall survival (P = 0.045 and P = 0.011, respectively). Global methylation determined by pyrosequencing of LINE1 repeats was reduced in tumors vs. normal references, and was associated with loss in chromosome 18. The tumors fell into three clusters with enrichment of WIF1 methylation and LINE1 hypomethylation in Cluster I and RASSF1A and CTNNB1 methylation and loss in 16q in Cluster II. In Cluster III, these alterations were low-abundant and NKX2-3 methylation was low. Similar analyses in the SI-NET cell lines HC45 and CNDT2 showed methylation for CDH1 and WIF1 and/or P16, CXCL14, NKX2-3, LAMA1, and CTNNB1. Treatment with the demethylating agent 5-azacytidine reduced DNA methylation and increased expression of these genes in vitro. In conclusion, promoter methylation of tumor suppressor genes is associated with suppressed gene expression and DNA copy number alterations in SI-NETs, and may be restored in vitro.
A high resolution atlas of gene expression in the domestic sheep (Ovis aries)

PubMed Central

Farquhar, Iseabail L.; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G.; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C. Bruce; Freeman, Tom C.; Archibald, Alan L.; Hume, David A.

2017-01-01

Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of ‘guilt by association’ was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages. PMID:28915238
A high resolution atlas of gene expression in the domestic sheep (Ovis aries).

PubMed

Clark, Emily L; Bush, Stephen J; McCulloch, Mary E B; Farquhar, Iseabail L; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G; Wu, Chunlei; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C Bruce; Freeman, Tom C; Summers, Kim M; Archibald, Alan L; Hume, David A

2017-09-01

Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of 'guilt by association' was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages.
An enhanced deterministic K-Means clustering algorithm for cancer subtype prediction from gene expression data.

PubMed

Nidheesh, N; Abdul Nazeer, K A; Ameer, P M

2017-12-01

Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.
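The key idea stated above, choosing initial centroids that lie in dense regions and are adequately separated, can be illustrated with a short deterministic sketch. It is not the authors' published procedure: the neighbour-count density estimate, the parameters `radius` and `min_sep`, and the function name are all assumptions made for illustration.

```python
import math


def dense_separated_seeds(points, k, radius, min_sep):
    """Pick k initial centroids: densest points first, skipping any candidate
    closer than min_sep to an already chosen seed."""
    # Density of a point = number of points (itself included) within `radius`.
    density = [
        sum(1 for q in points if math.dist(p, q) <= radius)
        for p in points
    ]
    # Descending density; ties broken by index, so the result is deterministic.
    order = sorted(range(len(points)), key=lambda i: (-density[i], i))
    seeds = []
    for i in order:
        if all(math.dist(points[i], points[j]) >= min_sep for j in seeds):
            seeds.append(i)
        if len(seeds) == k:
            break
    return [points[i] for i in seeds]


# Two hypothetical groups of samples plus one isolated point.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (10.5, 10.5), (5, 5)]
seeds = dense_separated_seeds(POINTS, k=2, radius=1.0, min_sep=3.0)
```

Because no random number is drawn, repeated runs on the same data return the same seeds, which is the property the abstract emphasises; the seeds would then be passed to an ordinary K-Means iteration.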
Implementation of spectral clustering with partitioning around medoids (PAM) algorithm on microarray data of carcinoma

NASA Astrophysics Data System (ADS)

Cahyaningrum, Rosalia D.; Bustamam, Alhadi; Siswantining, Titin

2017-03-01

Microarray technology has become one of the essential tools in the life sciences for observing gene expression levels, including the expression of genes in people with carcinoma. Carcinoma is a cancer that forms in epithelial tissue. Such data can be analyzed to identify the expression of hereditary genes and to build classifications that can be used to improve the diagnosis of carcinoma. Microarray data are usually high-dimensional, so most methods require long computing times to perform the grouping. Therefore, this study uses the spectral clustering method, which can work with any kind of object and reduces the dimension of the data. Spectral clustering is based on the spectral decomposition of a matrix that represents the data in the form of a graph. After the data dimensions are reduced, the data are partitioned. One well-known partitioning method is Partitioning Around Medoids (PAM), which minimizes the objective function by iteratively exchanging non-medoid points with medoid points until convergence. The objective of this research is to implement spectral clustering and the PAM partitioning algorithm to obtain groups of 7457 genes with carcinoma based on their similarity values. The result of this study is two groups of genes with carcinoma.
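The swap step that PAM repeats until convergence can be sketched as follows. This is a simplified illustration that assumes a precomputed distance matrix; the spectral embedding step is omitted, the naive initialisation and the function names are hypothetical, and the code is not taken from the study.

```python
from itertools import product


def total_cost(dist, medoids):
    """Objective function: sum over all points of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Partitioning Around Medoids on a precomputed distance matrix."""
    n = len(dist)
    medoids = list(range(k))  # naive initialisation (an assumption)
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid point.
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(dist, trial)
            if cost < best:  # keep the swap only if the objective decreases
                medoids, best, improved = trial, cost, True
    # Label every point with the index of its nearest medoid.
    labels = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return sorted(medoids), labels


# Hypothetical one-dimensional data forming two obvious groups.
VALUES = [0, 1, 2, 10, 11, 12]
DIST = [[abs(a - b) for b in VALUES] for a in VALUES]
medoids, labels = pam(DIST, k=2)
```

In a pipeline like the one described, `DIST` would instead hold distances between genes in the reduced spectral space.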

Geometry of the Gene Expression Space of Individual Cells

PubMed Central

Korem, Yael; Szekely, Pablo; Hart, Yuval; Sheftel, Hila; Hausser, Jean; Mayo, Avi; Rothenberg, Michael E.; Kalisky, Tomer; Alon, Uri

2015-01-01

There is a revolution in the ability to analyze gene expression of single cells in a tissue. To understand this data we must comprehend how cells are distributed in a high-dimensional gene expression space. One open question is whether cell types form discrete clusters or whether gene expression forms a continuum of states. If such a continuum exists, what is its geometry? Recent theory on evolutionary trade-offs suggests that cells that need to perform multiple tasks are arranged in a polygon or polyhedron (line, triangle, tetrahedron and so on, generally called polytopes) in gene expression space, whose vertices are the expression profiles optimal for each task. Here, we analyze single-cell data from human and mouse tissues profiled using a variety of single-cell technologies. We fit the data to shapes with different numbers of vertices, compute their statistical significance, and infer their tasks. We find cases in which single cells fill out a continuum of expression states within a polyhedron. This occurs in intestinal progenitor cells, which fill out a tetrahedron in gene expression space. The four vertices of this tetrahedron are each enriched with genes for a specific task related to stemness and early differentiation. A polyhedral continuum of states is also found in spleen dendritic cells, known to perform multiple immune tasks: cells fill out a tetrahedron whose vertices correspond to key tasks related to maturation, pathogen sensing and communication with lymphocytes. A mixture of continuum-like distributions and discrete clusters is found in other cell types, including bone marrow and differentiated intestinal crypt cells. This approach can be used to understand the geometry and biological tasks of a wide range of single-cell datasets. The present results suggest that the concept of cell type may be expanded. In addition to discrete clusters in gene-expression space, we suggest a new possibility: a continuum of states within a polyhedron, in which the vertices represent specialists at key tasks. PMID:26161936
Regulation of notochord-specific expression of Ci-Bra downstream genes in Ciona intestinalis embryos.

PubMed

Takahashi, Hiroki; Hotta, Kohji; Takagi, Chiyo; Ueno, Naoto; Satoh, Nori; Shoguchi, Eiichi

2010-02-01

Brachyury, a T-box transcription factor, is expressed in ascidian embryos exclusively in primordial notochord cells and plays a pivotal role in differentiation of notochord cells. Previously, we identified approximately 450 genes downstream of Ciona intestinalis Brachyury (Ci-Bra), and characterized the expression profiles of 45 of these in differentiating notochord cells. In this study, we looked for cis-regulatory sequences in minimal enhancers of 20 Ci-Bra downstream genes by electroporating a region within approximately 3 kb upstream of each gene fused with lacZ. Eight of the 20 reporters were expressed in notochord cells. The minimal enhancer for each of these eight genes was narrowed to a region approximately 0.5-1.0 kb long. We also explored the genome-wide and coordinate regulation of 43 Ci-Bra downstream genes. When we determined their chromosomal localization, it became evident that they are not clustered in a given region of the genome, but rather distributed evenly over 13 of the 14 pairs of chromosomes, suggesting that gene clustering does not contribute to coordinate control of the Ci-Bra downstream gene expression. Our results might provide insights into the molecular mechanisms underlying notochord formation in chordates.
Time-course microarray analysis for identifying candidate genes involved in obesity-associated pathological changes in the mouse colon.

PubMed

Bae, Yun Jung; Kim, Sung-Eun; Hong, Seong Yeon; Park, Taesun; Lee, Sang Gyu; Choi, Myung-Sook; Sung, Mi-Kyung

2016-01-01

Obesity is known to increase the risk of colorectal cancer. However, mechanisms underlying the pathogenesis of obesity-induced colorectal cancer are not completely understood. The purposes of this study were to identify differentially expressed genes in the colon of mice with diet-induced obesity and to select candidate genes as early markers of obesity-associated abnormal cell growth in the colon. C57BL/6N mice were fed normal diet (11% fat energy) or high-fat diet (40% fat energy) and were euthanized at different time points. Genome-wide expression profiles of the colon were determined at 2, 4, 8, and 12 weeks. Cluster analysis was performed using expression data of genes showing log2 fold change of ≥1 or ≤-1 (twofold change), based on time-dependent expression patterns, followed by virtual network analysis. High-fat diet-fed mice showed significant increase in body weight and total visceral fat weight over 12 weeks. Time-course microarray analysis showed that 50, 47, 36, and 411 genes were differentially expressed at 2, 4, 8, and 12 weeks, respectively. Ten cluster profiles representing distinguishable patterns of genes differentially expressed over time were determined. Cluster 4, which consisted of genes showing the most significant alterations in expression in response to high-fat diet over 12 weeks, included Apoa4 (apolipoprotein A-IV), Ppap2b (phosphatidic acid phosphatase type 2B), Cel (carboxyl ester lipase), and Clps (colipase, pancreatic), which interacted strongly with surrounding genes associated with colorectal cancer or obesity. Our data indicate that Apoa4, Ppap2b, Cel, and Clps are candidate early marker genes associated with obesity-related pathological changes in the colon. Genome-wide analyses performed in the present study provide new insights on selecting novel genes that may be associated with the development of diseases of the colon.
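The selection criterion quoted above, a log2 fold change of at least 1 or at most -1, corresponds to a twofold change in either direction. A minimal sketch of that filter follows; the gene labels and expression values are hypothetical and are not data from the study.

```python
import math


def log2_fold_change(treated, control):
    """log2 of the ratio between the high-fat-diet and normal-diet signals."""
    return math.log2(treated / control)


def differentially_expressed(expr, threshold=1.0):
    """Keep genes whose absolute log2 fold change reaches the threshold.

    `expr` maps a gene label to (high-fat value, normal-diet value).
    """
    selected = {}
    for gene, (treated, control) in expr.items():
        lfc = log2_fold_change(treated, control)
        if abs(lfc) >= threshold:
            selected[gene] = lfc
    return selected


# Hypothetical signal intensities for three genes at one time point.
EXPR = {"gene1": (8.0, 2.0), "gene2": (3.0, 2.5), "gene3": (1.0, 4.0)}
hits = differentially_expressed(EXPR)
```

Here `gene1` is upregulated fourfold (log2 fold change 2), `gene3` is downregulated fourfold (log2 fold change -2), and `gene2` falls below the twofold cutoff and is dropped.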
Patterning C. elegans: homeotic cluster genes, cell fates and cell migrations.

PubMed

Salser, S J; Kenyon, C

1994-05-01

Despite its simple body form, the nematode C. elegans expresses homeotic cluster genes similar to those of insects and vertebrates in the patterning of many cell types and tissues along the anteroposterior axis. In the ventral nerve cord, these genes program spatial patterns of cell death, fusion, division and neurotransmitter production; in migrating cells they regulate the direction and extent of movement. Nematode development permits an analysis at the cellular level of how homeotic cluster genes interact to specify cell fates, and how cell behavior can be regulated to assemble an organism.
Tissue Gene Expression Analysis Using Arrayed Normalized cDNA Libraries

PubMed Central

Eickhoff, Holger; Schuchhardt, Johannes; Ivanov, Igor; Meier-Ewert, Sebastian; O'Brien, John; Malik, Arif; Tandon, Neeraj; Wolski, Eryk-Witold; Rohlfs, Elke; Nyarsik, Lajos; Reinhardt, Richard; Nietfeld, Wilfried; Lehrach, Hans

2000-01-01

We have used oligonucleotide-fingerprinting data on 60,000 cDNA clones from two different mouse embryonic stages to establish a normalized cDNA clone set. The normalized set of 5,376 clones represents different clusters and therefore, in almost all cases, different genes. The inserts of the cDNA clones were amplified by PCR and spotted on glass slides. The resulting arrays were hybridized with mRNA probes prepared from six different adult mouse tissues. Expression profiles were analyzed by hierarchical clustering techniques. We have chosen radioactive detection because it combines robustness with sensitivity and allows the comparison of multiple normalized experiments. Sensitive detection combined with highly effective clustering algorithms allowed the identification of tissue-specific expression profiles and the detection of genes specifically expressed in the tissues investigated. The obtained results are publicly available (http://www.rzpd.de) and can be used by other researchers as a digital expression reference. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AL360374–AL36537.] PMID:10958641
A novel polyketide biosynthesis gene cluster is involved in fruiting body morphogenesis in the filamentous fungi Sordaria macrospora and Neurospora crassa.

PubMed

Nowrousian, Minou

2009-04-01

During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.
Drug repositioning for orphan genetic diseases through Conserved Anticoexpressed Gene Clusters (CAGCs)

PubMed Central

2013-01-01

Background The development of new therapies for orphan genetic diseases represents an extremely important medical and social challenge. Drug repositioning, i.e. finding new indications for approved drugs, could be one of the most cost- and time-effective strategies to cope with this problem, at least in a subset of cases. Therefore, many computational approaches based on the analysis of high throughput gene expression data have so far been proposed to reposition available drugs. However, most of these methods require gene expression profiles directly relevant to the pathologic conditions under study, such as those obtained from patient cells and/or from suitable experimental models. In this work we have developed a new approach for drug repositioning, based on identifying known drug targets showing conserved anti-correlated expression profiles with human disease genes, which is completely independent from the availability of ‘ad hoc’ gene expression data-sets. Results By analyzing available data, we provide evidence that the genes displaying conserved anti-correlation with drug targets are antagonistically modulated in their expression by treatment with the relevant drugs. We then identified clusters of genes associated to similar phenotypes and showing conserved anticorrelation with drug targets. On this basis, we generated a list of potential candidate drug-disease associations. Importantly, we show that some of the proposed associations are already supported by independent experimental evidence. Conclusions Our results support the hypothesis that the identification of gene clusters showing conserved anticorrelation with drug targets can be an effective method for drug repositioning and provide a wide list of new potential drug-disease associations for experimental validation. PMID:24088245
Intracellular Growth Is Dependent on Tyrosine Catabolism in the Dimorphic Fungal Pathogen Penicillium marneffei

PubMed Central

Boyce, Kylie J.; McLauchlan, Alisha; Schreider, Lena; Andrianopoulos, Alex

2015-01-01

During infection, pathogens must utilise the available nutrient sources in order to grow while simultaneously evading or tolerating the host’s defence systems. Amino acids are an important nutritional source for pathogenic fungi and can be assimilated from host proteins to provide both carbon and nitrogen. The hpdA gene of the dimorphic fungus Penicillium marneffei, which encodes an enzyme which catalyses the second step of tyrosine catabolism, was identified as up-regulated in pathogenic yeast cells. As well as enabling the fungus to acquire carbon and nitrogen, tyrosine is also a precursor in the formation of two types of protective melanin; DOPA melanin and pyomelanin. Chemical inhibition of HpdA in P. marneffei inhibits ex vivo yeast cell production suggesting that tyrosine is a key nutrient source during infectious growth. The genes required for tyrosine catabolism, including hpdA, are located in a gene cluster and the expression of these genes is induced in the presence of tyrosine. A gene (hmgR) encoding a Zn(II)2-Cys6 binuclear cluster transcription factor is present within the cluster and is required for tyrosine induced expression and repression in the presence of a preferred nitrogen source. AreA, the GATA-type transcription factor which regulates the global response to limiting nitrogen conditions negatively regulates expression of cluster genes in the absence of tyrosine and is required for nitrogen metabolite repression. Deletion of the tyrosine catabolic genes in the cluster affects growth on tyrosine as either a nitrogen or carbon source and affects pyomelanin, but not DOPA melanin, production. In contrast to other genes of the tyrosine catabolic cluster, deletion of hpdA results in no growth within macrophages. This suggests that the ability to catabolise tyrosine is not required for macrophage infection and that HpdA has an additional novel role to that of tyrosine catabolism and pyomelanin production during growth in host cells. PMID:25812137
Characterization of Adelphocoris suturalis (Hemiptera: Miridae) Transcriptome from Different Developmental Stages

PubMed Central

Tian, Caihong; Tek Tay, Wee; Feng, Hongqiang; Wang, Ying; Hu, Yongmin; Li, Guoping

2015-01-01

Adelphocoris suturalis is one of the most serious pest insects of Bt cotton in China; however, its molecular genetics, biochemistry and physiology are poorly understood. We used a high-throughput sequencing platform to perform de novo transcriptome assembly and gene expression analyses across different developmental stages (eggs, 2nd and 5th instar nymphs, female and male adults). We obtained 20 GB of clean data and revealed 88,614 unigenes, including 23,830 clusters and 64,784 singletons. These unigene sequences were annotated and classified by Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes databases. A large number of differentially expressed genes were discovered through pairwise comparisons between these developmental stages. Gene expression profiles differed dramatically across life-stage transitions, with some of the most differentially expressed genes being associated with sex difference, metabolism and development. Quantitative real-time PCR results confirmed the deep-sequencing findings based on relative expression levels of nine randomly selected genes. Furthermore, over 791,390 single nucleotide polymorphisms and 2,682 potential simple sequence repeats were identified. Our study provides comprehensive transcriptional gene expression information for A. suturalis that will form the basis for a better understanding of developmental pathways, hormone biosynthesis, sex differences and wing formation in mirid bugs. PMID:26047353
Dose-related gene expression changes in forebrain following acute, low-level chlorpyrifos exposure in neonatal rats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ray, Anamika; Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK 74078; Liu Jing

2010-10-15

Chlorpyrifos (CPF) is a widely used organophosphorus insecticide (OP) and putative developmental neurotoxicant in humans. The acute toxicity of CPF is elicited by acetylcholinesterase (AChE) inhibition. We characterized dose-related (0.1, 0.5, 1 and 2 mg/kg) gene expression profiles and changes in cell signaling pathways 24 h following acute CPF exposure in 7-day-old rats. Microarray experiments indicated that approximately 9% of the 44,000 genes were differentially expressed following at least one of the four CPF dosages studied (546, 505, 522, and 3,066 genes with 0.1, 0.5, 1.0 and 2.0 mg/kg CPF). Genes were grouped according to dose-related expression patterns using K-means clustering, while gene networks and canonical pathways were evaluated using Ingenuity Pathway Analysis®. Twenty clusters were identified and differential expression of selected genes was verified by RT-PCR. The four largest clusters (each containing from 276 to 905 genes) constituted over 50% of all differentially expressed genes and exhibited up-regulation following exposure to the highest dosage (2 mg/kg CPF). The total number of gene networks affected by CPF also rose sharply with the highest dosage of CPF (18, 16, 18 and 50 with 0.1, 0.5, 1 and 2 mg/kg CPF). Forebrain cholinesterase (ChE) activity was significantly reduced (26%) only in the highest dosage group. Based on magnitude of dose-related changes in differentially expressed genes, relative numbers of gene clusters and signaling networks affected, and forebrain ChE inhibition only at 2 mg/kg CPF, we focused subsequent analyses on this treatment group. Six canonical pathways were identified that were significantly affected by 2 mg/kg CPF (MAPK, oxidative stress, NF-κB, mitochondrial dysfunction, arylhydrocarbon receptor and adrenergic receptor signaling). Evaluation of different cellular functions of the differentially expressed genes suggested changes related to olfactory receptors, cell adhesion/migration, synapse/synaptic transmission and transcription/translation. Nine genes were differentially affected in all four CPF dosing groups. We conclude that the most robust, consistent changes in differential gene expression in neonatal forebrain across a range of acute CPF dosages occurred at an exposure level associated with the classical marker of OP toxicity, AChE inhibition. Disruption of multiple cellular pathways, in particular cell adhesion, may contribute to the developmental neurotoxicity potential of this pesticide.
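Grouping genes by dose-related expression pattern with K-means, as the record above describes, can be sketched as follows. This is a plain Lloyd-style K-means over per-gene profiles (one value per dosage). The farthest-first seeding, the choice of k = 2, and the toy profile values are assumptions made for a reproducible illustration. The abstract does not state the software or settings the authors used.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, max_iter=100):
    """Cluster expression profiles into k groups; returns (labels, centroids)."""
    # Deterministic farthest-first seeding (an illustrative choice):
    # start from the first profile, then repeatedly add the profile
    # that lies farthest from all centroids chosen so far.
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each gene joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical log fold changes at 0.1, 0.5, 1 and 2 mg/kg.
profiles = [
    [0.0, 0.1, 0.0, 2.0],  # up-regulated only at the highest dosage
    [0.1, 0.0, 0.1, 2.2],
    [0.0, 0.0, 0.2, 1.8],
    [1.0, 1.1, 0.9, 1.0],  # similar change at every dosage
    [0.9, 1.0, 1.0, 1.1],
]
labels, centroids = kmeans(profiles, k=2)
```

Here the first three genes fall into one cluster (a response only at 2 mg/kg) and the last two into another (a dose-independent response), which mirrors the kind of pattern separation the abstract reports at much larger scale.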
Prediction of epigenetically regulated genes in breast cancer cell lines.

PubMed

Loss, Leandro A; Sadanandam, Anguraj; Durinck, Steffen; Nautiyal, Shivani; Flaucher, Diane; Carlton, Victoria E H; Moorhead, Martin; Lu, Yontao; Gray, Joe W; Faham, Malek; Spellman, Paul; Parvin, Bahram

2010-06-04

Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. Our two-step dimensionality reduction compressed the original data by roughly 90%, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
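The permutation step of the pipeline above can be sketched in a few lines. Note the substitution: the authors rank associations with logistic regression, whereas this sketch uses a Pearson correlation as the test statistic to stay short. The function names, the number of permutations, the seed, and the toy values are all assumptions for illustration.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def permutation_pvalue(methylation, expression, n_perm=2000, seed=0):
    """One-sided permutation p-value for a negative association.

    Shuffles expression values across cell lines and counts how often
    the shuffled correlation is at least as negative as the observed one.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(methylation, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical methylation levels and expression values for 8 cell lines.
methylation = [0.90, 0.80, 0.85, 0.10, 0.20, 0.15, 0.70, 0.30]
expression = [1.0, 1.5, 1.2, 6.0, 5.5, 5.8, 2.0, 4.9]
r, p = permutation_pvalue(methylation, expression)
```

A strongly negative `r` with a small `p` is the kind of methylation-expression relationship the record describes for its 58 reported genes. With many candidate associations, a multiple-testing correction would also be needed, which this sketch omits.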
Gene expression profiling reveals two separate mechanisms regulating apoptosis in rectal carcinomas in vivo

PubMed Central

de Bruin, Elza C.; van de Pas, Simone; van de Velde, Cornelis J. H.; van Krieken, J. Han J. M.; Peltenburg, Lucy T. C.; Marijnen, Corrie A. M.

2007-01-01

The level of apoptosis in rectal carcinomas of patients treated by surgery only predicts local failure; patients with intrinsically high-apoptotic tumors develop fewer local recurrences than patients with low levels of apoptosis. To identify genes involved in this intrinsic apoptotic process in vivo, 47 rectal tumors with known apoptotic phenotype (24 low- and 23 high-apoptotic) were analyzed by oligonucleotide microarray technology. We identified several genes differentially expressed between low- and high-apoptotic tumors. Unsupervised clustering of the tumors based on expression levels of these genes separated the low-apoptotic from the high-apoptotic tumors, indicating a gene expression-dependent regulation. In addition, this clustering revealed two subgroups of high-apoptotic tumors. One high-apoptotic subgroup showed subtle differences in mRNA and protein expression of the known apoptotic regulators BAX, cIAP2 and ARC compared to the low-apoptotic tumors. The other subgroup of high-apoptotic tumors showed high expression of immune-related genes: predominantly HLA class II and chemokines, but HLA class I and interferon-inducible genes were also highly expressed. Immunohistochemistry revealed HLA-DR expression in epithelial tumor cells in 70% of these high-apoptotic tumors. The expression data suggest that high levels of apoptosis in rectal carcinoma patients can be the result of either slightly altered expression of known pro- and anti-apoptotic genes or high expression of immune-related genes. Electronic supplementary material: The online version of this article (doi: 10.1007/s10495-007-0088-2) contains supplementary material, which is available to authorized users. PMID:17610066
Transcriptome-Wide Changes in Chlamydomonas reinhardtii Gene Expression Regulated by Carbon Dioxide and the CO2-Concentrating Mechanism Regulator CIA5/CCM1

PubMed Central

Fang, Wei; Si, Yaqing; Douglass, Stephen; Casero, David; Merchant, Sabeeha S.; Pellegrini, Matteo; Ladunga, Istvan; Liu, Peng; Spalding, Martin H.

2012-01-01

We used RNA sequencing to query the Chlamydomonas reinhardtii transcriptome for regulation by CO2 and by the transcription regulator CIA5 (CCM1). Both CO2 and CIA5 are known to play roles in acclimation to low CO2 and in induction of an essential CO2-concentrating mechanism (CCM), but less is known about their interaction and impact on the whole transcriptome. Our comparison of the transcriptome of a wild type versus a cia5 mutant strain under three different CO2 conditions, high CO2 (5%), low CO2 (0.03 to 0.05%), and very low CO2 (<0.02%), provided an entry into global changes in the gene expression patterns occurring in response to the interaction between CO2 and CIA5. We observed a massive impact of CIA5 and CO2 on the transcriptome, affecting almost 25% of all Chlamydomonas genes, and we discovered an array of gene clusters with distinctive expression patterns that provide insight into the regulatory interaction between CIA5 and CO2. Several individual clusters respond primarily to either CIA5 or CO2, providing access to genes regulated by one factor but decoupled from the other. Three distinct clusters clearly associated with CCM-related genes may represent a rich source of candidates for new CCM components, including a small cluster of genes encoding putative inorganic carbon transporters. PMID:22634760
Metabolic engineering of Pseudomonas putida for production of docosahexaenoic acid based on a myxobacterial PUFA synthase.

PubMed

Gemperlein, Katja; Zipf, Gregor; Bernauer, Hubert S; Müller, Rolf; Wenzel, Silke C

2016-01-01

Long-chain polyunsaturated fatty acids (LC-PUFAs) can be produced de novo via polyketide synthase-like enzymes known as PUFA synthases, which are encoded by pfa biosynthetic gene clusters originally discovered from marine microorganisms. Recently similar gene clusters were detected and characterized in terrestrial myxobacteria revealing several striking differences. As the identified myxobacterial producers are difficult to handle genetically and grow very slowly we aimed to establish heterologous expression platforms for myxobacterial PUFA synthases. Here we report the heterologous expression of the pfa gene cluster from Aetherobacter fasciculatus (SBSr002) in the phylogenetically distant model host bacteria Escherichia coli and Pseudomonas putida. The latter host turned out to be the more promising PUFA producer revealing higher production rates of n-6 docosapentaenoic acid (DPA) and docosahexaenoic acid (DHA). After several rounds of genetic engineering of expression plasmids combined with metabolic engineering of P. putida, DHA production yields were eventually increased more than threefold. Additionally, we applied synthetic biology approaches to redesign and construct artificial versions of the A. fasciculatus pfa gene cluster, which to the best of our knowledge represents the first example of a polyketide-like biosynthetic gene cluster modulated and synthesized for P. putida. Combination with the engineering efforts described above led to a further increase in LC-PUFA production yields. The established production platform based on synthetic DNA now sets the stage for flexible engineering of the complex PUFA synthase. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Transcriptional Coupling of Neighboring Genes and Gene Expression Noise: Evidence that Gene Orientation and Noncoding Transcripts Are Modulators of Noise

PubMed Central

Wang, Guang-Zhong; Lercher, Martin J.; Hurst, Laurence D.

2011-01-01

How is noise in gene expression modulated? Do mechanisms of noise control impact genome organization? In yeast, the expression of one gene can affect that of a very close neighbor. As the effect is highly regionalized, we hypothesize that genes in different orientations will have differing degrees of coupled expression and, in turn, different noise levels. Divergently organized gene pairs, in particular those with bidirectional promoters, have close promoters, maximizing the likelihood that expression of one gene affects the neighbor. With more distant promoters, the same is less likely to hold for gene pairs in nondivergent orientation. Stochastic models suggest that coupled chromatin dynamics will typically result in low abundance-corrected noise (ACN). We thus hypothesize that transcription of noncoding RNA (ncRNA) from a bidirectional promoter is a noise-reducing, expression-priming mechanism. The hypothesis correctly predicts that protein-coding genes with a bidirectional promoter, including those with a ncRNA partner, have lower ACN than other genes and that divergent gene pairs uniquely have correlated ACN. Moreover, as predicted, ACN increases with the distance between promoters. The model also correctly predicts that ncRNA transcripts are often divergently transcribed from genes that a priori would be under selection for low noise (essential genes, protein complex genes) and that the latter genes should commonly reside in divergent orientation. Likewise, as expected, genes with bidirectional promoters are rare in subtelomeric regions, cluster together, and are enriched in essential gene clusters. We conclude that gene orientation and transcription of ncRNAs are candidate modulators of noise. PMID:21402863
Soybean Fe-S cluster biosynthesis regulated by external iron or phosphate fluctuation.

PubMed

Qin, Lu; Wang, Meihuan; Chen, Liyu; Liang, Xuejiao; Wu, Zhigeng; Lin, Zhihao; Zuo, Jia; Feng, Xiangyang; Zhao, Jing; Liao, Hong; Ye, Hong

2015-03-01

Iron and phosphorus are essential for soybean nodulation. Our results suggest that deficiency of Fe or P impairs nodulation by affecting the assembly of functional iron-sulfur clusters via different mechanisms. Iron (Fe) and phosphorus (P) are important mineral nutrients for soybean and are indispensable for nodulation. However, it remains unclear how the pathways of Fe metabolism respond to fluctuations in external Fe or P. Iron is required for iron-sulfur (Fe-S) cluster assembly in higher plants. Here, we investigated the expression pattern of Fe-S cluster biosynthesis genes in nodulated soybean. The soybean genome encodes 42 putative Fe-S cluster biosynthesis genes, which were expressed differently in shoots and roots, suggesting physiological relevance. Nodules were initiated from soybean roots after rhizobia inoculation. The iron concentration in nodules was three times higher than that in shoots. The Fe-S cluster biosynthesis genes were activated and several Fe-S protein activities were increased in nodules, indicating that nodulation is accompanied by more effective Fe-S cluster biosynthesis. Under Fe deficiency, Fe-S cluster biosynthesis genes were massively repressed and some Fe-S protein activities were decreased in nodules, leading to tiny nodules. Notably, P deficiency induced a similar Fe-deficiency response in nodules, i.e., loss of certain Fe-S enzyme activities and tiny nodules. However, distinct from Fe-deficient nodules, the P-deficiency-treated nodules accumulated a higher iron concentration and their Fe-S cluster biosynthesis genes were not suppressed. Taken together, our results show that both Fe deficiency and P deficiency impair nodulation, but they possibly affect the assembly of Fe-S clusters via different mechanisms. The data also suggest that Fe-S cluster biosynthesis likely links Fe metabolism and P metabolism in root and nodule cells of soybean.
DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions.

PubMed

El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A

2007-01-01

We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.
The cytochrome P450 2AA gene cluster in zebrafish (Danio rerio): Expression of CYP2AA1 and CYP2AA2 and response to phenobarbital-type inducers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kubota, Akira; Bainy, Afonso C.D.; Departamento de Bioquímica, CCB, Universidade Federal de Santa Catarina, Florianopolis, SC 88040-900

2013-10-01

The cytochrome P450 (CYP) 2 gene family is the largest and most diverse CYP gene family in vertebrates. In zebrafish, we have identified 10 genes in a new subfamily, CYP2AA, which does not show orthology to any human or other mammalian CYP genes. Here we report evolutionary and structural relationships of the 10 CYP2AA genes and expression of the first two genes, CYP2AA1 and CYP2AA2. Parsimony reconstruction of the tandem duplication pattern for the CYP2AA cluster suggests that CYP2AA1, CYP2AA2 and CYP2AA3 likely arose in the earlier duplication events and thus are most diverged in function from the other CYP2AAs. On the other hand, CYP2AA8 and CYP2AA9 are genes that arose in the latest duplication event, implying functional similarity between these two CYPs. A molecular model of CYP2AA1 showing the sequence conservation across the CYP2AA cluster reveals that the regions with the highest variability within the cluster map onto CYP2AA1 near the substrate access channels, suggesting differing substrate specificities. Zebrafish CYP2AA1 transcript was expressed predominantly in the intestine, while CYP2AA2 was most highly expressed in the kidney, suggesting differing roles in physiology. In the liver, CYP2AA2 expression, but not that of CYP2AA1, was increased by 1,4-bis[2-(3,5-dichloropyridyloxy)]benzene (TCPOBOP) and, to a lesser extent, by phenobarbital (PB). In contrast, pregnenolone 16α-carbonitrile (PCN) increased CYP2AA1 expression, but not CYP2AA2 expression, in the liver. The results identify a CYP2 subfamily in zebrafish that includes genes apparently induced by PB-type chemicals and PXR agonists, the first concrete in vivo evidence for a PB-type response in fish.
Highlights:
• A tandemly duplicated cluster of ten CYP2AA genes was described in zebrafish.
• Parsimony and duplication analyses suggest pathways to CYP2AA diversity.
• Homology models reveal amino acid positions possibly related to functional diversity.
• The CYP2AA locus does not share synteny with any CYP2 subfamily in mammals.
• Induction of CYP2AA1 and CYP2AA2 indicates a phenobarbital-type response in fish.
Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

2004-08-06

The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function, such as binding-site clustering, makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
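Searching a genome for regions with a high density of predicted binding sites, the starting point of the record above, can be sketched with a sliding window. The function below reports merged regions in which at least `min_sites` predicted sites fall within `window` base pairs. The window size, the site threshold, and the coordinates are hypothetical and are not the parameters used in the study.

```python
def find_site_clusters(positions, window, min_sites):
    """Return merged (start, end) regions dense in predicted binding sites.

    positions: genomic coordinates of predicted sites (any order)
    window:    maximum span in bp that a qualifying group may cover
    min_sites: minimum number of sites required within one window
    """
    sites = sorted(positions)
    regions = []
    left = 0
    for right in range(len(sites)):
        # Advance the left edge until the group spans at most `window` bp.
        while sites[right] - sites[left] > window:
            left += 1
        if right - left + 1 >= min_sites:
            start, end = sites[left], sites[right]
            if regions and start <= regions[-1][1]:
                # Overlaps the previous dense region, so extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions

# Hypothetical site coordinates on one chromosome arm.
predicted_sites = [100, 150, 220, 300, 380, 5000, 9000, 9050, 9120, 9400, 20000]
clusters = find_site_clusters(predicted_sites, window=500, min_sites=4)
```

With these toy coordinates the function returns two regions, (100, 380) and (9000, 9400), while the isolated sites at 5000 and 20000 are ignored. Testing whether the same clustering is conserved in a second genome, the key result of the study, would require running the search on aligned regions of both species.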

Mining subspace clusters from DNA microarray data using large itemset techniques.

PubMed

Chang, Ye-In; Chen, Jiun-Rung; Tsai, Yueh-Chi

2009-05-01

Mining subspace clusters from DNA microarrays could help researchers identify genes that commonly contribute to a disease, where a subspace cluster is a subset of genes whose expression levels are similar under a subset of conditions. Because the number of genes in a DNA microarray is far larger than the number of conditions, previously proposed algorithms that compute the maximum dimension sets (MDSs) for every pair of genes take a long time to mine subspace clusters. In this article, we propose the Large Itemset-Based Clustering (LISC) algorithm for mining subspace clusters. Instead of constructing MDSs for every pair of genes, we construct MDSs only for every pair of conditions. Then, we transform the task of finding the maximal possible gene sets into the problem of mining large itemsets from the condition-pair MDSs. Since we are only interested in subspace clusters with gene sets as large as possible, it is desirable to focus on gene sets that have reasonably large support values in the condition-pair MDSs. Our simulation results show that the proposed algorithm needs less processing time than the previously proposed algorithms that construct gene-pair MDSs.
Comparative interrogation of the developing xylem transcriptomes of two wood-forming species: Populus trichocarpa and Eucalyptus grandis.

PubMed

Hefer, Charles A; Mizrachi, Eshchar; Myburg, Alexander A; Douglas, Carl J; Mansfield, Shawn D

2015-06-01

Wood formation is a complex developmental process governed by genetic and environmental stimuli. Populus and Eucalyptus are fast-growing, high-yielding tree genera that represent ecologically and economically important species suitable for generating significant lignocellulosic biomass. Comparative analysis of the developing xylem and leaf transcriptomes of Populus trichocarpa and Eucalyptus grandis together with phylogenetic analyses identified clusters of homologous genes preferentially expressed during xylem formation in both species. A conserved set of 336 single gene pairs showed highly similar xylem preferential expression patterns, as well as evidence of high functional constraint. Individual members of multi-gene orthologous clusters known to be involved in secondary cell wall biosynthesis also showed conserved xylem expression profiles. However, species-specific expression as well as opposite (xylem versus leaf) expression patterns observed for a subset of genes suggest subtle differences in the transcriptional regulation important for xylem development in each species. Using sequence similarity and gene expression status, we identified functional homologs likely to be involved in xylem developmental and biosynthetic processes in Populus and Eucalyptus. Our study suggests that, while genes involved in secondary cell wall biosynthesis show high levels of gene expression conservation, differential regulation of some xylem development genes may give rise to unique xylem properties. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Genome mining-directed activation of a silent angucycline biosynthetic gene cluster in Streptomyces chattanoogensis.

PubMed

Zhou, Zhenxing; Xu, Qingqing; Bu, Qingting; Guo, Yuanyang; Liu, Shuiping; Liu, Yu; Du, Yiling; Li, Yongquan

2015-02-09

Genomic sequencing of actinomycetes has revealed the presence of numerous gene clusters seemingly capable of natural product biosynthesis, yet most clusters are cryptic under laboratory conditions. Bioinformatics analysis of the completely sequenced genome of Streptomyces chattanoogensis L10 (CGMCC 2644) revealed a silent angucycline biosynthetic gene cluster. The overexpression of a pathway-specific activator gene under the constitutive ermE* promoter successfully triggered the expression of the angucycline biosynthetic genes. Two novel members of the angucycline antibiotic family, chattamycins A and B, were further isolated and elucidated. Biological activity assays demonstrated that chattamycin B possesses good antitumor activities against human cancer cell lines and moderate antibacterial activities. The results presented here provide a feasible method to activate silent angucycline biosynthetic gene clusters to discover potential new drug leads. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The regulatory network of cluster-root function and development in phosphate-deficient white lupin (Lupinus albus) identified by transcriptome sequencing.

PubMed

Wang, Zhengrui; Straub, Daniel; Yang, Huaiyu; Kania, Angelika; Shen, Jianbo; Ludewig, Uwe; Neumann, Günter

2014-07-01

Lupinus albus serves as a model plant for root-induced mobilization of sparingly soluble soil phosphates via the formation of cluster-roots (CRs) that mediate secretion of protons, citrate, phenolics and acid phosphatases (APases). This study employed next-generation sequencing to investigate the molecular mechanisms behind these complex adaptive responses at the transcriptome level. We compared different stages of CR development, including the pre-emergent (PE), juvenile (JU) and mature (MA) stages. The results confirmed that the primary metabolism underwent significant modifications during CR maturation, promoting the biosynthesis of organic acids, as had been deduced from physiological studies. Citrate catabolism was downregulated, associated with citrate accumulation in MA clusters. Upregulation of the phenylpropanoid pathway reflected the accumulation of phenolics. Specific transcript expression of ALMT and MATE transporter genes correlated with the exudation of citrate and flavonoids. The expression of transcripts related to nucleotide degradation and APases in MA clusters coincided with the re-mobilization and hydrolysis of organic phosphate resources. Most interestingly, hormone-related gene expression suggested a central role of ethylene during CR maturation. This was associated with the upregulation of the iron (Fe)-deficiency regulated network that mediates ethylene-induced expression of Fe-deficiency responses in other species. Finally, transcripts related to abscisic acid and jasmonic acid were upregulated in MA clusters, while auxin- and brassinosteroid-related genes and cytokinin receptors were most strongly expressed during CR initiation. Key regulations proposed by the RNA-seq data were confirmed by quantitative real-time polymerase chain reaction (RT-qPCR) and some physiological analyses. A model for the gene network regulating CR development and function is presented. © 2014 Scandinavian Plant Physiology Society.
Molecular codes for neuronal individuality and cell assembly in the brain

PubMed Central

Yagi, Takeshi

2012-01-01

The brain contains an enormous, but finite, number of neurons. The ability of this limited number of neurons to produce nearly limitless neural information over a lifetime is typically explained by combinatorial explosion; that is, by the exponential amplification of each neuron's contribution through its incorporation into “cell assemblies” and neural networks. In development, each neuron expresses diverse cellular recognition molecules that permit the formation of the appropriate neural cell assemblies to elicit various brain functions. The mechanism for generating neuronal assemblies and networks must involve molecular codes that give neurons individuality and allow them to recognize one another and join appropriate networks. The extensive molecular diversity of cell-surface proteins on neurons is likely to contribute to their individual identities. The clustered protocadherins (Pcdh) are a large subfamily within the diverse cadherin superfamily. The clustered Pcdh genes are encoded in tandem by three gene clusters, and are present in all known vertebrate genomes. The set of clustered Pcdh genes is expressed in a random and combinatorial manner in each neuron. In addition, cis-tetramers composed of heteromultimeric clustered Pcdh isoforms represent selective binding units for cell-cell interactions. Here I present the mathematical probabilities for neuronal individuality based on the random and combinatorial expression of clustered Pcdh isoforms and their formation of cis-tetramers in each neuron. Notably, clustered Pcdh gene products are known to play crucial roles in correct axonal projections, synaptic formation, and neuronal survival. Their molecular and biological features lead to the hypothesis that the diverse clustered Pcdh molecules provide the molecular code by which neuronal individuality and cell assembly permit the combinatorial explosion of networks that supports the enormous processing capability and plasticity of the brain. PMID:22518100
Enhancers and super-enhancers have an equivalent regulatory role in embryonic stem cells through regulation of single or multiple genes

PubMed Central

Moorthy, Sakthi D.; Davidson, Scott; Shchuka, Virlana M.; Singh, Gurdeep; Malek-Gilani, Nakisa; Langroudi, Lida; Martchenko, Alexandre; So, Vincent; Macpherson, Neil N.; Mitchell, Jennifer A.

2017-01-01

Transcriptional enhancers are critical for maintaining cell-type–specific gene expression and driving cell fate changes during development. Highly transcribed genes are often associated with a cluster of individual enhancers such as those found in locus control regions. Recently, these have been termed stretch enhancers or super-enhancers, which have been predicted to regulate critical cell identity genes. We employed a CRISPR/Cas9-mediated deletion approach to study the function of several enhancer clusters (ECs) and isolated enhancers in mouse embryonic stem (ES) cells. Our results reveal that the effect of deleting ECs, also classified as ES cell super-enhancers, is highly variable, resulting in target gene expression reductions ranging from 12% to as much as 92%. Partial deletions of these ECs which removed only one enhancer or a subcluster of enhancers revealed partially redundant control of the regulated gene by multiple enhancers within the larger cluster. Many highly transcribed genes in ES cells are not associated with a super-enhancer; furthermore, super-enhancer predictions ignore 81% of the potentially active regulatory elements predicted by cobinding of five or more pluripotency-associated transcription factors. Deletion of these additional enhancer regions revealed their robust regulatory role in gene transcription. In addition, select super-enhancers and enhancers were identified that regulated clusters of paralogous genes. We conclude that, whereas robust transcriptional output can be achieved by an isolated enhancer, clusters of enhancers acting on a common target gene act in a partially redundant manner to fine tune transcriptional output of their target genes. PMID:27895109
Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.

PubMed

Tothill, Richard W; Tinker, Anna V; George, Joshy; Brown, Robert; Fox, Stephen B; Lade, Stephen; Johnson, Daryl S; Trivett, Melanie K; Etemadmoghadam, Dariush; Locandro, Bianca; Traficante, Nadia; Fereday, Sian; Hung, Jillian A; Chiew, Yoke-Eng; Haviv, Izhak; Gertig, Dorota; DeFazio, Anna; Bowtell, David D L

2008-08-15

This study aimed to identify novel molecular subtypes of ovarian cancer by gene expression profiling with linkage to clinical and pathologic features. Microarray gene expression profiling was done on 285 serous and endometrioid tumors of the ovary, peritoneum, and fallopian tube. K-means clustering was applied to identify robust molecular subtypes. Statistical analysis identified differentially expressed genes, pathways, and gene ontologies. Laser capture microdissection, pathology review, and immunohistochemistry validated the array-based findings. Patient survival within k-means groups was evaluated using Cox proportional hazards models. Class prediction validated k-means groups in an independent dataset. A semisupervised survival analysis of the array data was used to compare against unsupervised clustering results. Optimal clustering of array data identified six molecular subtypes. Two subtypes represented predominantly serous low malignant potential and low-grade endometrioid subtypes, respectively. The remaining four subtypes represented higher grade and advanced stage cancers of serous and endometrioid morphology. A novel subtype of high-grade serous cancers reflected a mesenchymal cell type, characterized by overexpression of N-cadherin and P-cadherin and low expression of differentiation markers, including CA125 and MUC1. A poor prognosis subtype was defined by a reactive stroma gene expression signature, correlating with extensive desmoplasia in such samples. A similar poor prognosis signature could be found using a semisupervised analysis. Each subtype displayed distinct levels and patterns of immune cell infiltration. Class prediction identified similar subtypes in an independent ovarian dataset with similar prognostic trends. Gene expression profiling identified molecular subtypes of ovarian cancer of biological and clinical importance.
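As an illustration of the k-means step mentioned in this abstract, the following is a minimal sketch of Lloyd's algorithm on toy expression profiles. It is not the study's pipeline: the deterministic farthest-first initialization, the function names, and the toy values are all assumptions made for the example, and the abstract does not say how the clustering was initialized or how the number of clusters was chosen.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(profiles, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first profile, then repeatedly add the profile
    farthest from all centroids chosen so far."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        nxt = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(profiles, k, n_iter=100):
    """Lloyd's k-means: alternate assignment and centroid update."""
    centroids = farthest_first(profiles, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Hypothetical toy data: six samples, three genes, two obvious groups.
toy_profiles = [
    [0.1, 0.2, 0.0], [0.0, 0.1, 0.2], [0.2, 0.0, 0.1],
    [5.0, 5.1, 4.9], [4.9, 5.0, 5.2], [5.1, 4.8, 5.0],
]
toy_labels, toy_centroids = kmeans(toy_profiles, k=2)
print(toy_labels)
```

On real microarray data the number of clusters would have to be selected by some stability or validity criterion, which this sketch leaves out.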
Expression level of miRNAs on chromosome 14q32.31 region correlates with tumor aggressiveness and survival of glioblastoma patients.

PubMed

Shahar, Tal; Granit, Avital; Zrihan, Daniel; Canello, Tamar; Charbit, Hanna; Einstein, Ofira; Rozovski, Uri; Elgavish, Sharona; Ram, Zvi; Siegal, Tali; Lavon, Iris

2016-12-01

The 54 microRNAs (miRNAs) within the DLK-DIO3 genomic region on chromosome 14q32.31 (cluster-14-miRNAs) are organized into sub-clusters 14A and 14B. These miRNAs are downregulated in glioblastomas and might have a tumor suppressive role. Any association between the expression levels of cluster-14-miRNAs and overall survival (OS) is undetermined. We randomly selected miR-433, belonging to sub-cluster 14A, and miR-323a-3p and miR-369-3p, belonging to sub-cluster 14B, and assessed their role in glioblastomas in vitro and in vivo. We also determined the expression level of cluster-14-miRNAs in 27 patients with newly diagnosed glioblastoma, and analyzed the association between their level of expression and OS. Overexpression of miR-323a-3p and miR-369-3p, but not miR-433, in glioblastoma cells inhibited their proliferation and migration in vitro. Mice implanted with glioblastoma cells overexpressing miR-323a-3p and miR-369-3p, but not miR-433, exhibited prolonged survival compared to controls (P = .003). Bioinformatics analysis identified 13 putative target genes of cluster-14-miRNAs, and real-time RT-PCR validated these findings. Pathway analysis of the putative target genes identified neuregulin as the most enriched pathway. The expression level of cluster-14-miRNAs correlated with patients' OS. The median OS was 8.5 months for patients with low expression levels and 52.7 months for patients with high expression levels (HR 0.34; 95% CI 0.12-0.59, P = .003). The expression level of cluster-14-miRNAs correlates directly with OS, suggesting a role for this cluster in promoting aggressive behavior of glioblastoma, possibly through ErbB/neuregulin signaling.
Wide distribution of O157-antigen biosynthesis gene clusters in Escherichia coli.

PubMed

Iguchi, Atsushi; Shirai, Hiroki; Seto, Kazuko; Ooka, Tadasuke; Ogura, Yoshitoshi; Hayashi, Tetsuya; Osawa, Kayo; Osawa, Ro

2011-01-01

Most Escherichia coli O157-serogroup strains are classified as enterohemorrhagic E. coli (EHEC), which is known as an important food-borne pathogen for humans. They usually produce Shiga toxin (Stx) 1 and/or Stx2, and express H7-flagella antigen (or are nonmotile). However, O157 strains that do not produce Stxs and express H antigens different from H7 are sometimes isolated from clinical and other sources. Multilocus sequence analysis revealed that the 21 O157:non-H7 strains tested in this study belong to multiple evolutionary lineages different from that of EHEC O157:H7 strains, suggesting a wide distribution of the gene set encoding the O157-antigen biosynthesis in multiple lineages. To gain insight into the gene organization and the sequence similarity of the O157-antigen biosynthesis gene clusters, we conducted genomic comparisons of the chromosomal regions (about 59 kb in each strain) covering the O-antigen gene cluster and its flanking regions between six O157:H7/non-H7 strains. Gene organization of the O157-antigen gene cluster was identical among O157:H7/non-H7 strains, but was divided into two distinct types at the nucleotide sequence level. Interestingly, distribution of the two types did not clearly follow the evolutionary lineages of the strains, suggesting that horizontal gene transfer of both types of O157-antigen gene clusters has occurred independently among E. coli strains. Additionally, detailed sequence comparison revealed that some positions of the repetitive extragenic palindromic (REP) sequences in the regions flanking the O-antigen gene clusters were coincident with possible recombination points. From these results, we conclude that the horizontal transfer of the O157-antigen gene clusters induced the emergence of multiple O157 lineages within E. coli and speculate that REP sequences may be one of the driving forces for exchange and evolution of O-antigen loci.
Temporal Changes in Gene Expression after Injury in the Rat Retina

PubMed Central

Vázquez-Chona, Félix; Song, Bong K.; Geisert, Eldon E.

2010-01-01

Purpose The goal of this study was to define the temporal changes in gene expression after retinal injury and to relate these changes to the inflammatory and reactive response. A specific emphasis was placed on the tetraspanin family of proteins and their relationship with markers of reactive gliosis. Methods Retinal tears were induced in adult rats by scraping the retina with a needle. After different survival times (4 hours, and 1, 3, 7, and 30 days), the retinas were removed, and mRNA was isolated, prepared, and hybridized to the Affymetrix RGU34A microarray (Santa Clara, CA). Microarray results were confirmed by using RT-PCR and correlation to protein levels was determined. Results Of the 8750 genes analyzed, approximately 393 (4.5%) were differentially expressed. Clustering analysis revealed three major profiles: (1) The early response was characterized by the upregulation of transcription factors; (2) the delayed response included a high percentage of genes related to cell cycle and cell death; and (3) the late, sustained profile clustered a significant number of genes involved in retinal gliosis. The late, sustained cluster also contained the upregulated crystallin genes. The tetraspanins Cd9, Cd81, and Cd82 were also associated with the late, sustained response. Conclusions The use of microarray technology enables definition of complex genetic changes underlying distinct phases of the cellular response to retinal injury. The early response clusters genes associated with the transcriptional regulation of the wound-healing process and cell death. Most of the genes in the late, sustained response appear to be associated with reactive gliosis. PMID:15277499
HlyU Is a Positive Regulator of Hemolysin Expression in Vibrio anguillarum

PubMed Central

Li, Ling; Mou, Xiangyu; Nelson, David R.

2011-01-01

The two hemolysin gene clusters previously identified in Vibrio anguillarum, the vah1 cluster and the rtxACHBDE cluster, are responsible for the hemolytic and cytotoxic activities of V. anguillarum in fish. In this study, we used degenerate PCR to identify a positive hemolysin regulatory gene, hlyU, from the unsequenced V. anguillarum genome. The hlyU gene of V. anguillarum encodes a 92-amino-acid protein and is highly homologous to other bacterial HlyU proteins. An hlyU mutant was constructed, which exhibited an ∼5-fold decrease in hemolytic activity on sheep blood agar with no statistically significant decrease in cytotoxicity of the wild-type strain. Complementation of the hlyU mutation restored both hemolytic activity and cytotoxic activity. Both semiquantitative reverse transcription-PCR (RT-PCR) and quantitative real-time RT-PCR (qRT-PCR) were used to examine expression of the hemolysin genes under exponential and stationary-phase conditions in wild-type, hlyU mutant, and hlyU complemented strains. Compared to the wild-type strain, expression of rtx genes decreased in the hlyU mutant, while expression of vah1 and plp was not affected in the hlyU mutant. Complementation of the hlyU mutation restored expression of the rtx genes and increased vah1 and plp expression to levels higher than those in the wild type. The transcriptional start sites in both the vah1-plp and rtxH-rtxB genes' intergenic regions were determined using 5′ random amplification of cDNA ends (5′-RACE), and the binding sites for purified HlyU were discovered using DNA gel mobility shift experiments and DNase protection assays. PMID:21764937
eMBI: Boosting Gene Expression-based Clustering for Cancer Subtypes

PubMed Central

Chang, Zheng; Wang, Zhenjia; Ashby, Cody; Zhou, Chuan; Li, Guojun; Zhang, Shuzhong; Huang, Xiuzhen

2014-01-01

Identifying clinically relevant subtypes of a cancer using gene expression data is a challenging and important problem in medicine, and is a necessary premise to provide specific and efficient treatments for patients of different subtypes. Matrix factorization provides a solution by finding checkerboard patterns in the matrices of gene expression data. In the context of gene expression profiles of cancer patients, these checkerboard patterns correspond to genes that are up- or down-regulated in patients with particular cancer subtypes. Recently, a new matrix factorization framework for biclustering called Maximum Block Improvement (MBI) was proposed; however, it still suffers from several problems when applied to cancer gene expression data analysis. In this study, we developed many effective strategies to improve MBI and designed a new program called enhanced MBI (eMBI), which is more effective and efficient at identifying cancer subtypes. Our tests on several gene expression profiling datasets of cancer patients consistently indicate that eMBI achieves significant improvements in comparison with MBI, in terms of cancer subtype prediction accuracy, robustness, and running time. In addition, the performance of eMBI is much better than another widely used matrix factorization method called nonnegative matrix factorization (NMF) and the method of hierarchical clustering, which is often the first choice of clinical analysts in practice. PMID:25374455
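The abstract names nonnegative matrix factorization (NMF) as a widely used comparator. The sketch below shows one standard way to compute it, the Lee-Seung multiplicative updates for the squared-error objective, on a toy genes-by-patients matrix with a checkerboard-like block structure. It does not implement MBI or eMBI, whose details are not given here; the helper names, the random initialization, and the toy matrix are assumptions of the example.

```python
import random


def matmul(a, b):
    """Dense matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative genes x samples matrix V into W (genes x rank)
    and H (rank x samples) using multiplicative updates."""
    rng = random.Random(seed)
    n_genes, n_samples = len(v), len(v[0])
    # Strictly positive random start (an assumption of this sketch).
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_genes)]
    h = [[rng.random() + 0.1 for _ in range(n_samples)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n_samples)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n_genes)]
    return w, h


# Hypothetical toy matrix: genes 0-1 are "up" in patients 0-1,
# genes 2-3 are "up" in patients 2-3.
toy_v = [
    [5.0, 5.0, 0.0, 0.0],
    [5.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 3.0, 3.0],
]
w0, h0 = nmf(toy_v, rank=2, n_iter=0)   # initial factors only
w_fit, h_fit = nmf(toy_v, rank=2)       # after 200 updates
print(frob_error(toy_v, w0, h0), frob_error(toy_v, w_fit, h_fit))
```

In a subtype analysis, each patient (column of H) would typically be assigned to the factor with the largest coefficient; whether that recovers the true blocks depends on the data and the starting point.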
An EST-based analysis identifies new genes and reveals distinctive gene expression features of Coffea arabica and Coffea canephora

PubMed Central

2011-01-01

Background Coffee is one of the world's most important crops; it is consumed worldwide and plays a significant role in the economy of producing countries. Coffea arabica and C. canephora are responsible for 70 and 30% of commercial production, respectively. C. arabica is an allotetraploid from a recent hybridization of the diploid species, C. canephora and C. eugenioides. C. arabica has lower genetic diversity and results in a higher quality beverage than C. canephora. Research initiatives have been launched to produce genomic and transcriptomic data about Coffea spp. as a strategy to improve breeding efficiency. Results Assembling the expressed sequence tags (ESTs) of C. arabica and C. canephora produced by the Brazilian Coffee Genome Project and the Nestlé-Cornell Consortium revealed 32,007 clusters of C. arabica and 16,665 clusters of C. canephora. We detected different GC3 profiles between these species that are related to their genome structure and mating system. BLAST analysis revealed similarities between coffee and grape (Vitis vinifera) genes. Using KA/KS analysis, we identified coffee genes under purifying and positive selection. Protein domain and gene ontology analyses suggested differences between Coffea spp. data, mainly in relation to complex sugar synthases and nucleotide binding proteins. OrthoMCL was used to identify specific and prevalent coffee protein families when compared to five other plant species. Among the interesting families annotated are new cystatins, glycine-rich proteins and RALF-like peptides. Hierarchical clustering was used to independently group C. arabica and C. canephora expression clusters according to expression data extracted from EST libraries, resulting in the identification of differentially expressed genes. Based on these results, we emphasize gene annotation and discuss plant defenses, abiotic stress and cup quality-related functional categories. 
Conclusion We present the first comprehensive genome-wide transcript profile study of C. arabica and C. canephora, which can be freely accessed by the scientific community at http://www.lge.ibi.unicamp.br/coffea. Our data reveal the presence of species-specific/prevalent genes in coffee that may help to explain particular characteristics of these two crops. The identification of differentially expressed transcripts offers a starting point for the correlation between gene expression profiles and Coffea spp. developmental traits, providing valuable insights for coffee breeding and biotechnology, especially concerning sugar metabolism and stress tolerance. PMID:21303543
The effects of graded levels of calorie restriction: VII. Topological rearrangement of hypothalamic aging networks

PubMed Central

Derous, Davina; Mitchell, Sharon E.; Green, Cara L.; Wang, Yingchun; Han, Jing Dong J.; Chen, Luonan; Promislow, Daniel E.L.; Lusseau, David; Speakman, John R.; Douglas, Alex

2016-01-01

Connectivity in a gene-gene network declines with age, typically within gene clusters. We explored the effect of short-term (3 months) graded calorie restriction (CR) (up to 40%) on the network structure of aging-associated genes in the murine hypothalamus by using conditional mutual information. The networks showed a topological rearrangement when exposed to graded CR, with a higher relative within-cluster connectivity at 40CR. We observed changes in gene centrality concordant with changes in CR level, with Ppargc1a and Ppt1 having increased centrality and Etfdh, Traf3 and Abcc1 decreased centrality as CR increased. This change in gene centrality in a graded manner with CR occurred in the absence of parallel changes in gene expression levels. This study emphasizes the importance of augmenting traditional differential gene expression analyses to better understand structural changes in the transcriptome. Overall, our results suggested that CR induced changes in the centrality of biologically relevant genes that play an important role in preventing the age-associated loss of network integrity irrespective of their gene expression levels. PMID:27115072
A novel strategy of integrated microarray analysis identifies CENPA, CDK1 and CDC20 as a cluster of diagnostic biomarkers in lung adenocarcinoma.

PubMed

Liu, Wan-Ting; Wang, Yang; Zhang, Jing; Ye, Fei; Huang, Xiao-Hui; Li, Bin; He, Qing-Yu

2018-07-01

Lung adenocarcinoma (LAC) is the most lethal cancer and the leading cause of cancer-related death worldwide. The identification of meaningful clusters of co-expressed genes or representative biomarkers may help improve the accuracy of LAC diagnoses. Public databases, such as the Gene Expression Omnibus (GEO), provide rich resources of valuable information for clinics; however, the integration of multiple microarray datasets from various platforms and institutes remains a challenge. To determine potential indicators of LAC, we performed genome-wide relative significance (GWRS), genome-wide global significance (GWGS) and support vector machine (SVM) analyses progressively to identify robust gene biomarker signatures from 5 different microarray datasets that included 330 samples. The top 200 genes with robust signatures were selected for integrative analysis according to "guilt-by-association" methods, including protein-protein interaction (PPI) analysis and gene co-expression analysis. Of these 200 genes, only 10 genes showed both an intensive PPI network and high gene co-expression correlation (r > 0.8). IPA analysis of these regulatory networks suggested that the cell cycle process is a crucial determinant of LAC. CENPA, as well as two linked hub genes, CDK1 and CDC20, were determined to be potential indicators of LAC. Immunohistochemical staining showed that CENPA, CDK1 and CDC20 were highly expressed in LAC cancer tissue with co-expression patterns. A Cox regression model indicated that LAC patients with CENPA+/CDK1+ and CENPA+/CDC20+ were high-risk groups in terms of overall survival. In conclusion, our integrated microarray analysis demonstrated that CENPA, CDK1 and CDC20 might serve as a novel cluster of prognostic biomarkers for LAC, and the cooperative unit of three genes provides a technically simple approach for identification of LAC patients. Copyright © 2018 Elsevier B.V. All rights reserved.
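The co-expression criterion quoted in this abstract (r > 0.8) can be sketched as a pairwise Pearson correlation filter. The example below assumes Pearson correlation, which the abstract does not state explicitly, and uses made-up gene labels and expression values purely for illustration; it is not the authors' integration procedure.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed_pairs(expr, threshold=0.8):
    """Return {(gene1, gene2): r} for every gene pair with r above threshold."""
    pairs = {}
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if r > threshold:
            pairs[(g1, g2)] = r
    return pairs


# Hypothetical expression values across four samples.
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.5],   # rises with GENE_A
    "GENE_C": [4.0, 1.0, 3.0, 2.0],   # unrelated pattern
}
toy_pairs = coexpressed_pairs(toy_expr)
print(toy_pairs)
```

Pairs passing the filter would then be intersected with protein-protein interaction evidence, which is outside the scope of this sketch.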
Model-based clustering for RNA-seq data.

PubMed

Si, Yaqing; Liu, Peng; Li, Pinghua; Brutnell, Thomas P

2014-01-15

RNA-seq technology has been widely adopted as an attractive alternative to microarray-based methods to study global gene expression. However, robust statistical tools to analyze these complex datasets are still lacking. By grouping genes with similar expression profiles across treatments, cluster analysis provides insight into gene functions and networks, and hence is an important technique for RNA-seq data analysis. In this manuscript, we derive clustering algorithms based on appropriate probability models for RNA-seq data. An expectation-maximization algorithm and another two stochastic versions of expectation-maximization algorithms are described. In addition, a strategy for initialization based on likelihood is proposed to improve the clustering algorithms. Moreover, we present a model-based hybrid-hierarchical clustering method to generate a tree structure that allows visualization of relationships among clusters as well as flexibility of choosing the number of clusters. Results from both simulation studies and analysis of a maize RNA-seq dataset show that our proposed methods provide better clustering results than alternative methods such as the K-means algorithm and hierarchical clustering methods that are not based on probability models. An R package, MBCluster.Seq, has been developed to implement our proposed algorithms. This R package provides fast computation and is publicly available at http://www.r-project.org
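To make the idea of model-based clustering of count data concrete, here is a deliberately simplified sketch: an expectation-maximization fit of a mixture of independent Poisson distributions, one mean profile per cluster. It is not the MBCluster.Seq algorithm, which uses its own probability models, normalization, and likelihood-based initialization; the function names, the caller-supplied starting means, and the toy counts are assumptions of this example.

```python
import math


def poisson_logpmf(x, lam):
    """Log of the Poisson probability mass function."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def em_poisson_mixture(counts, init_means, n_iter=100):
    """EM for a mixture of independent Poissons.

    counts:     genes x treatments matrix of read counts
    init_means: starting mean profile for each cluster (assumed given)
    """
    k = len(init_means)
    means = [list(m) for m in init_means]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each gene,
        # computed in log space to avoid underflow.
        resp = []
        for row in counts:
            logp = [
                math.log(weights[c])
                + sum(poisson_logpmf(x, lam) for x, lam in zip(row, means[c]))
                for c in range(k)
            ]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: responsibility-weighted mixing proportions and means,
        # floored so that later logarithms stay finite.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            weights[c] = max(mass / len(counts), 1e-12)
            means[c] = [
                max(sum(r[c] * row[j] for r, row in zip(resp, counts))
                    / max(mass, 1e-12), 1e-6)
                for j in range(len(counts[0]))
            ]
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return labels, means, weights


# Hypothetical counts: three genes high under treatment 1,
# three genes high under treatment 2.
toy_counts = [[50, 5], [48, 7], [52, 4], [6, 45], [4, 55], [5, 50]]
toy_init = [[50.0, 5.0], [5.0, 50.0]]
em_labels, em_means, em_weights = em_poisson_mixture(toy_counts, toy_init)
print(em_labels)
```

Real RNA-seq counts are usually overdispersed relative to the Poisson model, so a negative binomial likelihood and library-size normalization would be natural refinements.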
Application of dynamic topic models to toxicogenomics data.

PubMed

Lee, Mikyung; Liu, Zhichao; Huang, Ruili; Tong, Weida

2016-10-06

All biological processes are inherently dynamic. Biological systems evolve transiently or sustainably according to sequential time points after perturbation by environmental insults, drugs and chemicals. Investigating the temporal behavior of molecular events has been an important subject for understanding the underlying mechanisms governing the biological system in response to perturbations such as drug treatment. The intrinsic complexity of time series data requires appropriate computational algorithms for data interpretation. In this study, we propose, for the first time, the application of dynamic topic models (DTM) for analyzing time-series gene expression data. A large time-series toxicogenomics dataset was studied. It contains over 3144 microarrays of gene expression data corresponding to rat livers treated with 131 compounds (most are drugs) at two doses (control and high dose) in a repeated schedule containing four separate time points (4-, 8-, 15- and 29-day). We analyzed, with DTM, the topics (consisting of a set of genes) and their biological interpretations over these four time points. We identified hidden patterns embedded in these time-series gene expression profiles. From the topic distribution for compound-time condition, a number of drugs were successfully clustered by their shared mode-of-action such as PPARα agonists and COX inhibitors. The biological meaning underlying each topic was interpreted using diverse sources of information such as functional analysis of the pathways and therapeutic uses of the drugs. Additionally, we found that sample clusters produced by DTM are much more coherent in terms of functional categories when compared to traditional clustering algorithms. We demonstrated that DTM, a text mining technique, can be a powerful computational approach for clustering time-series gene expression profiles with the probabilistic representation of their dynamic features along sequential time frames. The method offers an alternative way of uncovering hidden patterns embedded in time series gene expression profiles to gain enhanced understanding of the dynamic behavior of gene regulation in the biological system.
Identification of potential crucial genes and construction of microRNA-mRNA negative regulatory networks in osteosarcoma.

PubMed

Pan, Yue; Lu, Lingyun; Chen, Junquan; Zhong, Yong; Dai, Zhehao

2018-01-01

This study aimed to identify potential crucial genes and to construct microRNA-mRNA negative regulatory networks in osteosarcoma by comprehensive bioinformatics analysis. Data of gene expression profiles (GSE28424) and miRNA expression profiles (GSE28423) were downloaded from the GEO database. The differentially expressed genes (DEGs) and miRNAs (DEMIs) were obtained by R Bioconductor packages. Functional and enrichment analyses of selected genes were performed using the DAVID database. A protein-protein interaction (PPI) network was constructed by STRING and visualized in Cytoscape. The relationships among the DEGs and modules in the PPI network were analyzed by the plug-ins NetworkAnalyzer and MCODE separately. Using TargetScan and comparing target genes with DEGs, the miRNA-mRNA regulation network was established. A total of 346 DEGs and 90 DEMIs were identified. These DEGs were enriched in biological processes and KEGG pathways of inflammatory immune response. 25 genes in the PPI network were selected as hub genes. The top 10 hub genes were TYROBP, HLA-DRA, VWF, PPBP, SERPING1, HLA-DPA1, SERPINA1, KIF20A, FERMT3 and HLA-E. The PPI network of DEGs followed a pattern of power law network and met the characteristics of small-world network. MCODE analysis identified 4 clusters, and the most significant cluster consisted of 11 nodes and 55 edges. SEPP1, CKS2, TCAP and BPI were identified as the seed genes in their own clusters, respectively. The miRNA-mRNA regulation network, which was composed of 89 pairs, was established. MiR-210 had the highest connectivity with 12 target genes. Among the predicted targets of miR-96, HLA-DPA1 and TYROBP were hub genes. Our study indicated possible differentially expressed genes and miRNAs, and microRNA-mRNA negative regulatory networks in osteosarcoma by bioinformatics analysis, which may provide novel insights for unraveling the pathogenesis of osteosarcoma.
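A small check helps interpret the "11 nodes and 55 edges" figure reported for the top MCODE cluster. Assuming the cluster is an undirected simple graph (no self-loops or duplicate edges, which the abstract does not state), the maximum number of edges among 11 nodes is 11 x 10 / 2 = 55, so the reported cluster would be fully connected. The helper name below is hypothetical.

```python
import math


def cluster_density(n_nodes, n_edges):
    """Return (maximum possible edges, density) for an undirected simple graph."""
    max_edges = math.comb(n_nodes, 2)   # n * (n - 1) / 2
    return max_edges, n_edges / max_edges


max_edges, density = cluster_density(11, 55)
print(max_edges, density)
```

A density of 1 under this assumption means every gene in the cluster interacts with every other gene in it.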

A strategy of gene overexpression based on tandem repetitive promoters in Escherichia coli.

PubMed

Li, Mingji; Wang, Junshu; Geng, Yanping; Li, Yikui; Wang, Qian; Liang, Quanfeng; Qi, Qingsheng

2012-02-06

For metabolic engineering, many rate-limiting steps may exist in the pathways that accumulate the target metabolites. Increasing the copy number of the desired genes in these pathways is a general method to solve the problem, for example, by employing a multi-copy plasmid-based expression system. However, this method may bring genetic instability, structural instability and metabolic burden to the host, while integrating the desired gene into the chromosome may cause inadequate transcription or expression. In this study, we developed a strategy for obtaining gene overexpression by engineering promoter clusters consisting of multiple core-tac-promoters (MCPtacs) in tandem. Through a uniquely designed in vitro assembling process, a series of promoter clusters were constructed. The transcription strength of these promoter clusters showed a stepwise enhancement with the increase of tandem repeats number until it reached the critical value of five. Application of the MCPtacs promoter clusters in polyhydroxybutyrate (PHB) production proved that it was efficient. Integration of the phaCAB genes with the 5CPtacs promoter cluster resulted in an engineered E. coli that can accumulate 23.7% PHB of the cell dry weight in batch cultivation. The transcription strength of the MCPtacs promoter cluster can be greatly improved by increasing the tandem repeats number of the core-tac-promoter. By integrating the desired gene together with the MCPtacs promoter cluster into the chromosome of E. coli, we can achieve high and stable overexpression with only a small size. This strategy has an application potential in many fields and can be extended to other bacteria.
Preclinical Evaluation of An Anti-HCV miRNA Cluster for Treatment of HCV Infection

PubMed Central

Yang, Xiao; Marcucci, Katherine; Anguela, Xavier; Couto, Linda B.

2013-01-01

We developed a strategy to treat hepatitis C virus (HCV) infection by replacing five endogenous microRNA (miRNA) sequences of a natural miRNA cluster (miR-17–92) with sequences that are complementary to the HCV genome. This miRNA cluster (HCV-miR-Cluster 5) is delivered to cells using adeno-associated virus (AAV) vectors and the miRNAs are expressed in the liver, the site of HCV replication and assembly. AAV-HCV-miR-Cluster 5 inhibited bona fide HCV replication in vitro by up to 95% within 2 days, and the spread of HCV to uninfected cells was prevented by continuous expression of the anti-HCV miRNAs. Furthermore, the number of cells harboring HCV RNA replicons decreased dramatically by sustained expression of the anti-HCV miRNAs, suggesting that the vector is capable of curing cells of HCV. Delivery of AAV-HCV-miR-Cluster 5 to mice resulted in efficient transfer of the miRNA gene cluster and expression of all five miRNAs in liver tissue, at levels up to 1,300 copies/cell. These levels achieved up to 98% gene silencing of cognate HCV sequences, and no liver toxicity was observed, supporting the safety of this approach. Therefore, AAV-HCV-miR-Cluster 5 represents a different paradigm for the treatment of HCV infection. PMID:23295950
Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

PubMed Central

Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

2007-01-01

Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Mining biological information from 3D short time-series gene expression data: the OPTricluster algorithm.

PubMed

Tchagang, Alain B; Phan, Sieu; Famili, Fazel; Shearer, Heather; Fobert, Pierre; Huang, Yi; Zou, Jitao; Huang, Daiqing; Cutler, Adrian; Liu, Ziying; Pan, Youlian

2012-04-04

Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space. We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and to Brassica napus whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples. Our analysis showed that OPTricluster generally outperforms other well known clustering algorithms such as the TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data.
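The order-preserving (OP) idea on the time dimension can be illustrated with a small sketch. This is not the OPTricluster algorithm itself: it handles a single sample, ignores the combinatorial step over samples, and resolves ties by time-point index, all of which are simplifying assumptions. The gene names and values are hypothetical. Genes whose expression levels rank the time points in the same order are placed in the same group.

```python
from collections import defaultdict


def order_pattern(profile):
    """Return the time-point indices sorted by increasing expression level."""
    return tuple(sorted(range(len(profile)), key=lambda t: profile[t]))


def group_by_order(expression):
    """Group genes that share the same ordering of time points."""
    groups = defaultdict(list)
    for gene, profile in expression.items():
        groups[order_pattern(profile)].append(gene)
    return dict(groups)


# Hypothetical short time series (three time points) for four genes.
expression = {
    "g1": [0.1, 0.5, 0.9],   # rising
    "g2": [2.0, 3.0, 7.5],   # rising, on a different scale
    "g3": [0.9, 0.4, 0.2],   # falling
    "g4": [5.0, 1.0, 0.5],   # falling
}

groups = group_by_order(expression)
```

Because only the ordering matters, `g1` and `g2` fall into the same group even though their magnitudes differ, which is the property that makes the concept attractive for short time series.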
Gene amplification of the transcription factor DP1 and CTNND1 in human lung cancer.

PubMed

Castillo, Sandra D; Angulo, Barbara; Suarez-Gauthier, Ana; Melchor, Lorenzo; Medina, Pedro P; Sanchez-Verde, Lydia; Torres-Lanzas, Juan; Pita, Guillermo; Benitez, Javier; Sanchez-Cespedes, Montse

2010-09-01

The search for novel oncogenes is important because they could be the target of future specific anticancer therapies. In the present paper we report the identification of novel amplified genes in lung cancer by means of global gene expression analysis. To screen for amplicons, we aligned the gene expression data according to the position of transcripts in the human genome and searched for clusters of over-expressed genes. We found several clusters with gene over-expression, suggesting an underlying genomic amplification. FISH and microarray analysis for DNA copy number in two clusters, at chromosomes 11q12 and 13q34, confirmed the presence of amplifications spanning about 0.4 and 1 Mb for 11q12 and 13q34, respectively. Amplification at these regions each occurred at a frequency of 3%. Moreover, quantitative RT-PCR of each individual transcript within the amplicons allowed us to verify the increase in gene expression of several genes. The p120ctn and DP1 proteins, encoded by two candidate oncogenes, CTNND1 and TFDP1, at the 11q12 and 13q amplicons, respectively, showed very strong immunostaining in lung tumours with gene amplification. We then focused on the 13q34 amplicon and on the TFDP1 candidate oncogene. To further determine the oncogenic properties of DP1, we searched for lung cancer cell lines carrying TFDP1 amplification. Depletion of TFDP1 expression by small interfering RNA in a lung cancer cell line (HCC33) with TFDP1 amplification and protein over-expression reduced cell viability by 50%. In conclusion, we report the identification of two novel amplicons, at 13q34 and 11q12, each occurring at a frequency of 3% of non-small cell lung cancers. TFDP1, which encodes the E2F-associated transcription factor DP1, is a candidate oncogene at 13q34. The data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) and are accessible through GEO Series Accession No. GSE21168.
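The screening idea of ordering expression data by genomic position and looking for clusters of over-expressed genes can be sketched as a simple scan for runs of adjacent over-expressed genes. The abstract does not give the authors' actual criteria, so the threshold, the minimum run length, the function name and the toy records below are all assumptions made for illustration.

```python
def overexpressed_runs(genes, threshold=1.0, min_genes=3):
    """Find runs of consecutive over-expressed genes on the same chromosome.

    genes: list of (name, chromosome, start, log_ratio) tuples.
    A gene counts as over-expressed when log_ratio >= threshold (assumed cutoff).
    Returns a list of runs, each a list of gene names in genomic order.
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    runs, current = [], []
    for gene in ordered:
        _name, chrom, _start, ratio = gene
        same_chrom = bool(current) and current[-1][1] == chrom
        if ratio >= threshold and (not current or same_chrom):
            current.append(gene)  # extend the current run
        else:
            if len(current) >= min_genes:
                runs.append([g[0] for g in current])
            # Start a new run only if this gene is itself over-expressed.
            current = [gene] if ratio >= threshold else []
    if len(current) >= min_genes:
        runs.append([g[0] for g in current])
    return runs


# Hypothetical toy data: gene name, chromosome, start position, log ratio.
genes = [
    ("a", "chr1", 100, 0.1),
    ("b", "chr1", 200, 1.5),
    ("c", "chr1", 300, 2.0),
    ("d", "chr1", 400, 1.2),
    ("e", "chr1", 500, 0.0),
    ("f", "chr2", 100, 1.4),
    ("g", "chr2", 200, 1.1),
]

runs = overexpressed_runs(genes)
```

Candidate regions found this way would still need confirmation at the DNA level, which the study did with FISH and copy-number microarrays.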
The sirodesmin biosynthetic gene cluster of the plant pathogenic fungus Leptosphaeria maculans.

PubMed

Gardiner, Donald M; Cozijnsen, Anton J; Wilson, Leanne M; Pedras, M Soledade C; Howlett, Barbara J

2004-09-01

Sirodesmin PL is a phytotoxin produced by the fungus Leptosphaeria maculans, which causes blackleg disease of canola (Brassica napus). This phytotoxin belongs to the epipolythiodioxopiperazine (ETP) class of toxins produced by fungi including mammalian and plant pathogens. We report the cloning of a cluster of genes with predicted roles in the biosynthesis of sirodesmin PL and show via gene disruption that one of these genes (encoding a two-module non-ribosomal peptide synthetase) is essential for sirodesmin PL biosynthesis. Of the nine genes in the cluster tested, all are co-regulated with the production of sirodesmin PL in culture. A similar cluster is present in the genome of the opportunistic human pathogen Aspergillus fumigatus and is most likely responsible for the production of gliotoxin, which is also an ETP. Homologues of the genes in the cluster were also identified in expressed sequence tags of the ETP producing fungus Chaetomium globosum. Two other fungi with publicly available genome sequences, Magnaporthe grisea and Fusarium graminearum, had similar gene clusters. A comparative analysis of all four clusters is presented. This is the first report of the genes responsible for the biosynthesis of an ETP. Copyright 2004 Blackwell Publishing Ltd
Structural Diversification of Lyngbyatoxin A by Host-Dependent Heterologous Expression of the tleABC Biosynthetic Gene Cluster.

PubMed

Zhang, Lihan; Hoshino, Shotaro; Awakawa, Takayoshi; Wakimoto, Toshiyuki; Abe, Ikuro

2016-08-03

Natural products have enormous structural diversity, yet little is known about how such diversity is achieved in nature. Here we report the structural diversification of a cyanotoxin, lyngbyatoxin A, and its biosynthetic intermediates by heterologous expression of the Streptomyces-derived tleABC biosynthetic gene cluster in three different Streptomyces hosts: S. lividans, S. albus, and S. avermitilis. Notably, the isolated lyngbyatoxin derivatives, including four new natural products, were biosynthesized by crosstalk between the heterologous tleABC gene cluster and the endogenous host enzymes. The simple strategy described here has expanded the structural diversity of lyngbyatoxin A and its biosynthetic intermediates, and provides opportunities for investigation of the currently underestimated hidden biosynthetic crosstalk. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Environmental history impacts gene expression during diapause development in the alfalfa leafcutting bee, Megachile rotundata.

PubMed

Yocum, George D; Childers, Anna K; Rinehart, Joseph P; Rajamohan, Arun; Pitts-Singer, Theresa L; Greenlee, Kendra J; Bowsher, Julia H

2018-05-10

Our understanding of the mechanisms controlling insect diapause has increased dramatically with the introduction of global gene expression techniques, such as RNA-seq. However, little attention has been given to how ecologically relevant field conditions may affect gene expression during diapause development because previous studies have focused on laboratory reared and maintained insects. To determine whether gene expression differs between laboratory and field conditions, prepupae of the alfalfa leafcutting bee, Megachile rotundata, entering diapause early or late in the growing season were collected. These two groups were further subdivided in early autumn into laboratory and field maintained groups, resulting in four experimental treatments of diapausing prepupae: early and late field, and early and late laboratory. RNA-seq and differential expression analyses were performed on bees from the four treatment groups in November, January, March and May. The number of treatment-specific differentially expressed genes (97 to 1249) outnumbered the number of differentially regulated genes common to all four treatments (14 to 229), indicating that exposure to laboratory or field conditions had a major impact on gene expression during diapause development. Principal component analysis and hierarchical cluster analysis yielded similar grouping of treatments, confirming that the treatments form distinct clusters. Our results support the conclusion that gene expression during the course of diapause development is not a simple ordered sequence, but rather a highly plastic response determined primarily by the environmental history of the individual insect. © 2018. Published by The Company of Biologists Ltd.
Clustering by soft-constraint affinity propagation: applications to gene-expression data.

PubMed

Leone, Michele; Sumedha; Weigt, Martin

2007-10-15

Similarity-measure-based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP) based on message-passing techniques was proposed by Frey and Dueck (2007a). In AP, each cluster is identified by a common exemplar to which all other data points of the same cluster refer, and exemplars have to refer to themselves. Despite its proven power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters, and leads to suboptimal performance, e.g. in analyzing gene expression data. This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows one to interpolate between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) becomes more informative and accurate, and leads to more stable clustering. Even though a new a priori free parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Furthermore, it allows one to extract sparse gene expression signatures for each cluster.
A Dopaminergic Gene Cluster in the Prefrontal Cortex Predicts Performance Indicative of General Intelligence in Genetically Heterogeneous Mice

PubMed Central

Kolata, Stefan; Light, Kenneth; Wass, Christopher D.; Colas-Zelin, Danielle; Roy, Debasri; Matzel, Louis D.

2010-01-01

Background Genetically heterogeneous mice express a trait that is qualitatively and psychometrically analogous to general intelligence in humans, and as in humans, this trait co-varies with the processing efficacy of working memory (including its dependence on selective attention). Dopamine signaling in the prefrontal cortex (PFC) has been established to play a critical role in animals' performance in both working memory and selective attention tasks. Owing to this role of the PFC in the regulation of working memory, here we compared PFC gene expression profiles of 60 genetically diverse CD-1 mice that exhibited a wide range of general learning abilities (i.e., aggregate performance across five diverse learning tasks). Methodology/Principal Findings Animals' general cognitive abilities were first determined based on their aggregate performance across a battery of five diverse learning tasks. With a procedure designed to minimize false positive identifications, analysis of gene expression microarrays (comprising ≈25,000 genes) identified a small number (<20) of genes that were differentially expressed across animals that exhibited fast and slow aggregate learning abilities. Of these genes, one functional cluster was identified, and this cluster (Darpp-32, Drd1a, and Rgs9) is an established modulator of dopamine signaling. Subsequent quantitative PCR found that expression of these dopaminergic genes plus one vascular gene (Nudt6) was significantly correlated with individual animals' general cognitive performance. Conclusions/Significance These results indicate that D1-mediated dopamine signaling in the PFC, possibly through its modulation of working memory, is predictive of general cognitive abilities. Furthermore, these results provide the first direct evidence of specific molecular pathways that might potentially regulate general intelligence. PMID:21103339
Dynamic gene expression analysis in a H1N1 influenza virus mouse pneumonia model.

PubMed

Bao, Yanyan; Gao, Yingjie; Shi, Yujing; Cui, Xiaolan

2017-06-01

H1N1, a major pathogenic subtype of influenza A virus, causes a respiratory infection in humans and livestock that can range from a mild infection to more severe pneumonia associated with acute respiratory distress syndrome. Understanding the dynamic changes in the genome and the related functional changes induced by H1N1 influenza virus infection is essential to elucidating the pathogenesis of this virus and thereby determining strategies to prevent future outbreaks. In this study, we filtered the significantly expressed genes in mouse pneumonia using mRNA microarray analysis. Using STC analysis, seven significant gene clusters were revealed, and using STC-GO analysis, we explored the significant functions of these seven gene clusters. The results revealed GOs related to H1N1 virus-induced inflammatory and immune functions, including innate immune response, inflammatory response, specific immune response, and cellular response to interferon-beta. Furthermore, the dynamic regulation relationships of the key genes in mouse pneumonia were revealed by dynamic gene network analysis, and the most important genes were filtered, including Dhx58, Cxcl10, Cxcl11, Zbp1, Ifit1, Ifih1, Trim25, Mx2, Oas2, Cd274, Irgm1, and Irf7. These results suggested that during mouse pneumonia, changes in the expression of gene clusters and the complex interactions among genes lead to significant changes in function. Dynamic gene expression analysis revealed key genes that performed important functions. These results are a prelude to advancements in mouse H1N1 influenza virus infection biology, as well as the use of mice as a model organism for human H1N1 influenza virus infection studies.
Identification of the transcriptional regulators by expression profiling infected with hepatitis B virus.

PubMed

Chai, Xiaoqiang; Han, Yanan; Yang, Jian; Zhao, Xianxian; Liu, Yewang; Hou, Xugang; Tang, Yiheng; Zhao, Shirong; Li, Xiao

2016-02-01

The molecular pathogenesis of hepatitis B virus infection in humans is extremely complex and heterogeneous. To date, the molecular information is not clearly defined despite intensive research efforts. Thus, studies aimed at transcription and regulation during virus infection, or studies that combine approaches already known to be beneficial, are needed. To identify the transcriptional regulators related to hepatitis B virus infection at the gene level, gene expression profiles from normal individuals and hepatitis B patients were analyzed in our study. First, the differentially expressed genes were selected. Several of these genes were validated in an independent set by qRT-PCR. Then a differential co-expression analysis was conducted to identify differentially co-expressed links and differentially co-expressed genes. Next, the regulatory impact factors were analyzed by mapping the links to regulatory data. To give further insight into these regulators, the co-expression gene modules were identified using a threshold-based hierarchical clustering method. In addition, the regulatory network was constructed using computer software. A total of 137,284 differentially co-expressed links and 780 differentially co-expressed genes were identified. These co-expressed genes were significantly enriched in the inflammatory response. The results of the regulatory impact factor analysis revealed several crucial regulators related to hepatocellular carcinoma and other high-rank regulators. Meanwhile, more than one hundred co-expression gene modules were identified using the clustering method. In our study, some important transcriptional regulators were identified using a computational method, which may enhance the understanding of disease mechanisms and lead to an improved treatment of hepatitis B. However, further experimental studies are required to confirm these findings. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
DNA methylation and differentiation: HOX genes in muscle cells

PubMed Central

2013-01-01

Background Tight regulation of homeobox genes is essential for vertebrate development. In a study of genome-wide differential methylation, we recently found that homeobox genes, including those in the HOX gene clusters, were highly overrepresented among the genes with hypermethylation in the skeletal muscle lineage. Methylation was analyzed by reduced representation bisulfite sequencing (RRBS) of postnatal myoblasts, myotubes and adult skeletal muscle tissue and 30 types of non-muscle-cell cultures or tissues. Results In this study, we found that myogenic hypermethylation was present in specific subregions of all four HOX gene clusters and was associated with various chromatin epigenetic features. Although the 3′ half of the HOXD cluster was silenced and enriched in polycomb repression-associated H3 lysine 27 trimethylation in most examined cell types, including myoblasts and myotubes, myogenic samples were unusual in also displaying much DNA methylation in this region. In contrast, both HOXA and HOXC clusters displayed myogenic hypermethylation bordering a central region containing many genes preferentially expressed in myogenic progenitor cells and consisting largely of chromatin with modifications typical of promoters and enhancers in these cells. A particularly interesting example of myogenic hypermethylation was HOTAIR, a HOXC noncoding RNA gene, which can silence HOXD genes in trans via recruitment of polycomb proteins. In myogenic progenitor cells, the preferential expression of HOTAIR was associated with hypermethylation immediately downstream of the gene. Other HOX gene regions also displayed myogenic DNA hypermethylation despite being moderately expressed in myogenic cells. 
Analysis of representative myogenic hypermethylated sites for 5-hydroxymethylcytosine revealed little or none of this base, except for an intragenic site in HOXB5 which was specifically enriched in this base in skeletal muscle tissue, whereas myoblasts had predominantly 5-methylcytosine at the same CpG site. Conclusions Our results suggest that myogenic hypermethylation of HOX genes helps fine-tune HOX sense and antisense gene expression through effects on 5′ promoters, intragenic and intergenic enhancers and internal promoters. Myogenic hypermethylation might also affect the relative abundance of different RNA isoforms, facilitate transcription termination, help stop the spread of activation-associated chromatin domains and stabilize repressive chromatin structures. PMID:23916067
Fe-S Proteins that Regulate Gene Expression

PubMed Central

Mettert, Erin L.; Kiley, Patricia J.

2014-01-01

Iron-sulfur (Fe-S) cluster containing proteins that regulate gene expression are present in most organisms. The innate chemistry of their Fe-S cofactors makes these regulatory proteins ideal for sensing environmental signals, such as gases (e.g. O2 and NO), levels of Fe and Fe-S clusters, reactive oxygen species, and redox cycling compounds, to subsequently mediate an adaptive response. Here we review the recent findings that have provided invaluable insight into the mechanism and function of these highly significant Fe-S regulatory proteins. PMID:25450978
[Construction of screening system for mutation of negative regulatory genes in Streptomyces].

PubMed

Zhu, Yu; Feng, Chi; Tan, Huarong; Tian, Yuqing

2013-10-04

We aimed to create a novel reporter system for screening mutations of negative regulatory genes, especially those repressing the expression of cryptic antibiotic clusters. We used a marker-free gene disruption strategy, which combines the "REDIRECT (Rapid Efficient Directed Recombination Time Saving)" technology with in vivo site-specific recombination by Streptomyces phage phiBT1 integrase, to construct a scbR2/inoA double mutant strain of S. coelicolor M145. This strain was used as the host of the reporter system. For the construction of the reporter plasmid, the ScbR2-repressed promoter of cpkO from the CPK (cryptic polyketide) cluster was used to drive the expression of a promoterless conserved gene, inoA, of S. coelicolor. Then the reporter plasmid was introduced into the host strain described above to test the availability of inoA as a reporter gene in this system. The scbR2/inoA double mutant strain gave rise to a bald phenotype on MM medium in the absence of inositol, and produced a yellow pigmented secondary metabolite because the disruption of scbR2 released the repression of cpkO, a pathway-specific activator gene situated in the CPK cluster. After introducing the reporter plasmid into this test strain, the resulting strain recovered the wild-type phenotype, indicating that the promoter of cpkO can drive the expression of inoA in the scbR2 mutant and consequently restore the biosynthesis of inositol. Our results indicated that inoA can be used as a novel reporter gene for Streptomyces, especially for detecting the activation of a "silent" promoter. This reporter system might be useful for screening mutations of the negative regulatory genes for cryptic secondary metabolic gene clusters.
Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

PubMed

Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

2008-06-18

Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. 
This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodology.
Fast gene ontology based clustering for microarray experiments.

PubMed

Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

2008-11-21

Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Maximizing capture of gene co-expression relationships through pre-clustering of input expression samples: an Arabidopsis case study.

PubMed

Feltus, F Alex; Ficklin, Stephen P; Gibson, Scott M; Smith, Melissa C

2013-06-05

In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied, also limiting the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increased sensitivity, thus maximizing the total co-expression relationships in the final co-expression network compendium. A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships in the global network. 
Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired.
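The pre-clustering idea described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: a tiny deterministic k-means stands in for their partitioning of 7,105 arrays, a fixed correlation cutoff stands in for RMT thresholding, and the gene names and expression values are invented toy data. It only shows why pooling samples from different conditions can hide relationships that per-group networks recover.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def kmeans(samples, k, iters=20):
    """Minimal k-means; the first k samples seed the centroids (assumption)."""
    centroids = [list(s) for s in samples[:k]]
    labels = [0] * len(samples)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(s, centroids[c])))
            for s in samples
        ]
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def network(samples, genes, threshold):
    """Edges between gene pairs whose |r| across the samples meets the cutoff."""
    edges = set()
    for i, j in combinations(range(len(genes)), 2):
        r = pearson([s[i] for s in samples], [s[j] for s in samples])
        if abs(r) >= threshold:
            edges.add((genes[i], genes[j]))
    return edges

def layered_networks(samples, genes, k, threshold):
    """Cluster samples first, build one network per cluster, return the union."""
    labels = kmeans(samples, k)
    union = set()
    for c in range(k):
        group = [s for s, lab in zip(samples, labels) if lab == c]
        if len(group) >= 3:  # too few samples give meaningless correlations
            union |= network(group, genes, threshold)
    return union

# Toy data: each row is one sample, columns are genes g1, g2, g3.
# In condition A (g3 = 0) g1 and g2 rise together; in condition B (g3 = 10)
# they move in opposite directions, so the pooled correlation cancels out.
genes = ["g1", "g2", "g3"]
samples = [
    (1, 1, 0), (1, 4, 10), (2, 2, 0), (2, 3, 10),
    (3, 3, 0), (3, 2, 10), (4, 4, 0), (4, 1, 10),
]

global_edges = network(samples, genes, 0.9)
layered_edges = layered_networks(samples, genes, 2, 0.9)
```

In this toy case the pooled network is empty, while the union of the two per-cluster networks contains the g1/g2 relationship, which mirrors the sensitivity gain the abstract reports at a much larger scale.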
Maximizing capture of gene co-expression relationships through pre-clustering of input expression samples: an Arabidopsis case study

PubMed Central

2013-01-01

Background In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied, which also limits the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increased sensitivity, thus maximizing the total co-expression relationships in the final co-expression network compendium. Results A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships in the global network.
Conclusions Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired. PMID:23738693
A Caenorhabditis elegans protein with a PRDM9-like SET domain localizes to chromatin-associated foci and promotes spermatocyte gene expression, sperm production and fertility.

PubMed

Engert, Christoph G; Droste, Rita; van Oudenaarden, Alexander; Horvitz, H Robert

2018-04-01

To better understand the tissue-specific regulation of chromatin state in cell-fate determination and animal development, we defined the tissue-specific expression of all 36 C. elegans presumptive lysine methyltransferase (KMT) genes using single-molecule fluorescence in situ hybridization (smFISH). Most KMTs were expressed in only one or two tissues. The germline was the tissue with the broadest KMT expression. We found that the germline-expressed C. elegans protein SET-17, which has a SET domain similar to that of the PRDM9 and PRDM7 SET-domain proteins, promotes fertility by regulating gene expression in primary spermatocytes. SET-17 drives the transcription of spermatocyte-specific genes from four genomic clusters to promote spermatid development. SET-17 is concentrated in stable chromatin-associated nuclear foci at actively transcribed msp (major sperm protein) gene clusters, which we term msp locus bodies. Our results reveal the function of a PRDM9/7-family SET-domain protein in spermatocyte transcription. We propose that the spatial intranuclear organization of chromatin factors might be a conserved mechanism in tissue-specific control of transcription.

A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija

Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species.
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.
A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation.

PubMed

Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija; Auguin, Daniel; Lainé, Éric; Davin, Laurence B; Cort, John R; Lewis, Norman G; Hano, Christophe

2018-05-01

Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. 
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.
Accurate prediction of secondary metabolite gene clusters in filamentous fungi.

PubMed

Andersen, Mikael R; Nielsen, Jakob B; Klitgaard, Andreas; Petersen, Lene M; Zachariasen, Mia; Hansen, Tilde J; Blicher, Lene H; Gotfredsen, Charlotte H; Larsen, Thomas O; Nielsen, Kristian F; Mortensen, Uffe H

2013-01-02

Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify supporting enzymes for key synthases one cluster at a time. In this study, we design and apply a DNA expression array for Aspergillus nidulans in combination with legacy data to form a comprehensive gene expression compendium. We apply a guilt-by-association-based analysis to predict the extent of the biosynthetic clusters for the 58 synthases active in our set of experimental conditions. A comparison with legacy data shows the method to be accurate in 13 of 16 known clusters and nearly accurate for the remaining 3 clusters. Furthermore, we apply a data clustering approach, which identifies cross-chemistry between physically separate gene clusters (superclusters), and validate this both with legacy data and experimentally by prediction and verification of a supercluster consisting of the synthase AN1242 and the prenyltransferase AN11080, as well as identification of the product compound nidulanin A. We have used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom.
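The abstract does not spell out how the guilt-by-association analysis delimits a cluster, so the following is only one plausible reading, sketched under stated assumptions: starting from a synthase gene, the predicted cluster is extended to chromosomal neighbours for as long as their expression profiles correlate with the synthase above a cutoff. The gene names, the profiles, the 0.8 cutoff and the function `cluster_extent` are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def cluster_extent(order, expr, anchor, threshold=0.8):
    """Contiguous run of genes around `anchor` that are co-expressed with it.

    order  -- gene names in chromosomal order
    expr   -- dict mapping gene name to its expression profile
    anchor -- the synthase gene the cluster is grown from
    """
    i = order.index(anchor)
    lo = hi = i
    # Walk upstream while the next neighbour is still co-expressed.
    while lo > 0 and pearson(expr[order[lo - 1]], expr[anchor]) >= threshold:
        lo -= 1
    # Walk downstream in the same way.
    while (hi < len(order) - 1
           and pearson(expr[order[hi + 1]], expr[anchor]) >= threshold):
        hi += 1
    return order[lo:hi + 1]

# Hypothetical locus: five neighbouring genes measured in four conditions.
order = ["geneA", "geneB", "synthase", "geneC", "geneD"]
expr = {
    "geneA": [5, 1, 4, 2],
    "geneB": [2, 6, 3, 9],
    "synthase": [1, 5, 2, 8],
    "geneC": [0, 4, 1, 7],
    "geneD": [3, 3, 4, 3],
}
predicted = cluster_extent(order, expr, "synthase")
```

Here `predicted` covers geneB, the synthase and geneC, while the flanking genes with unrelated profiles are left out. A compendium spanning many conditions, as in the study, would make such correlations far more informative than this four-point toy.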
Establishment of the Inducible Tet-On System for the Activation of the Silent Trichosetin Gene Cluster in Fusarium fujikuroi

PubMed Central

Janevska, Slavica; Arndt, Birgit; Baumann, Leonie; Apken, Lisa Helene; Mauriz Marques, Lucas Maciel; Humpf, Hans-Ulrich; Tudzynski, Bettina

2017-01-01

The PKS-NRPS-derived tetramic acid equisetin and its N-desmethyl derivative trichosetin exhibit remarkable biological activities against a variety of organisms, including plants and bacteria, e.g., Staphylococcus aureus. The equisetin biosynthetic gene cluster was first described in Fusarium heterosporum, a species distantly related to the notorious rice pathogen Fusarium fujikuroi. Here we present the activation and characterization of a homologous, but silent, gene cluster in F. fujikuroi. Bioinformatic analysis revealed that this cluster does not contain the equisetin N-methyltransferase gene eqxD and consequently, trichosetin was isolated as final product. The adaption of the inducible, tetracycline-dependent Tet-on promoter system from Aspergillus niger achieved a controlled overproduction of this toxic metabolite and a functional characterization of each cluster gene in F. fujikuroi. Overexpression of one of the two cluster-specific transcription factor (TF) genes, TF22, led to an activation of the three biosynthetic cluster genes, including the PKS-NRPS key gene. In contrast, overexpression of TF23, encoding a second Zn(II)2Cys6 TF, did not activate adjacent cluster genes. Instead, TF23 was induced by the final product trichosetin and was required for expression of the transporter-encoding gene MFS-T. TF23 and MFS-T likely act in consort and contribute to detoxification of trichosetin and therefore, self-protection of the producing fungus. PMID:28379186
Establishment of the Inducible Tet-On System for the Activation of the Silent Trichosetin Gene Cluster in Fusarium fujikuroi.

PubMed

Janevska, Slavica; Arndt, Birgit; Baumann, Leonie; Apken, Lisa Helene; Mauriz Marques, Lucas Maciel; Humpf, Hans-Ulrich; Tudzynski, Bettina

2017-04-05

The PKS-NRPS-derived tetramic acid equisetin and its N-desmethyl derivative trichosetin exhibit remarkable biological activities against a variety of organisms, including plants and bacteria, e.g., Staphylococcus aureus. The equisetin biosynthetic gene cluster was first described in Fusarium heterosporum, a species distantly related to the notorious rice pathogen Fusarium fujikuroi. Here we present the activation and characterization of a homologous, but silent, gene cluster in F. fujikuroi. Bioinformatic analysis revealed that this cluster does not contain the equisetin N-methyltransferase gene eqxD and consequently, trichosetin was isolated as final product. The adaption of the inducible, tetracycline-dependent Tet-on promoter system from Aspergillus niger achieved a controlled overproduction of this toxic metabolite and a functional characterization of each cluster gene in F. fujikuroi. Overexpression of one of the two cluster-specific transcription factor (TF) genes, TF22, led to an activation of the three biosynthetic cluster genes, including the PKS-NRPS key gene. In contrast, overexpression of TF23, encoding a second Zn(II)₂Cys₆ TF, did not activate adjacent cluster genes. Instead, TF23 was induced by the final product trichosetin and was required for expression of the transporter-encoding gene MFS-T. TF23 and MFS-T likely act in consort and contribute to detoxification of trichosetin and therefore, self-protection of the producing fungus.
Multiway real-time PCR gene expression profiling in yeast Saccharomyces cerevisiae reveals altered transcriptional response of ADH-genes to glucose stimuli.

PubMed

Ståhlberg, Anders; Elbing, Karin; Andrade-Garda, José Manuel; Sjögreen, Björn; Forootan, Amin; Kubista, Mikael

2008-04-16

The large sensitivity, high reproducibility and essentially unlimited dynamic range of real-time PCR to measure gene expression in complex samples provides the opportunity for powerful multivariate and multiway studies of biological phenomena. In multiway studies samples are characterized by their expression profiles to monitor changes over time, effect of treatment, drug dosage etc. Here we perform a multiway study of the temporal response of four yeast Saccharomyces cerevisiae strains with different glucose uptake rates upon altered metabolic conditions. We measured the expression of 18 genes as a function of time after addition of glucose to four strains of yeast grown in ethanol. The data are analyzed by matrix-augmented PCA, which is a generalization of PCA for 3-way data, and the results are confirmed by hierarchical clustering and clustering by Kohonen self-organizing map. Our approach identifies gene groups that respond similarly to the change of nutrient, and genes that behave differently in mutant strains. Of particular interest is our finding that ADH4 and ADH6 show a behavior typical of glucose-induced genes, while ADH3 and ADH5 are repressed after glucose addition. Multiway real-time PCR gene expression profiling is a powerful technique which can be utilized to characterize functions of new genes by, for example, comparing their temporal response after perturbation in different genetic variants of the studied subject. The technique also identifies genes that show perturbed expression in specific strains.
Multiway real-time PCR gene expression profiling in yeast Saccharomyces cerevisiae reveals altered transcriptional response of ADH-genes to glucose stimuli

PubMed Central

Ståhlberg, Anders; Elbing, Karin; Andrade-Garda, José Manuel; Sjögreen, Björn; Forootan, Amin; Kubista, Mikael

2008-01-01

Background The large sensitivity, high reproducibility and essentially unlimited dynamic range of real-time PCR to measure gene expression in complex samples provides the opportunity for powerful multivariate and multiway studies of biological phenomena. In multiway studies samples are characterized by their expression profiles to monitor changes over time, effect of treatment, drug dosage etc. Here we perform a multiway study of the temporal response of four yeast Saccharomyces cerevisiae strains with different glucose uptake rates upon altered metabolic conditions. Results We measured the expression of 18 genes as a function of time after addition of glucose to four strains of yeast grown in ethanol. The data are analyzed by matrix-augmented PCA, which is a generalization of PCA for 3-way data, and the results are confirmed by hierarchical clustering and clustering by Kohonen self-organizing map. Our approach identifies gene groups that respond similarly to the change of nutrient, and genes that behave differently in mutant strains. Of particular interest is our finding that ADH4 and ADH6 show a behavior typical of glucose-induced genes, while ADH3 and ADH5 are repressed after glucose addition. Conclusion Multiway real-time PCR gene expression profiling is a powerful technique which can be utilized to characterize functions of new genes by, for example, comparing their temporal response after perturbation in different genetic variants of the studied subject. The technique also identifies genes that show perturbed expression in specific strains. PMID:18412983
Accounting for noise when clustering biological data.

PubMed

Sloutsky, Roman; Jimenez, Nicolas; Swamidass, S Joshua; Naegle, Kristen M

2013-07-01

Clustering is a powerful and commonly used technique that organizes and elucidates the structure of biological data. Clustering data from gene expression, metabolomics and proteomics experiments has proven to be useful at deriving a variety of insights, such as the shared regulation or function of biochemical components within networks. However, experimental measurements of biological processes are subject to substantial noise, stemming from both technical and biological variability, and most clustering algorithms are sensitive to this noise. In this article, we explore several methods of accounting for noise when analyzing biological data sets through clustering. Using a toy data set and two different case studies (gene expression and protein phosphorylation), we demonstrate the sensitivity of clustering algorithms to noise. Several methods of accounting for this noise can be used to establish when clustering results can be trusted. These methods span a range of assumptions about the statistical properties of the noise and can therefore be applied to virtually any biological data source.
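The abstract does not name the specific methods it compares, so the sketch below shows just one generic way to check how far a clustering can be trusted under measurement noise: perturb the data repeatedly with noise of an assumed standard deviation, re-cluster each time, and record how often each pair of items lands in the same cluster. The one-dimensional two-means routine, the Gaussian noise model, the toy values and the function names are all assumptions made for illustration.

```python
import random

def two_means_1d(values, iters=10):
    """Two-cluster k-means on scalars, seeded at the minimum and maximum."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        if low:
            lo = sum(low) / len(low)
        if high:
            hi = sum(high) / len(high)
    return labels

def co_clustering(values, sd, n_rounds=200, seed=0):
    """Fraction of noisy re-clusterings in which items i and j share a cluster."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(values)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_rounds):
        noisy = [v + rng.gauss(0.0, sd) for v in values]
        labels = two_means_1d(noisy)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_rounds for c in row] for row in counts]

# Toy measurements: two tight groups plus one value sitting between them.
measurements = [0.0, 0.2, 10.0, 10.3, 5.1]
freq = co_clustering(measurements, sd=0.5)
```

Pairs inside a tight group should co-cluster in every round and pairs from opposite groups in none. The in-between value is the interesting case: its co-clustering fractions with either group indicate how little weight its assignment in any single clustering run deserves.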
A method to identify differential expression profiles of time-course gene data with Fourier transformation.

PubMed

Kim, Jaehee; Ogden, Robert Todd; Kim, Haseong

2013-10-18

Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis. This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization. The key elements of the proposed methodology: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles. Using this method, we identified a set of cell-cycle-regulated time-course yeast genes.
The proposed method is general and can potentially be used to identify genes which have the same patterns or biological processes, and to help face the present and forthcoming challenges of data analysis in functional genomics.
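Steps (i) and (ii) of the methodology listed in this abstract can be illustrated with a deliberately simplified sketch. It computes the first few discrete Fourier coefficients of each profile and keeps genes whose non-constant harmonics carry enough energy. The fixed energy cutoff is an assumption standing in for the authors' FDR-controlled, autocorrelation-aware test, and the model-based clustering of step (iii) is not shown. Profile names and values are invented.

```python
import cmath
import math

def fourier_coefficients(profile, n_harmonics):
    """Discrete Fourier coefficients for harmonics 1..n_harmonics (mean excluded)."""
    n = len(profile)
    coeffs = []
    for k in range(1, n_harmonics + 1):
        c = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(profile)
        ) / n
        coeffs.append(c)
    return coeffs

def harmonic_energy(profile, n_harmonics=2):
    """Sum of squared magnitudes of the retained Fourier coefficients."""
    return sum(abs(c) ** 2 for c in fourier_coefficients(profile, n_harmonics))

def screen(profiles, cutoff, n_harmonics=2):
    """Names of profiles whose harmonic energy exceeds the (assumed) cutoff."""
    return [
        name for name, p in profiles.items()
        if harmonic_energy(p, n_harmonics) > cutoff
    ]

# Toy time courses sampled at 8 equally spaced time points.
profiles = {
    "flat": [1.0] * 8,
    "cyclic": [math.sin(2 * math.pi * t / 8) for t in range(8)],
}
kept = screen(profiles, cutoff=0.05)
```

A constant profile has essentially no energy outside the mean term and is screened out, whereas a single sine cycle concentrates its energy (0.25 for unit amplitude) in the first harmonic and is kept. Working with a handful of coefficients rather than raw time points is what lets the later clustering step operate in the Fourier domain.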
Analysis of Temporal-spatial Co-variation within Gene Expression Microarray Data in an Organogenesis Model

NASA Astrophysics Data System (ADS)

Ehler, Martin; Rajapakse, Vinodh; Zeeberg, Barry; Brooks, Brian; Brown, Jacob; Czaja, Wojciech; Bonner, Robert F.

The gene networks underlying closure of the optic fissure during vertebrate eye development are poorly understood. We used a novel clustering method based on Laplacian Eigenmaps, a nonlinear dimension reduction method, to analyze microarray data from laser capture microdissected (LCM) cells at the site and developmental stages (days 10.5 to 12.5) of optic fissure closure. Our new method provided greater biological specificity than classical clustering algorithms in terms of identifying more biological processes and functions related to eye development as defined by Gene Ontology at lower false discovery rates. This new methodology builds on the advantages of LCM to isolate pure phenotypic populations within complex tissues and allows improved ability to identify critical gene products expressed at lower copy number. The combination of LCM of embryonic organs, gene expression microarrays, and extraction of spatial and temporal co-variations appears to be a powerful approach to understanding the gene regulatory networks that specify mammalian organogenesis.
Identification and characterization of Rhox13, a novel X-linked mouse homeobox gene

PubMed Central

Geyer, Christopher B.; Eddy, Edward M.

2008-01-01

Homeobox genes encode transcription factors whose expression organizes programs of development. A number of homeobox genes expressed in reproductive tissues have been identified recently, including a colinear cluster on the X chromosome in mice. This has led to an increased interest in understanding the role(s) of homeobox genes in regulating development of reproductive tissues including the testis, ovary, and placenta. Here we report the identification and characterization of a novel homeobox gene of the paired-like class on the X chromosome distal to the reproductive homeobox (Rhox) cluster in mice. Transcripts are found in the testis and ovary as early as 13.5 days post-coitum (dpc). Transcription ceases in the ovary by 3 days post-partum (dpp), but continues in the testis through adulthood. The Rhox13 gene encodes a 25.3 kDa protein expressed in the adult testis in germ cells at the basal aspect of the seminiferous epithelium. PMID:18675325
SMCHD1 regulates a limited set of gene clusters on autosomal chromosomes.

PubMed

Mason, Amanda G; Slieker, Roderick C; Balog, Judit; Lemmers, Richard J L F; Wong, Chao-Jen; Yao, Zizhen; Lim, Jong-Won; Filippova, Galina N; Ne, Enrico; Tawil, Rabi; Heijmans, Bas T; Tapscott, Stephen J; van der Maarel, Silvère M

2017-06-06

Facioscapulohumeral muscular dystrophy (FSHD) is in most cases caused by a contraction of the D4Z4 macrosatellite repeat on chromosome 4 (FSHD1) or by mutations in the SMCHD1 or DNMT3B gene (FSHD2). Both situations result in the incomplete epigenetic repression of the D4Z4-encoded retrogene DUX4 in somatic cells, leading to the aberrant expression of DUX4 in the skeletal muscle. In mice, Smchd1 regulates chromatin repression at different loci, having a role in CpG methylation establishment and/or maintenance. To investigate the global effects of harboring heterozygous SMCHD1 mutations on DNA methylation in humans, we combined 450k methylation analysis on mononuclear monocytes from female heterozygous SMCHD1 mutation carriers and unaffected controls with reduced representation bisulfite sequencing (RRBS) on FSHD2 and control myoblast cell lines. Candidate loci were then evaluated for SMCHD1 binding using ChIP-qPCR and expression was evaluated using RT-qPCR. We identified a limited number of clustered autosomal loci with CpG hypomethylation in SMCHD1 mutation carriers: the protocadherin (PCDH) cluster on chromosome 5, the transfer RNA (tRNA) and 5S rRNA clusters on chromosome 1, the HOXB and HOXD clusters on chromosomes 17 and 2, respectively, and the D4Z4 repeats on chromosomes 4 and 10. Furthermore, minor increases in RNA expression were seen in FSHD2 myoblasts for some of the PCDHβ cluster isoforms, tRNA isoforms, and a HOXB isoform in comparison to controls, in addition to the previously reported effects on DUX4 expression. SMCHD1 was bound at DNAseI hypersensitivity sites known to regulate the PCDHβ cluster and at the chromosome 1 tRNA cluster, with decreased binding in SMCHD1 mutation carriers at the PCDHβ cluster sites. Our study is the first to investigate the global methylation effects in humans resulting from heterozygous mutations in SMCHD1. 
Our results suggest that SMCHD1 acts as a repressor on a limited set of autosomal gene clusters, as an observed reduction in methylation associates with a loss of SMCHD1 binding and increased expression for some of the loci.
Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

2004-08-06

Background The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
Functional Organization of hsp70 Cluster in Camel (Camelus dromedarius) and Other Mammals

PubMed Central

Garbuz, David G.; Astakhova, Lubov N.; Zatsepina, Olga G.; Arkhipova, Irina R.; Nudler, Eugene; Evgen'ev, Michael B.

2011-01-01

Heat shock protein 70 (Hsp70) is a molecular chaperone providing tolerance to heat and other challenges at the cellular and organismal levels. We sequenced a genomic cluster containing three hsp70 family genes linked with major histocompatibility complex (MHC) class III region from an extremely heat tolerant animal, camel (Camelus dromedarius). Two hsp70 family genes comprising the cluster contain heat shock elements (HSEs), while the third gene lacks HSEs and should not be induced by heat shock. Comparison of the camel hsp70 cluster with the corresponding regions from several mammalian species revealed similar organization of genes forming the cluster. Specifically, the two heat inducible hsp70 genes are arranged in tandem, while the third constitutively expressed hsp70 family member is present in inverted orientation. Comparison of regulatory regions of hsp70 genes from camel and other mammals demonstrates that transcription factor matches with highest significance are located in the highly conserved 250-bp upstream region and correspond to HSEs followed by NF-Y and Sp1 binding sites. The high degree of sequence conservation leaves little room for putative camel-specific regulatory elements. Surprisingly, RT-PCR and 5′/3′-RACE analysis demonstrated that all three hsp70 genes are expressed in camel's muscle and blood cells not only after heat shock, but under normal physiological conditions as well, and may account for tolerance of camel cells to extreme environmental conditions. A high degree of evolutionary conservation observed for the hsp70 cluster always linked with MHC locus in mammals suggests an important role of such organization for coordinated functioning of these vital genes. PMID:22096537
Comparison of Ergot Alkaloid Biosynthesis Gene Clusters in Claviceps Species Indicates Loss of Late Pathway Steps in Evolution of C. fusiformis

PubMed Central

Lorenz, Nicole; Wilson, Ella V.; Machado, Caroline; Schardl, Christopher L.; Tudzynski, Paul

2007-01-01

The grass parasites Claviceps purpurea and Claviceps fusiformis produce ergot alkaloids (EA) in planta and in submerged culture. Whereas EA synthesis (EAS) in C. purpurea proceeds via clavine intermediates to lysergic acid and the complex ergopeptines, C. fusiformis produces only agroclavine and elymoclavine. In C. purpurea the EAS gene (EAS) cluster includes dmaW (encoding the first pathway step), cloA (elymoclavine oxidation to lysergic acid), and the lpsA/lpsB genes (ergopeptine formation). We analyzed the corresponding C. fusiformis EAS cluster to investigate the evolutionary basis for chemotypic differences between the Claviceps species. Other than three peptide synthetase genes (lpsC and the tandem paralogues lpsA1 and lpsA2), homologues of all C. purpurea EAS genes were identified in C. fusiformis, including homologues of lpsB and cloA, which in C. purpurea encode enzymes for steps after clavine synthesis. Rearrangement of the cluster was evident around lpsB, which is truncated in C. fusiformis. This and several frameshift mutations render CflpsB a pseudogene (CflpsBΨ). No obvious inactivating mutation was identified in CfcloA. All C. fusiformis EAS genes, including CflpsBΨ and CfcloA, were expressed in culture. Cross-complementation analyses demonstrated that CfcloA and CflpsBΨ were expressed in C. purpurea but did not encode functional enzymes. In contrast, CpcloA catalyzed lysergic acid biosynthesis in C. fusiformis, indicating that C. fusiformis terminates its EAS pathway at elymoclavine because the cloA gene product is inactive. We propose that the C. fusiformis EAS cluster evolved from a more complete cluster by loss of some lps genes and by rearrangements and mutations inactivating lpsB and cloA. PMID:17720822
Comparison of ergot alkaloid biosynthesis gene clusters in Claviceps species indicates loss of late pathway steps in evolution of C. fusiformis.

PubMed

Lorenz, Nicole; Wilson, Ella V; Machado, Caroline; Schardl, Christopher L; Tudzynski, Paul

2007-11-01

The grass parasites Claviceps purpurea and Claviceps fusiformis produce ergot alkaloids (EA) in planta and in submerged culture. Whereas EA synthesis (EAS) in C. purpurea proceeds via clavine intermediates to lysergic acid and the complex ergopeptines, C. fusiformis produces only agroclavine and elymoclavine. In C. purpurea the EAS gene (EAS) cluster includes dmaW (encoding the first pathway step), cloA (elymoclavine oxidation to lysergic acid), and the lpsA/lpsB genes (ergopeptine formation). We analyzed the corresponding C. fusiformis EAS cluster to investigate the evolutionary basis for chemotypic differences between the Claviceps species. Other than three peptide synthetase genes (lpsC and the tandem paralogues lpsA1 and lpsA2), homologues of all C. purpurea EAS genes were identified in C. fusiformis, including homologues of lpsB and cloA, which in C. purpurea encode enzymes for steps after clavine synthesis. Rearrangement of the cluster was evident around lpsB, which is truncated in C. fusiformis. This and several frameshift mutations render CflpsB a pseudogene (CflpsBΨ). No obvious inactivating mutation was identified in CfcloA. All C. fusiformis EAS genes, including CflpsBΨ and CfcloA, were expressed in culture. Cross-complementation analyses demonstrated that CfcloA and CflpsBΨ were expressed in C. purpurea but did not encode functional enzymes. In contrast, CpcloA catalyzed lysergic acid biosynthesis in C. fusiformis, indicating that C. fusiformis terminates its EAS pathway at elymoclavine because the cloA gene product is inactive. We propose that the C. fusiformis EAS cluster evolved from a more complete cluster by loss of some lps genes and by rearrangements and mutations inactivating lpsB and cloA.
Malignant pleural mesothelioma and mesothelial hyperplasia: A new molecular tool for the differential diagnosis.

PubMed

Bruno, Rossella; Alì, Greta; Giannini, Riccardo; Proietti, Agnese; Lucchi, Marco; Chella, Antonio; Melfi, Franca; Mussi, Alfredo; Fontanini, Gabriella

2017-01-10

Malignant pleural mesothelioma (MPM) is a rare asbestos related cancer, aggressive and unresponsive to therapies. Histological examination of pleural lesions is the gold standard of MPM diagnosis, although it is sometimes hard to discriminate the epithelioid type of MPM from benign mesothelial hyperplasia (MH). This work aims to define a new molecular tool for the differential diagnosis of MPM, using the expression profile of 117 genes deregulated in this tumour. The gene expression analysis was performed by nanoString System on tumour tissues from 36 epithelioid MPM and 17 MH patients, and on 14 mesothelial pleural samples analysed in a blind way. Data analysis included raw nanoString data normalization, unsupervised cluster analysis by Pearson correlation, non-parametric Mann-Whitney U-test and molecular classification by the Uncorrelated Shrunken Centroid (USC) Algorithm. The Mann-Whitney U-test found 35 genes upregulated and 31 downregulated in MPM. The unsupervised cluster analysis revealed two clusters, one composed only of MPM and one only of MH samples, thus revealing class-specific gene profiles. The Uncorrelated Shrunken Centroid algorithm identified two classifiers, one including 22 genes and the other 40 genes, able to properly classify all the samples as benign or malignant using gene expression data; both classifiers were also able to correctly determine, in a blind analysis, the diagnostic categories of all the 14 unknown samples. In conclusion we delineated a diagnostic tool combining molecular data (gene expression) and computational analysis (USC algorithm), which can be applied in the clinical practice for the differential diagnosis of MPM.
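The record above names the Mann-Whitney U-test as the per-gene comparison between MPM and MH samples. As a rough illustration only, the U statistic can be computed by counting pairwise wins between the two groups. The sketch below is not the authors' pipeline; the function name and the expression values are hypothetical, and no p-value is computed.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples; ties count as 0.5."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U statistics always sum to len(group_a) * len(group_b).
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical normalized expression values for a single gene (not study data).
mpm_values = [8.1, 7.9, 9.4, 8.8]
mh_values = [5.2, 6.0, 5.7]
u_mpm, u_mh = mann_whitney_u(mpm_values, mh_values)
```

When every value in one group exceeds every value in the other, as in this toy input, one U equals the product of the group sizes and the other is zero, which is the strongest separation the statistic can express.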
Chromatin organization and global regulation of Hox gene clusters

PubMed Central

Montavon, Thomas; Duboule, Denis

2013-01-01

During development, a properly coordinated expression of Hox genes within their different genomic clusters is critical for patterning the body plans of many animals with a bilateral symmetry. The fascinating correspondence between the topological organization of Hox clusters and their transcriptional activation in space and time has served as a paradigm for understanding the relationships between genome structure and function. Here, we review some recent observations, which revealed highly dynamic changes in the structure of chromatin at Hox clusters, in parallel with their activation during embryonic development. We discuss the relevance of these findings for our understanding of large-scale gene regulation. PMID:23650639
Microarray gene expression profiling using core biopsies of renal neoplasia.

PubMed

Rogers, Craig G; Ditlev, Jonathon A; Tan, Min-Han; Sugimura, Jun; Qian, Chao-Nan; Cooper, Jeff; Lane, Brian; Jewett, Michael A; Kahnoski, Richard J; Kort, Eric J; Teh, Bin T

2009-01-01

We investigate the feasibility of using microarray gene expression profiling technology to analyze core biopsies of renal tumors for classification of tumor histology. Core biopsies were obtained ex-vivo from 7 renal tumors, comprising four histological subtypes, following radical nephrectomy using 18-gauge biopsy needles. RNA was isolated from these samples and, in the case of biopsy samples, amplified by in vitro transcription. Microarray analysis was then used to quantify the mRNA expression patterns in these samples relative to non-diseased renal tissue mRNA. Genes with significant variation across all non-biopsy tumor samples were identified, and the relationship between tumor and biopsy samples in terms of expression levels of these genes was then quantified in terms of Euclidean distance, and visualized by complete linkage clustering. Final pathologic assessment of kidney tumors demonstrated clear cell renal cell carcinoma (4), oncocytoma (1), angiomyolipoma (1) and adrenocortical carcinoma (1). Five of the seven biopsy samples were most similar in terms of gene expression to the resected tumors from which they were derived in terms of Euclidean distance. All seven biopsies were assigned to the correct histological class by hierarchical clustering. We demonstrate the feasibility of gene expression profiling of core biopsies of renal tumors to classify tumor histology.
Microarray gene expression profiling using core biopsies of renal neoplasia

PubMed Central

Rogers, Craig G.; Ditlev, Jonathon A.; Tan, Min-Han; Sugimura, Jun; Qian, Chao-Nan; Cooper, Jeff; Lane, Brian; Jewett, Michael A.; Kahnoski, Richard J.; Kort, Eric J.; Teh, Bin T.

2009-01-01

We investigate the feasibility of using microarray gene expression profiling technology to analyze core biopsies of renal tumors for classification of tumor histology. Core biopsies were obtained ex-vivo from 7 renal tumors, comprising four histological subtypes, following radical nephrectomy using 18-gauge biopsy needles. RNA was isolated from these samples and, in the case of biopsy samples, amplified by in vitro transcription. Microarray analysis was then used to quantify the mRNA expression patterns in these samples relative to non-diseased renal tissue mRNA. Genes with significant variation across all non-biopsy tumor samples were identified, and the relationship between tumor and biopsy samples in terms of expression levels of these genes was then quantified in terms of Euclidean distance, and visualized by complete linkage clustering. Final pathologic assessment of kidney tumors demonstrated clear cell renal cell carcinoma (4), oncocytoma (1), angiomyolipoma (1) and adrenocortical carcinoma (1). Five of the seven biopsy samples were most similar in terms of gene expression to the resected tumors from which they were derived in terms of Euclidean distance. All seven biopsies were assigned to the correct histological class by hierarchical clustering. We demonstrate the feasibility of gene expression profiling of core biopsies of renal tumors to classify tumor histology. PMID:19966938
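The abstract above compares biopsy and tumor profiles by Euclidean distance and groups them by complete linkage. A minimal sketch of those two ingredients follows, assuming each sample is a short list of log-ratio expression values. All sample names and numbers are hypothetical and the code is illustrative rather than the authors' method.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def nearest_tumor(biopsy, tumors):
    """Name of the resected-tumor profile closest to the biopsy profile."""
    return min(tumors, key=lambda name: euclidean(biopsy, tumors[name]))


def complete_linkage(cluster_a, cluster_b):
    """Complete-linkage distance: the largest pairwise distance between clusters."""
    return max(euclidean(p, q) for p in cluster_a for q in cluster_b)


# Hypothetical log-ratio profiles over three genes (not study data).
tumors = {
    "clear_cell": [2.0, -1.0, 0.5],
    "oncocytoma": [-1.5, 0.8, 1.2],
}
biopsy = [1.7, -0.6, 0.4]
closest = nearest_tumor(biopsy, tumors)
```

Complete linkage merges clusters by their farthest members, so it tends to produce compact groups. That suits a setting where a biopsy should sit close to every sample of its histological class.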

The Putative C2H2 Transcription Factor MtfA Is a Novel Regulator of Secondary Metabolism and Morphogenesis in Aspergillus nidulans

PubMed Central

Ramamoorthy, Vellaisamy; Dhingra, Sourabh; Kincaid, Alexander; Shantappa, Sourabha; Feng, Xuehuan; Calvo, Ana M.

2013-01-01

Secondary metabolism in the model fungus Aspergillus nidulans is controlled by the conserved global regulator VeA, which also governs morphological differentiation. Among the secondary metabolites regulated by VeA is the mycotoxin sterigmatocystin (ST). The presence of VeA is necessary for the biosynthesis of this carcinogenic compound. We identified a revertant mutant able to synthesize ST intermediates in the absence of VeA. The point mutation occurred in the coding region of a gene encoding a novel putative C2H2 zinc finger domain transcription factor that we denominated mtfA. The A. nidulans mtfA gene product localizes to nuclei independently of the illumination regime. Deletion of the mtfA gene restores mycotoxin biosynthesis in the absence of veA, but mycotoxin production was drastically reduced when mtfA gene expression was altered, by deletion or overexpression, in A. nidulans strains with a veA wild-type allele. Our study revealed that mtfA regulates ST production by affecting the expression of the specific ST gene cluster activator aflR. Importantly, mtfA is also a regulator of other secondary metabolism gene clusters, such as genes responsible for the synthesis of terrequinone and penicillin. As in the case of ST, deletion or overexpression of mtfA was also detrimental for the expression of terrequinone genes. Deletion of mtfA also decreased the expression of the genes in the penicillin gene cluster, reducing penicillin production. However, in this case, over-expression of mtfA enhanced the transcription of penicillin genes, increasing penicillin production more than 5-fold with respect to the control. Importantly, in addition to its effect on secondary metabolism, mtfA also affects asexual and sexual development in A. nidulans. Deletion of mtfA results in a reduction of conidiation and sexual stage. We found mtfA putative orthologs conserved in other fungal species. PMID:24066102
Global Landscape of a Co-Expressed Gene Network in Barley and its Application to Gene Discovery in Triticeae Crops

PubMed Central

Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

2011-01-01

Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
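The barley record above builds a network from gene-to-gene correlations and splits it into modules. A simplified sketch of that idea follows. It substitutes a plain Pearson correlation for the weighted coefficient mentioned in the abstract and treats connected components as modules. The 0.9 threshold, the gene names, and the profiles are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_modules(profiles, threshold=0.9):
    """Link genes whose correlation meets the threshold; return connected components."""
    genes = sorted(profiles)
    neighbors = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                neighbors[g].add(h)
                neighbors[h].add(g)
    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        # Depth-first traversal collects every gene reachable from g.
        stack, module = [g], set()
        while stack:
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(neighbors[node] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical expression profiles across four conditions (not barley data).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 1.0, 3.5, 0.5],
    "geneD": [8.0, 2.5, 7.0, 1.0],
}
modules = coexpression_modules(profiles)
```

Connected components are the crudest possible module definition. A real analysis would more likely apply a dedicated graph-clustering step so that one weak bridging edge does not merge two otherwise distinct modules.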
Prediction of epigenetically regulated genes in breast cancer cell lines

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loss, Leandro A; Sadanandam, Anguraj; Durinck, Steffen

Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. Our two-step dimensionality reduction compressed 90% of the original data, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
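The pipeline described above quantifies the significance of methylation-expression associations through permutation analysis. The sketch below illustrates the general idea with an exact one-sided permutation test on a Pearson correlation. That statistic is a stand-in chosen for brevity, not the logistic-regression ranking the abstract reports, and the five-sample values are hypothetical. With 45 cell lines one would sample random permutations rather than enumerate them all.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def permutation_p_value(methylation, expression):
    """Exact one-sided p-value for a negative methylation-expression correlation.

    Enumerates every reordering of the expression values and counts those whose
    correlation is at least as negative as the observed one.
    """
    observed = pearson(methylation, expression)
    count = 0
    total = 0
    for shuffled in itertools.permutations(expression):
        total += 1
        # Small tolerance guards against floating-point noise in ties.
        if pearson(methylation, shuffled) <= observed + 1e-12:
            count += 1
    return observed, count / total


# Hypothetical methylation levels and expression values for five cell lines.
methylation = [0.9, 0.7, 0.5, 0.3, 0.1]
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
r, p = permutation_p_value(methylation, expression)
```

Because the observed ordering is itself one of the permutations, the p-value can never be zero. Its floor is one over the number of permutations examined.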
Genome-wide identification, characterization of sugar transporter genes in the silkworm Bombyx mori and role in Bombyx mori nucleopolyhedrovirus (BmNPV) infection.

PubMed

Govindaraj, Lekha; Gupta, Tania; Esvaran, Vijaya Gowri; Awasthi, Arvind Kumar; Ponnuvel, Kangayam M

2016-04-01

Sugar transporters play an essential role in controlling carbohydrate transport and are responsible for mediating the movement of sugars into cells. These genes exist as large multigene families within the insect genome. In insects, sugar transporters not only have a role in sugar transport, but may also act as receptors for virus entry. Genome-wide annotation of the silkworm Bombyx mori (B. mori) revealed that 100 putative sugar transporter (BmST) genes exist as a large multigene family; these were classified into 11 subfamilies through phylogenetic analysis. Chromosomes 27, 26 and 20 were found to possess the highest number of BmST paralogous genes, harboring 22, 7 and 6 genes, respectively. These genes occurred in clusters exhibiting the phenomenon of tandem gene duplication. The ovary, silk gland, hemocytes, midgut and Malpighian tubules were the different tissues/cells enriched with BmST gene expression. The BmST gene BGIBMGA001498 had the maximum number of EST transcripts (134) and was expressed exclusively in the Malpighian tubule. The expression of EST transcripts of the BmST clustered genes on chromosome 27 was distributed in various tissues such as testis, ovary, silk gland, Malpighian tubule, maxillary galea, prothoracic gland, epidermis, fat body and midgut. Three sugar transporter genes (BmST) were constitutively expressed in the susceptible race and were down regulated upon BmNPV infection at 12 h post infection (hpi). The expression pattern of these three genes was validated through real-time PCR in the midgut tissues at different time intervals from 0 to 30 hpi. In the susceptible B. mori race, sugar transporter genes were constitutively expressed, making the host succumb to viral infection.
Deletion of the miR-143/145 Cluster Leads to Hydronephrosis in Mice

PubMed Central

Medrano, Silvia; Sequeira-Lopez, Maria Luisa S.; Gomez, R. Ariel

2015-01-01

Obstructive nephropathy, the leading cause of kidney failure in children, can be anatomic or functional. The underlying causes of functional hydronephrosis are not well understood. miRNAs, which are small noncoding RNAs, regulate gene expression at the post-transcriptional level. We found that miR-145-5p, a member of the miR-143/145 cluster that is highly expressed in smooth muscle cells of the renal vasculature, was present in the pelvicalyceal system and the ureter. To evaluate whether the miR-143/145 cluster is involved in urinary tract function we performed morphologic, functional, and gene expression studies in mice carrying a whole-body deletion of miR-143/145. miR-143/145–deficient mice developed hydronephrosis, characterized by severe papillary atrophy and dilatation of the pelvicalyceal system without obvious physical obstruction. Moreover, mutant mice showed abnormal ureteral peristalsis. The number of ureter contractions was significantly higher in miR-143/145–deficient mice. Peristalsis was replaced by incomplete, short, and more frequent contractions that failed to completely propagate in a proximal-distal direction. Microarray analysis showed 108 differentially expressed genes in ureters of miR-143/145–deficient mice. Ninety genes were up-regulated and 18 genes were down-regulated, including genes with potential regulatory roles in smooth muscle contraction and extracellular matrix-receptor interaction. We show that miR-143/145 are important for the normal peristalsis of the ureter and report an association between the expression of these miRNAs and hydronephrosis. PMID:25307343
A comparative study of ripening among berries of the grape cluster reveals an altered transcriptional programme and enhanced ripening rate in delayed berries

PubMed Central

Gouthu, Satyanarayana; O’Neil, Shawn T.; Di, Yanming; Ansarolia, Mitra; Megraw, Molly; Deluc, Laurent G.

2014-01-01

Transcriptional studies in relation to fruit ripening generally aim to identify the transcriptional states associated with physiological ripening stages and the transcriptional changes between stages within the ripening programme. In non-climacteric fruits such as grape, not all ripening-related genes involved in this programme have been identified, mainly due to the lack of mutants for comparative transcriptomic studies. A feature in grape cluster ripening (Vitis vinifera cv. Pinot noir), where not all berries initiate ripening at the same time, was exploited to study their shifted ripening programmes in parallel. Berries that showed marked ripening state differences in a véraison-stage cluster (ripening onset) ultimately reached similar ripeness states toward maturity, indicating the flexibility of the ripening programme. The expression variance between these véraison-stage berry classes, where 11% of the genes were found to be differentially expressed, was reduced significantly toward maturity, resulting in the synchronization of their transcriptional states. Defined quantitative expression changes (transcriptional distances) not only existed between the véraison transitional stages, but also between the véraison to maturity stages, regardless of the berry class. It was observed that lagging berries complete their transcriptional programme in a shorter time through altered gene expressions and ripening-related hormone dynamics, and enhance the rate of physiological ripening progression. Finally, the reduction in expression variance of genes can identify new genes directly associated with ripening and also assess the relevance of gene activity to the phase of the ripening programme. PMID:25135520
Cis-Regulatory Variants Affect CHRNA5 mRNA Expression in Populations of African and European Ancestry

PubMed Central

Wang, Jen-Chyong; Spiegel, Noah; Bertelsen, Sarah; Le, Nhung; McKenna, Nicholas; Budde, John P.; Harari, Oscar; Kapoor, Manav; Brooks, Andrew; Hancock, Dana; Tischfield, Jay; Foroud, Tatiana; Bierut, Laura J.; Steinbach, Joe Henry; Edenberg, Howard J.; Traynor, Bryan J.; Goate, Alison M.

2013-01-01

Variants within the gene cluster encoding α3, α5, and β4 nicotinic receptor subunits are major risk factors for substance dependence. The strongest impact on risk is associated with variation in the CHRNA5 gene, where at least two mechanisms are at work: amino acid variation and altered mRNA expression levels. The risk allele of the non-synonymous variant (rs16969968; D398N) primarily occurs on the haplotype containing the low mRNA expression allele. In populations of European ancestry, there are approximately 50 highly correlated variants in the CHRNA5-CHRNA3-CHRNB4 gene cluster and the adjacent PSMA4 gene region that are associated with CHRNA5 mRNA levels. It is not clear which of these variants contribute to the changes in CHRNA5 transcript level. Because populations of African ancestry have reduced linkage disequilibrium among variants spanning this gene cluster, eQTL mapping in subjects of African ancestry could potentially aid in defining the functional variants that affect CHRNA5 mRNA levels. We performed quantitative allele specific gene expression using frontal cortices derived from 49 subjects of African ancestry and 111 subjects of European ancestry. This method measures allele-specific transcript levels in the same individual, which eliminates other biological variation that occurs when comparing expression levels between different samples. This analysis confirmed that substance dependence associated variants have a direct cis-regulatory effect on CHRNA5 transcript levels in human frontal cortices of African and European ancestry and identified 10 highly correlated variants, located in a 9 kb region, that are potential functional variants modifying CHRNA5 mRNA expression levels. PMID:24303001
Xenopus microRNA genes are predominantly located within introns and are differentially expressed in adult frog tissues via post-transcriptional regulation

PubMed Central

Tang, Guo-Qing; Maxwell, E. Stuart

2008-01-01

The amphibian Xenopus provides a model organism for investigating microRNA expression during vertebrate embryogenesis and development. Searching available Xenopus genome databases using known human pre-miRNAs as query sequences, more than 300 genes encoding 142 Xenopus tropicalis miRNAs were identified. Analysis of Xenopus tropicalis miRNA genes revealed a predominant positioning within introns of protein-coding and nonprotein-coding RNA Pol II-transcribed genes. MiRNA genes were also located in pre-mRNA exons and positioned intergenically between known protein-coding genes. Many miRNA species were found in multiple locations and in more than one genomic context. MiRNA genes were also clustered throughout the genome, indicating the potential for the cotranscription and coordinate expression of miRNAs located in a given cluster. Northern blot analysis confirmed the expression of many identified miRNAs in both X. tropicalis and X. laevis. Comparison of X. tropicalis and X. laevis blots revealed comparable expression profiles, although several miRNAs exhibited species-specific expression in different tissues. More detailed analysis revealed that for some miRNAs, the tissue-specific expression profile of the pri-miRNA precursor was distinctly different from that of the mature miRNA profile. Differential miRNA precursor processing in both the nucleus and cytoplasm was implicated in the observed tissue-specific differences. These observations indicated that post-transcriptional processing plays an important role in regulating miRNA expression in the amphibian Xenopus. PMID:18032731
Final technical report for award NO. DE-FG02-95ER20206

DOE Office of Scientific and Technical Information (OSTI.GOV)

James P. Shapleigh

2010-02-23

Initial work focused on the regulation of nitrite reductase, the defining reaction of denitrification as well as nitric oxide (NO) reductase. Expression of the genes encoding both proteins was controlled by NnrR. This regulator was shown to be responsive to NO. More recent work has shown NnrR function is also likely inhibited by oxygen. Therefore, it is this protein that sets the oxygen level at which nitrate respiration takes over from aerobic respiration. The gene encoding NO reductase appears to only require NnrR for expression. Expression of the gene encoding nitrite reductase is more complex. In addition to NnrR, a two-component sensor regulator complex termed PrrA and PrrB is also required for expression. These proteins are global regulators and serve to link denitrification with other bioenergetic processes in the cell. They also provide an additional layer of oxygen dependent regulation. The sequencing of the R. sphaeroides 2.4.3 genome allowed us to identify several other genes regulated by NnrR. Surprisingly, most of the genes were not essential for denitrification. Their high level of conservation in related denitrifiers suggests they do provide a selectable benefit to the bacterium, however. We also examined the role of nitrate reductase in contributing to denitrification in R. sphaeroides. Strain 2.4.3 is unusual in having two distinct, but related clusters of genes encoding nitrate reductase. One of these gene clusters is expressed under high oxygen conditions but is repressed, likely by PrrB-PrrA, under low oxygen conditions. The other cluster is expressed only under low oxygen conditions. This cluster expresses the nitrate reductase used during denitrification. The high oxygen expressed cluster encodes a protein used for redox homeostasis. Surprisingly, both clusters are fully expressed even in the absence of nitrate. During the course of this work we found that the type strain of R. sphaeroides, 2.4.1, is a partial denitrifier because it has the nitrate and NO reductases but lacks nitrite reductase. Like 2.4.3 it uses NnrR to regulate NO reductase. This unexpected arrangement suggested that it may use NO reductase to detoxify NO produced in its environment. Using a green fluorescent protein based reporter system we were able to demonstrate that NO produced by a denitrifier such as 2.4.3 can induce expression of NO reductase in 2.4.1. We then went on to show that the NO produced by denitrifiers can induce a stress response in other non-denitrifying bacteria. This suggests that the NO produced during denitrification will have a significant impact on the non-denitrifiers present in the surrounding environment. We also expanded our studies to include the denitrifier Agrobacterium tumefaciens. We demonstrated that the expression of the nitrite and NO reductase genes in this bacterium follows the same general scheme as in R. sphaeroides. We also were able to show that this bacterium would induce NO reductase in response to the NO produced by plants. Importantly, we were able to demonstrate that A. tumefaciens had difficulty transitioning from aerobic respiration to denitrification if the transition was sudden. This difficulty manifested as an accumulation of NO. In some conditions cells were slowly able to switch modes of respiration but in other cases NO accumulations seemed to kill the cells. The difficulty in transition appears to be due to an inability to produce enough energy once the oxygen has been completely consumed.
The resemblance and disparity of gene expression in dormant and non-dormant seeds and crown buds of leafy spurge (Euphorbia esula)

USDA-ARS's Scientific Manuscript database

Overlaps in transcriptome profiles between different phases of bud and seed dormancy have not been determined. Thus, we compared various phases of dormancy between seeds and buds to identify common genes and molecular processes. Cluster analysis of expression profiles for 201 selected genes indicate...
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, S.D.; Cooper, P.; Fung, J.

Genetic factors affecting post-natal γ-globin expression, a major modifier of the severity of both β-thalassemia and sickle cell anemia, have been difficult to study. This is especially so in mice, an organism lacking a globin gene with an expression pattern equivalent to that of human γ-globin. To model the human β-cluster in mice, with the goal of screening for loci affecting human γ-globin expression in vivo, we introduced a human β-globin cluster YAC transgene into the genome of FVB mice. The β-cluster contained a Greek hereditary persistence of fetal hemoglobin (HPFH) γ allele resulting in postnatal expression of human γ-globin in transgenic mice. The level of human γ-globin for various F1 hybrids derived from crosses between the FVB transgenics and other inbred mouse strains was assessed. The γ-globin level of the C3HeB/FVB transgenic mice was noted to be significantly elevated. To map genes affecting postnatal γ-globin expression, a 20 centiMorgan (cM) genome scan of a C3HeB/FVB transgenics × FVB backcross was performed, followed by high-resolution marker analysis of promising loci. From this analysis we mapped a locus within a 2.2 cM interval of mouse chromosome 1 at a LOD score of 4.2 that contributes 10.4% of variation in γ-globin expression level. Combining transgenic modeling of the human β-globin gene cluster with quantitative trait analysis, we have identified and mapped a murine locus that impacts on human γ-globin expression in vivo.
Transcriptional Regulatory Network Analysis of MYB Transcription Factor Family Genes in Rice.

PubMed

Smita, Shuchi; Katiyar, Amit; Chinnusamy, Viswanathan; Pandey, Dev M; Bansal, Kailash C

2015-01-01

MYB transcription factor (TF) is one of the largest TF families and regulates defense responses to various stresses, hormone signaling as well as many metabolic and developmental processes in plants. Understanding these regulatory hierarchies of gene expression networks in response to developmental and environmental cues is a major challenge due to the complex interactions between the genetic elements. Correlation analyses are useful to unravel co-regulated gene pairs governing biological process as well as identification of new candidate hub genes in response to these complex processes. High throughput expression profiling data are highly useful for construction of co-expression networks. In the present study, we utilized transcriptome data for comprehensive regulatory network studies of MYB TFs by "top-down" and "guide-gene" approaches. More than 50% of OsMYBs were strongly correlated under 50 experimental conditions with 51 hub genes via "top-down" approach. Further, clusters were identified using Markov Clustering (MCL). To maximize the clustering performance, parameter evaluation of the MCL inflation score (I) was performed in terms of enriched GO categories by measuring F-score. Comparison of co-expressed cluster and clads analyzed from phylogenetic analysis signifies their evolutionarily conserved co-regulatory role. We utilized compendium of known interaction and biological role with Gene Ontology enrichment analysis to hypothesize function of coexpressed OsMYBs. In the other part, the transcriptional regulatory network analysis by "guide-gene" approach revealed 40 putative targets of 26 OsMYB TF hubs with high correlation value utilizing 815 microarray data. The putative targets with MYB-binding cis-elements enrichment in their promoter region, functional co-occurrence as well as nuclear localization supports our finding. 
Specifically, the enrichment of MYB-binding regions involved in drought inducibility implies a regulatory role in the drought response of rice. Thus, the co-regulatory network analysis facilitated the identification of complex OsMYB regulatory networks and of candidate target regulon genes of selected guide MYB genes. The results contribute to candidate gene screening and provide experimentally testable hypotheses for potential regulatory MYB TFs and their targets under stress conditions.
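The Markov Clustering step mentioned in this abstract can be illustrated with a small, generic sketch. This is not the authors' pipeline: the toy graph, the function names (`adjacency_from_edges`, `mcl`), the default inflation value of 2.0, and the convergence tolerance are all assumptions chosen for illustration. The code follows the standard MCL recipe: add self-loops, column-normalize, then alternate expansion (matrix squaring) and inflation (element-wise power followed by renormalization) until the matrix stops changing.

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    adj = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1.0
    return adj


def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, inflation):
    """Inflation step: element-wise power, then renormalize the columns."""
    return normalize_columns([[x ** inflation for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iter=50, tol=1e-9):
    """Return clusters as sorted lists of node indices (illustrative sketch)."""
    n = len(adjacency)
    # Self-loops keep every column sum positive and stabilize the iteration.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        new = inflate(expand(m), inflation)
        delta = max(abs(new[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = new
        if delta < tol:
            break
    # Each row that still carries weight is an attractor; the columns it
    # reaches form one cluster. Identical member sets are merged by the set.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph (hypothetical): two triangles joined by a single bridging edge.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
toy_clusters = mcl(adjacency_from_edges(6, toy_edges), inflation=2.0)
print(toy_clusters)
```

A larger inflation value generally yields more and smaller clusters, which is why the abstract describes tuning it against an external criterion (there, an F-score over enriched GO categories).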
Plasticity of the Chemoreceptor Repertoire in Drosophila melanogaster

PubMed Central

Zhou, Shanshan; Stone, Eric A.; Mackay, Trudy F. C.; Anholt, Robert R. H.

2009-01-01

For most organisms, chemosensation is critical for survival and is mediated by large families of chemoreceptor proteins, whose expression must be tuned appropriately to changes in the chemical environment. We asked whether expression of chemoreceptor genes that are clustered in the genome would be regulated independently; whether expression of certain chemoreceptor genes would be especially sensitive to environmental changes; whether groups of chemoreceptor genes undergo coordinated expression; and how plastic the expression of chemoreceptor genes is with regard to sex, development, reproductive state, and social context. To answer these questions we used Drosophila melanogaster, because its chemosensory systems are well characterized and both the genotype and environment can be controlled precisely. Using customized cDNA microarrays, we showed that chemoreceptor genes that are clustered in the genome undergo independent transcriptional regulation at different developmental stages and between sexes. Expression of distinct subgroups of chemoreceptor genes is sensitive to reproductive state and social interactions. Furthermore, exposure of flies only to odor of the opposite sex results in altered transcript abundance of chemoreceptor genes. These genes are distinct from those that show transcriptional plasticity when flies are allowed physical contact with same or opposite sex members. We analyzed covariance in transcript abundance of chemosensory genes across all environmental conditions and found that they segregated into 20 relatively small, biologically relevant modules of highly correlated transcripts. This finely pixelated modular organization of the chemosensory subgenome enables fine tuning of the expression of the chemoreceptor repertoire in response to ecologically relevant environmental and physiological conditions. PMID:19816562
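As a rough illustration of how transcripts can be grouped into modules of highly correlated expression, the sketch below links any two genes whose Pearson correlation across conditions exceeds a cutoff and reports the connected groups. The abstract does not state which module-detection method was used, so this thresholding approach, the 0.8 cutoff, the gene names, and the expression values are all assumptions for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_modules(profiles, threshold=0.8):
    """Group genes into modules: connected components of the graph whose
    edges join gene pairs with |r| >= threshold."""
    genes = sorted(profiles)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if abs(pearson(profiles[a], profiles[b])) >= threshold:
                parent[find(a)] = find(b)

    modules = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(modules.values())


# Hypothetical transcript abundances across five conditions.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 11.0],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneD": [10.0, 2.0, 8.0, 4.0, 6.5],
}
toy_modules = correlation_modules(toy_profiles, threshold=0.8)
print(toy_modules)  # [['geneA', 'geneB'], ['geneC', 'geneD']]
```

Single-linkage grouping of this kind is sensitive to the cutoff: one strong correlation is enough to merge two otherwise separate groups, so real analyses usually check module stability across several thresholds.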
The gsdf gene locus harbors evolutionary conserved and clustered genes preferentially expressed in fish previtellogenic oocytes.

PubMed

Gautier, Aude; Le Gac, Florence; Lareyre, Jean-Jacques

2011-02-01

The gonadal soma-derived factor (GSDF) belongs to the transforming growth factor-β superfamily and is conserved in teleostean fish species. Gsdf is specifically expressed in the gonads, and gene expression is restricted to the granulosa and Sertoli cells in trout and medaka. The gsdf gene expression is correlated to early testis differentiation in medaka and was shown to stimulate primordial germ cell and spermatogonia proliferation in trout. In the present study, we show that the gsdf gene localizes to a syntenic chromosomal fragment conserved among vertebrates although no gsdf-related gene is detected on the corresponding genomic region in tetrapods. We demonstrate using quantitative RT-PCR that most of the genes localized in the synteny are specifically expressed in medaka gonads. Gsdf is the only gene of the synteny with a much higher expression in the testis compared to the ovary. In contrast, gene expression pattern analysis of the gsdf surrounding genes (nup54, aff1, klhl8, sdad1, and ptpn13) indicates that these genes are preferentially expressed in the female gonads. The tissue distribution of these genes is highly similar in medaka and zebrafish, two teleostean species that have diverged more than 110 million years ago. The cellular localization of these genes was determined in medaka gonads using the whole-mount in situ hybridization technique. We confirm that gsdf gene expression is restricted to Sertoli and granulosa cells in contact with the premeiotic and meiotic cells. The nup54 gene is expressed in spermatocytes and previtellogenic oocytes. Transcripts corresponding to the ovary-specific genes (aff1, klhl8, and sdad1) are detected only in previtellogenic oocytes. No expression was detected in the gonocytes in 10 dpf embryos. In conclusion, we show that the gsdf gene localizes to a syntenic chromosomal fragment harboring evolutionary conserved genes in vertebrates. 
These genes are preferentially expressed in previtellogenic oocytes, and thus, they display a different cellular localization compared to that of the gsdf gene, indicating that the latter gene is not co-regulated. Interestingly, our study identifies new clustered genes that are specifically expressed in previtellogenic oocytes (nup54, aff1, klhl8, sdad1). Copyright © 2010 Elsevier B.V. All rights reserved.
Single-cell RNA sequencing reveals gene expression signatures of breast cancer-associated endothelial cells.

PubMed

Sun, Zhengda; Wang, Chih-Yang; Lawson, Devon A; Kwek, Serena; Velozo, Hugo Gonzalez; Owyong, Mark; Lai, Ming-Derg; Fong, Lawrence; Wilson, Mark; Su, Hua; Werb, Zena; Cooke, Daniel L

2018-02-16

Tumor endothelial cells (TEC) play an indispensable role in tumor growth and metastasis, although much of the detailed mechanism still remains elusive. In this study we characterized and compared the global gene expression profiles of TECs and control ECs isolated from human breast cancerous tissues and reduction mammoplasty tissues, respectively, by single cell RNA sequencing (scRNA-seq). Based on the qualified scRNA-seq libraries that we made, we found that 1302 genes were differentially expressed between these two EC phenotypes. Both principal component analysis (PCA) and heat map-based hierarchical clustering separated the cancerous versus control ECs as two distinctive clusters, and MetaCore disease biomarker analysis indicated that these differentially expressed genes are highly correlated with breast neoplasm diseases. Gene Set Enrichment Analysis software (GSEA) enriched these genes to extracellular matrix (ECM) signal pathways and highlighted 127 ECM-associated genes. External validation verified that some of these ECM-associated genes are not only generally overexpressed in various cancer tissues but also specifically overexpressed in colorectal cancer ECs and lymphoma ECs. In conclusion, our data demonstrated that ECM-associated genes play pivotal roles in breast cancer EC biology and some of them could serve as potential TEC biomarkers for various cancers.
Gene network analysis identifies rumen epithelial cell proliferation, differentiation and metabolic pathways perturbed by diet and correlated with methane production

PubMed Central

Xiang, Ruidong; McNally, Jody; Rowe, Suzanne; Jonker, Arjan; Pinares-Patino, Cesar S.; Oddy, V. Hutton; Vercoe, Phil E.; McEwan, John C.; Dalrymple, Brian P.

2016-01-01

Ruminants obtain nutrients from microbial fermentation of plant material, primarily in their rumen, a multilayered forestomach. How the different layers of the rumen wall respond to diet and influence microbial fermentation, and how these processes are regulated, is not well understood. Gene expression correlation networks were constructed from full thickness rumen wall transcriptomes of 24 sheep fed two different amounts and qualities of a forage and measured for methane production. The network contained two major negatively correlated gene sub-networks predominantly representing the epithelial and muscle layers of the rumen wall. Within the epithelium sub-network gene clusters representing lipid/oxo-acid metabolism, general metabolism and proliferating and differentiating cells were identified. The expression of cell cycle and metabolic genes was positively correlated with dry matter intake, ruminal short chain fatty acid concentrations and methane production. A weak correlation between lipid/oxo-acid metabolism genes and methane yield was observed. Feed consumption level explained the majority of gene expression variation, particularly for the cell cycle genes. Many known stratified epithelium transcription factors had significantly enriched targets in the epithelial gene clusters. The expression patterns of the transcription factors and their targets in proliferating and differentiating skin are mirrored in the rumen, suggesting conservation of regulatory systems. PMID:27966600
Function and Regulation of the Formate Dehydrogenase Genes of the Methanogenic Archaeon Methanococcus maripaludis

PubMed Central

Wood, Gwendolyn E.; Haydock, Andrew K.; Leigh, John A.

2003-01-01

Methanococcus maripaludis is a mesophilic species of Archaea capable of producing methane from two substrates: hydrogen plus carbon dioxide and formate. To study the latter, we identified the formate dehydrogenase genes of M. maripaludis and found that the genome contains two gene clusters important for formate utilization. Phylogenetic analysis suggested that the two formate dehydrogenase gene sets arose from duplication events within the methanococcal lineage. The first gene cluster encodes homologs of formate dehydrogenase α (FdhA) and β (FdhB) subunits and a putative formate transporter (FdhC) as well as a carbonic anhydrase analog. The second gene cluster encodes only FdhA and FdhB homologs. Mutants lacking either fdhA gene exhibited a partial growth defect on formate, whereas a double mutant was completely unable to grow on formate as a sole methanogenic substrate. Investigation of fdh gene expression revealed that transcription of both gene clusters is controlled by the presence of H2 and not by the presence of formate. PMID:12670979
Tissue-specific promoter utilisation of the kallikrein-related peptidase genes, KLK5 and KLK7, and cellular localisation of the encoded proteins suggest roles in exocrine pancreatic function.

PubMed

Dong, Ying; Matigian, Nick; Harvey, Tracey J; Samaratunga, Hemamali; Hooper, John D; Clements, Judith A

2008-02-01

Tissue kallikrein (kallikrein 1) was first identified in pancreas and is the namesake of the kallikrein-related peptidase (KLK) family. KLK1 and the other 14 members of the human KLK family are encoded by 15 serine protease genes clustered at chromosome 19q13.4. Our Northern blot analysis of 19 normal human tissues for expression of KLK4 to KLK15 identified pancreas as a common expression site for the gene cluster spanning KLK5 to KLK13, as well as for KLK15 which is located adjacent to KLK1. Consistent with previous reports detailing the ability of KLK genes to generate organ- and disease-specific transcripts, detailed molecular and in silico analyses indicated that KLK5 and KLK7 generate transcripts in pancreas variant from those in skin or ovary. Consistently, we identified in the promoters of these KLK genes motifs which conform with consensus binding sites for transcription factors conferring pancreatic expression. In addition, immunohistochemical analysis revealed predominant localisation of KLK5 and KLK7 in acinar cells of the exocrine pancreas, suggesting roles for these enzymes in digestion. Our data also support expression patterns derived from gene duplication events in the human KLK cluster. These findings suggest that, in addition to KLK1, other related KLK enzymes will function in the exocrine pancreas.
Genome Engineering and Modification Toward Synthetic Biology for the Production of Antibiotics.

PubMed

Zou, Xuan; Wang, Lianrong; Li, Zhiqiang; Luo, Jie; Wang, Yunfu; Deng, Zixin; Du, Shiming; Chen, Shi

2018-01-01

Antibiotic production is often governed by large gene clusters composed of genes related to antibiotic scaffold synthesis, tailoring, regulation, and resistance. With the expansion of genome sequencing, a considerable number of antibiotic gene clusters have been isolated and characterized. Emerging genome engineering techniques make more efficient engineering of antibiotics possible. In addition to genomic editing, multiple synthetic biology approaches have been developed for the exploration and improvement of antibiotic natural products. Here, we review the progress in the development of these genome editing techniques used to engineer new antibiotics, focusing on three aspects of genome engineering: direct cloning of large genomic fragments, genome engineering of gene clusters, and regulation of gene cluster expression. This review will not only summarize the current uses of genomic engineering techniques for cloning and assembly of antibiotic gene clusters or for altering antibiotic synthetic pathways but will also provide perspectives on the future directions of rebuilding biological systems for the design of novel antibiotics. © 2017 Wiley Periodicals, Inc.
Novel clustering of items from the Autism Diagnostic Interview-Revised to define phenotypes within autism spectrum disorders

PubMed Central

Hu, Valerie W.; Steinberg, Mara E.

2009-01-01

Heterogeneity in phenotypic presentation of ASD has been cited as one explanation for the difficulty in pinpointing specific genes involved in autism. Recent studies have attempted to reduce the “noise” in genetic and other biological data by reducing the phenotypic heterogeneity of the sample population. The current study employs multiple clustering algorithms on 123 item scores from the Autism Diagnostic Interview-Revised (ADI-R) diagnostic instrument of nearly 2000 autistic individuals to identify subgroups of autistic probands with clinically relevant behavioral phenotypes in order to isolate more homogeneous groups of subjects for gene expression analyses. Our combined cluster analyses suggest optimal division of the autistic probands into 4 phenotypic clusters based on similarity of symptom severity across the 123 selected item scores. One cluster is characterized by severe language deficits, while another exhibits milder symptoms across the domains. A third group possesses a higher frequency of savant skills while the fourth group exhibited intermediate severity across all domains. Grouping autistic individuals by multivariate cluster analysis of ADI-R scores reveals meaningful phenotypes of subgroups within the autistic spectrum which we show, in a related (accompanying) study, to be associated with distinct gene expression profiles. PMID:19455643

Expressed Sequence Tag Analysis of the Human Pathogen Paracoccidioides brasiliensis Yeast Phase: Identification of Putative Homologues of Candida albicans Virulence and Pathogenicity Genes

PubMed Central

Goldman, Gustavo H.; dos Reis Marques, Everaldo; Custódio Duarte Ribeiro, Diógenes; Ângelo de Souza Bernardes, Luciano; Quiapin, Andréa Carla; Vitorelli, Patrícia Marostica; Savoldi, Marcela; Semighini, Camile P.; de Oliveira, Regina C.; Nunes, Luiz R.; Travassos, Luiz R.; Puccia, Rosana; Batista, Wagner L.; Ferreira, Leslie Ecker; Moreira, Júlio C.; Bogossian, Ana Paula; Tekaia, Fredj; Nobrega, Marina Pasetto; Nobrega, Francisco G.; Goldman, Maria Helena S.

2003-01-01

Paracoccidioides brasiliensis, a thermodimorphic fungus, is the causative agent of the prevalent systemic mycosis in Latin America, paracoccidioidomycosis. We present here a survey of expressed genes in the yeast pathogenic phase of P. brasiliensis. We obtained 13,490 expressed sequence tags from both 5′ and 3′ ends. Clustering analysis yielded the partial sequences of 4,692 expressed genes that were functionally classified by similarity to known genes. We have identified several Candida albicans virulence and pathogenicity homologues in P. brasiliensis. Furthermore, we have analyzed the expression of some of these genes during the dimorphic yeast-mycelium-yeast transition by real-time quantitative reverse transcription-PCR. Clustering analysis of the mycelium-yeast transition revealed three groups: (i) RBT, hydrophobin, and isocitrate lyase; (ii) malate dehydrogenase, contigs Pb1067 and Pb1145, GPI, and alternative oxidase; and (iii) ubiquitin, delta-9-desaturase, HSP70, HSP82, and HSP104. The first two groups displayed high mRNA expression in the mycelial phase, whereas the third group showed higher mRNA expression in the yeast phase. Our results suggest the possible conservation of pathogenicity and virulence mechanisms among fungi, expand considerably gene identification in P. brasiliensis, and provide a broader basis for further progress in understanding its biological peculiarities. PMID:12582121
Porcine Tissue-Specific Regulatory Networks Derived from Meta-Analysis of the Transcriptome

PubMed Central

Pérez-Montarelo, Dafne; Hudson, Nicholas J.; Fernández, Ana I.; Ramayo-Caldas, Yuliaxis; Dalrymple, Brian P.; Reverter, Antonio

2012-01-01

The processes that drive tissue identity and differentiation remain unclear for most tissue types. So are the gene networks and transcription factors (TF) responsible for the differential structure and function of each particular tissue, and this is particularly true for non-model species with incomplete genomic resources. To better understand the regulation of genes responsible for tissue identity in pigs, we have inferred regulatory networks from a meta-analysis of 20 gene expression studies spanning 480 Porcine Affymetrix chips for 134 experimental conditions on 27 distinct tissues. We developed a mixed-model normalization approach with a covariance structure that accommodated the disparity in the origin of the individual studies, and obtained the normalized expression of 12,320 genes across the 27 tissues. Using this resource, we constructed a network, based on the co-expression patterns of 1,072 TF and 1,232 tissue specific genes. The resulting network is consistent with the known biology of tissue development. Within the network, genes clustered by tissue and tissues clustered by site of embryonic origin. These clusters were significantly enriched for genes annotated in key relevant biological processes and confirm gene functions and interactions from the literature. We implemented a Regulatory Impact Factor (RIF) metric to identify the key regulators in skeletal muscle and tissues from the central nervous system. The normalization of the meta-analysis, the inference of the gene co-expression network and the RIF metric operated synergistically towards a successful search for tissue-specific regulators. Novel among these findings is evidence suggesting a key role of ERCC3 as a muscle regulator. Together, our results recapitulate the known biology behind tissue specificity and provide new valuable insights in a less studied but valuable model species. PMID:23049964
An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions.

PubMed

van Haaften, Rachel I M; Luceri, Cristina; van Erk, Arie; Evelo, Chris T A

2009-06-01

Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work pointed out the need for extensive bioinformatics analyses for array quality assessment before and after gene expression clustering and pathway analysis. A study focused on the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact problem that was solved with an adapted normalisation. We propose a possible point-to-point standard analysis procedure, based on a combination of clustering and data visualization, for the analysis of microarray data.
The Role of Vitamin D in the Transcriptional Program of Human Pregnancy

PubMed Central

Al-Garawi, Amal; Carey, Vincent J.; Chhabra, Divya; Morrow, Jarrett; Lasky-Su, Jessica; Qiu, Weiliang; Laranjo, Nancy; Litonjua, Augusto A.; Weiss, Scott T.

2016-01-01

Background Patterns of gene expression of human pregnancy are poorly understood. In a trial of vitamin D supplementation in pregnant women, peripheral blood transcriptomes were measured longitudinally on 30 women and used to characterize gene co-expression networks. Objective Studies suggest that increased maternal Vitamin D levels may reduce the risk of asthma in early life, yet the underlying mechanisms have not been examined. In this study, we used a network-based approach to examine changes in gene expression profiles during the course of normal pregnancy and evaluated their association with maternal Vitamin D levels. Design The VDAART study is a randomized clinical trial of vitamin D supplementation in pregnancy for reduction of pediatric asthma risk. The trial enrolled 881 women at 10–18 weeks of gestation. Longitudinal gene expression measures were obtained on thirty pregnant women, using RNA isolated from peripheral blood samples obtained in the first and third trimesters. Differentially expressed genes were identified using significance analysis of microarrays (SAM), and clustered using a weighted gene co-expression network analysis (WGCNA). Gene-set enrichment was performed to identify major biological pathways. Results Comparison of transcriptional profiles between first and third trimesters of pregnancy identified 5839 significantly differentially expressed genes (FDR<0.05). Weighted gene co-expression network analysis clustered these transcripts into 14 co-expression modules of which two showed significant correlation with maternal vitamin D levels. Pathway analysis of these two modules revealed genes enriched in immune defense pathways and extracellular matrix reorganization as well as genes enriched in notch signaling and transcription factor networks. Conclusion Our data show that gene expression profiles of healthy pregnant women change during the course of pregnancy and suggest that maternal Vitamin D levels influence transcriptional profiles. 
These alterations of the maternal transcriptome may contribute to fetal immune imprinting and reduce allergic sensitization in early life. Trial Registration clinicaltrials.gov NCT00920621 PMID:27711190
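The FDR<0.05 cutoff quoted in this abstract refers to false discovery rate control. SAM estimates its FDR by permutation, which is not reproduced here; as a simpler stand-in, the sketch below shows the Benjamini-Hochberg adjustment, a different and widely used FDR procedure. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    The adjusted value for the p-value of rank r (1 = smallest) among m tests
    is the minimum of p * m / r over that rank and all larger ranks.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum enforces
    # monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-gene p-values from a differential expression test.
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_adjusted = benjamini_hochberg(toy_pvalues)
toy_significant = [i for i, q in enumerate(toy_adjusted) if q < 0.05]
print(toy_adjusted)
print(toy_significant)  # indices of genes that pass FDR < 0.05
```

In this toy input the third and fourth p-values are below 0.05 before adjustment but not after it, which is the kind of correction that matters when thousands of genes are tested at once.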
Genome-Wide Analysis of NBS-LRR Genes in Sorghum Genome Revealed Several Events Contributing to NBS-LRR Gene Evolution in Grass Species

PubMed Central

Yang, Xiping; Wang, Jianping

2016-01-01

The nucleotide-binding site (NBS)–leucine-rich repeat (LRR) gene family is crucially important for offering resistance to pathogens. To explore evolutionary conservation and variability of NBS-LRR genes across grass species, we identified 88, 107, 24, and 44 full-length NBS-LRR genes in sorghum, rice, maize, and Brachypodium, respectively. A comprehensive analysis was performed on classification, genome organization, evolution, expression, and regulation of these NBS-LRR genes using sorghum as a representative of grass species. In general, the full-length NBS-LRR genes are highly clustered and duplicated in sorghum genome mainly due to local duplications. NBS-LRR genes have basal expression levels and are highly potentially targeted by miRNA. The number of NBS-LRR genes in the four grass species is positively correlated with the gene clustering rate. The results provided a valuable genomic resource and insights for functional and evolutionary studies of NBS-LRR genes in grass species. PMID:26792976
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

DOE PAGES

Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; ...

2017-04-01

Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we will need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

PubMed Central

Zhang, Peifen; Kim, Taehyong; Banf, Michael; Chavali, Arvind K.; Nilo-Poyanco, Ricardo; Bernard, Thomas

2017-01-01

Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. PMID:28228535
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants.

PubMed

Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; Kim, Taehyong; Banf, Michael; Chae, Lee; Dreher, Kate; Chavali, Arvind K; Nilo-Poyanco, Ricardo; Bernard, Thomas; Kahn, Daniel; Rhee, Seung Y

2017-04-01

Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. © 2017 American Society of Plant Biologists. All Rights Reserved.
The Epipolythiodiketopiperazine Gene Cluster in Claviceps purpurea: Dysfunctional Cytochrome P450 Enzyme Prevents Formation of the Previously Unknown Clapurines.

PubMed

Dopstadt, Julian; Neubauer, Lisa; Tudzynski, Paul; Humpf, Hans-Ulrich

2016-01-01

Claviceps purpurea is an important food contaminant and well known for the production of the toxic ergot alkaloids. Apart from that, little is known about its secondary metabolism and not all toxic substances going along with the food contamination with Claviceps are known yet. We explored the metabolite profile of a gene cluster in C. purpurea with a high homology to gene clusters, which are responsible for the formation of epipolythiodiketopiperazine (ETP) toxins in other fungi. By overexpressing the transcription factor, we were able to activate the cluster in the standard C. purpurea strain 20.1. Although all necessary genes for the formation of the characteristic disulfide bridge were expressed in the overexpression mutants, the fungus did not produce any ETPs. Isolation of pathway intermediates showed that the common biosynthetic pathway stops after the first steps. Our results demonstrate that hydroxylation of the diketopiperazine backbone is the critical step during the ETP biosynthesis. Due to a dysfunctional enzyme, the fungus is not able to produce toxic ETPs. Instead, the pathway end-products are new unusual metabolites with a unique nitrogen-sulfur bond. By heterologous expression of the Leptosphaeria maculans cytochrome P450 encoding gene sirC, we were able to identify the end-products of the ETP cluster in C. purpurea. The thioclapurines are so far unknown ETPs, which might contribute to the toxicity of other C. purpurea strains with a potentially intact ETP cluster.
The Epipolythiodiketopiperazine Gene Cluster in Claviceps purpurea: Dysfunctional Cytochrome P450 Enzyme Prevents Formation of the Previously Unknown Clapurines

PubMed Central

Tudzynski, Paul; Humpf, Hans-Ulrich

2016-01-01

Claviceps purpurea is an important food contaminant and well known for the production of the toxic ergot alkaloids. Apart from that, little is known about its secondary metabolism and not all toxic substances going along with the food contamination with Claviceps are known yet. We explored the metabolite profile of a gene cluster in C. purpurea with a high homology to gene clusters, which are responsible for the formation of epipolythiodiketopiperazine (ETP) toxins in other fungi. By overexpressing the transcription factor, we were able to activate the cluster in the standard C. purpurea strain 20.1. Although all necessary genes for the formation of the characteristic disulfide bridge were expressed in the overexpression mutants, the fungus did not produce any ETPs. Isolation of pathway intermediates showed that the common biosynthetic pathway stops after the first steps. Our results demonstrate that hydroxylation of the diketopiperazine backbone is the critical step during the ETP biosynthesis. Due to a dysfunctional enzyme, the fungus is not able to produce toxic ETPs. Instead, the pathway end-products are new unusual metabolites with a unique nitrogen-sulfur bond. By heterologous expression of the Leptosphaeria maculans cytochrome P450 encoding gene sirC, we were able to identify the end-products of the ETP cluster in C. purpurea. The thioclapurines are so far unknown ETPs, which might contribute to the toxicity of other C. purpurea strains with a potentially intact ETP cluster. PMID:27390873
Analysis of gene expression levels in individual bacterial cells without image segmentation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kwak, In Hae; Son, Minjun; Hagen, Stephen J., E-mail: sjhagen@ufl.edu

2012-05-11

Highlights: We present a method for extracting gene expression data from images of bacterial cells. The method does not employ cell segmentation and does not require high magnification. Fluorescence and phase contrast images of the cells are correlated through the physics of phase contrast. We demonstrate the method by characterizing noisy expression of comX in Streptococcus mutans. Abstract: Studies of stochasticity in gene expression typically make use of fluorescent protein reporters, which permit the measurement of expression levels within individual cells by fluorescence microscopy. Analysis of such microscopy images is almost invariably based on a segmentation algorithm, where the image of a cell or cluster is analyzed mathematically to delineate individual cell boundaries. However, segmentation can be ineffective for studying bacterial cells or clusters, especially at lower magnification, where outlines of individual cells are poorly resolved. Here we demonstrate an alternative method for analyzing such images without segmentation. The method employs a comparison between the pixel brightness in phase contrast vs fluorescence microscopy images. By fitting the correlation between phase contrast and fluorescence intensity to a physical model, we obtain well-defined estimates for the different levels of gene expression that are present in the cell or cluster. The method reveals the boundaries of the individual cells, even if the source images lack the resolution to show these boundaries clearly.
Differential Gene Expression in Normal Human Mammary Epithelial Cells Treated with Malathion Monitored by DNA Microarrays

PubMed Central

Gwinn, Maureen R.; Whipkey, Diana L.; Tennant, Lora B.; Weston, Ainsley

2005-01-01

Organophosphate pesticides are a major source of occupational exposure in the United States. Moreover, malathion has been sprayed over major urban populations in an effort to control mosquitoes carrying West Nile virus. Previous research, reviewed by the U.S. Environmental Protection Agency, on the genotoxicity and carcinogenicity of malathion has been inconclusive, although malathion is a known endocrine disruptor. Here, interindividual variations and commonality of gene expression signatures have been studied in normal human mammary epithelial cells from four women undergoing reduction mammoplasty. The cell strains were obtained from the discarded tissues through the Cooperative Human Tissue Network (sponsors: National Cancer Institute and National Disease Research Interchange). Interindividual variation of gene expression patterns in response to malathion was observed in various clustering patterns for the four cell strains. Further clustering identified three genes with increased expression after treatment in all four cell strains. These genes were two aldo–keto reductases (AKR1C1 and AKR1C2) and an estrogen-responsive gene (EBBP). Decreased expression of six RNA species was seen at various time points in all cell strains analyzed: plasminogen activator (PLAT), centromere protein F (CPF), replication factor C (RFC3), thymidylate synthetase (TYMS), a putative mitotic checkpoint kinase (BUB1), and a gene of unknown function (GenBank accession no. AI859865). Expression changes in all these genes, detected by DNA microarrays, have been verified by real-time polymerase chain reaction. Differential changes in expression of these genes may yield biomarkers that provide insight into interindividual variation in malathion toxicity. PMID:16079077
Transcriptome analyses of the Giardia lamblia life cycle

PubMed Central

Birkeland, Shanda R.; Preheim, Sarah P.; Davids, Barbara J.; Cipriano, Michael J.; Palm, Daniel; Reiner, David S.; Svärd, Staffan G.; Gillin, Frances D.; McArthur, Andrew G.

2010-01-01

We quantified mRNA abundance from 10 stages in the Giardia lamblia life cycle in vitro using Serial Analysis of Gene Expression (SAGE). 163 abundant transcripts were expressed constitutively. 71 transcripts were upregulated specifically during excystation and 42 during encystation. Nonetheless, the transcriptomes of cysts and trophozoites showed major differences. SAGE detected co-expressed clusters of 284 transcripts differentially expressed in cysts and excyzoites and 287 transcripts in vegetative trophozoites and encysting cells. All clusters included known genes and pathways as well as proteins unique to Giardia or diplomonads. SAGE analysis of the Giardia life cycle identified a number of kinases, phosphatases, and DNA replication proteins involved in excystation and encystation, which could be important for examining the roles of cell signaling in giardial differentiation. Overall, these data pave the way for directed gene discovery and a better understanding of the biology of Giardia lamblia. PMID:20570699
Identification of the Main Regulator Responsible for Synthesis of the Typical Yellow Pigment Produced by Trichoderma reesei

PubMed Central

Derntl, Christian; Rassinger, Alice; Srebotnik, Ewald; Mach, Robert L.

2016-01-01

ABSTRACT The industrially used ascomycete Trichoderma reesei secretes a typical yellow pigment during cultivation, while other Trichoderma species do not. A comparative genomic analysis suggested that a putative secondary metabolism cluster, containing two polyketide-synthase encoding genes, is responsible for the yellow pigment synthesis. This cluster is conserved in a set of rather distantly related fungi, including Acremonium chrysogenum and Penicillium chrysogenum. In an attempt to silence the cluster in T. reesei, two genes of the cluster encoding transcription factors were individually deleted. For a complete genetic proof-of-function, the genes were reinserted into the genomes of the respective deletion strains. The deletion of the first transcription factor (termed yellow pigment regulator 1 [Ypr1]) resulted in the full abolishment of the yellow pigment formation and the expression of most genes of this cluster. A comparative high-pressure liquid chromatography (HPLC) analysis of supernatants of the ypr1 deletion and its parent strain suggested the presence of several yellow compounds in T. reesei that are all derived from the same cluster. A subsequent gas chromatography/mass spectrometry analysis strongly indicated the presence of sorbicillin in the major HPLC peak. The presence of the second transcription factor, termed yellow pigment regulator 2 (Ypr2), reduces the yellow pigment formation and the expression of most cluster genes, including the gene encoding the activator Ypr1. IMPORTANCE Trichoderma reesei is used for industry-scale production of carbohydrate-active enzymes. During growth, it secretes a typical yellow pigment. This is not favorable for industrial enzyme production because it makes the downstream process more complicated and thus increases operating costs. In this study, we demonstrate which regulators influence the synthesis of the yellow pigment. 
Based on these data, we also provide indication as to which genes are under the control of these regulators and are finally responsible for the biosynthesis of the yellow pigment. These genes are organized in a cluster that is also found in other industrially relevant fungi, such as the two antibiotic producers Penicillium chrysogenum and Acremonium chrysogenum. The targeted manipulation of a secondary metabolism cluster is an important option for any biotechnologically applied microorganism. PMID:27520818
Comprehensive cluster analysis with Transitivity Clustering.

PubMed

Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

2011-03-01

Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
Massive Collection of Full-Length Complementary DNA Clones and Microarray Analyses: Keys to Rice Transcriptome Analysis

NASA Astrophysics Data System (ADS)

Kikuchi, Shoshi

2009-02-01

Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.
Elevated Mirc1/Mir17-92 cluster expression negatively regulates autophagy and CFTR (cystic fibrosis transmembrane conductance regulator) function in CF macrophages.

PubMed

Tazi, Mia F; Dakhlallah, Duaa A; Caution, Kyle; Gerber, Madelyn M; Chang, Sheng-Wei; Khalil, Hany; Kopp, Benjamin T; Ahmed, Amr E; Krause, Kathrin; Davis, Ian; Marsh, Clay; Lovett-Racke, Amy E; Schlesinger, Larry S; Cormet-Boyaka, Estelle; Amer, Amal O

2016-11-01

Cystic fibrosis (CF) is a fatal, genetic disorder that critically affects the lungs and is directly caused by mutations in the CF transmembrane conductance regulator (CFTR) gene, resulting in defective CFTR function. Macroautophagy/autophagy is a highly regulated biological process that provides energy during periods of stress and starvation. Autophagy clears pathogens and dysfunctional protein aggregates within macrophages. However, this process is impaired in CF patients and CF mice, as their macrophages exhibit limited autophagy activity. The study of microRNAs (Mirs), and other noncoding RNAs, continues to offer new therapeutic targets. The objective of this study was to elucidate the role of Mirs in dysregulated autophagy-related genes in CF macrophages, and then target them to restore this host-defense function and improve CFTR channel function. We identified the Mirc1/Mir17-92 cluster as a potential negative regulator of autophagy as CF macrophages exhibit decreased autophagy protein expression and increased cluster expression when compared to wild-type (WT) counterparts. The absence or reduced expression of the cluster increases autophagy protein expression, suggesting the canonical inverse relationship between Mirc1/Mir17-92 and autophagy gene expression. An in silico study for targets of Mirs that comprise the cluster suggested that the majority of the Mirs target autophagy mRNAs. Those targets were validated by luciferase assays. Notably, the ability of macrophages expressing mutant F508del CFTR to transport halide through their membranes is compromised and can be restored by downregulation of these inherently elevated Mirs, via restoration of autophagy. In vivo, downregulation of Mir17 and Mir20a partially restored autophagy expression and hence improved the clearance of Burkholderia cenocepacia. 
Thus, these data advance our understanding of mechanisms underlying the pathobiology of CF and provide a new therapeutic platform for restoring CFTR function and autophagy in patients with CF.
Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes.

PubMed

Fortunato, Sofia A V; Adamski, Marcin; Ramos, Olivia Mendivil; Leininger, Sven; Liu, Jing; Ferrier, David E K; Adamska, Maja

2014-10-30

Sponges are simple animals with few cell types, but their genomes paradoxically contain a wide variety of developmental transcription factors, including homeobox genes belonging to the Antennapedia (ANTP) class, which in bilaterians encompass Hox, ParaHox and NK genes. In the genome of the demosponge Amphimedon queenslandica, no Hox or ParaHox genes are present, but NK genes are linked in a tight cluster similar to the NK clusters of bilaterians. It has been proposed that Hox and ParaHox genes originated from NK cluster genes after divergence of sponges from the lineage leading to cnidarians and bilaterians. On the other hand, synteny analysis lends support to the notion that the absence of Hox and ParaHox genes in Amphimedon is a result of secondary loss (the ghost locus hypothesis). Here we analysed complete suites of ANTP-class homeoboxes in two calcareous sponges, Sycon ciliatum and Leucosolenia complicata. Our phylogenetic analyses demonstrate that these calcisponges possess orthologues of bilaterian NK genes (Hex, Hmx and Msx), a varying number of additional NK genes and one ParaHox gene, Cdx. Despite the generation of scaffolds spanning multiple genes, we find no evidence of clustering of Sycon NK genes. All Sycon ANTP-class genes are developmentally expressed, with patterns suggesting their involvement in cell type specification in embryos and adults, metamorphosis and body plan patterning. These results demonstrate that ParaHox genes predate the origin of sponges, thus confirming the ghost locus hypothesis, and highlight the need to analyse the genomes of multiple sponge lineages to obtain a complete picture of the ancestral composition of the first animal genome.
Deciphering the Anti-Aflatoxinogenic Properties of Eugenol Using a Large-Scale q-PCR Approach

PubMed Central

Caceres, Isaura; El Khoury, Rhoda; Medina, Ángel; Lippi, Yannick; Naylies, Claire; Atoui, Ali; El Khoury, André; Oswald, Isabelle P.; Bailly, Jean-Denis; Puel, Olivier

2016-01-01

Produced by several species of Aspergillus, Aflatoxin B1 (AFB1) is a carcinogenic mycotoxin contaminating many crops worldwide. The utilization of fungicides is currently one of the most common methods; nevertheless, their use is not environmentally or economically sound. Thus, the use of natural compounds able to block aflatoxinogenesis could represent an alternative strategy to limit food and feed contamination. For instance, eugenol, a 4-allyl-2-methoxyphenol present in many essential oils, has been identified as an anti-aflatoxin molecule. However, its precise mechanism of action has yet to be clarified. The production of AFB1 is associated with the expression of a 70 kb cluster, and no fewer than 21 enzymatic reactions are necessary for its production. Based on former empirical data, a molecular tool composed of 60 genes, targeting 27 genes of the aflatoxin B1 cluster and 33 genes encoding the main regulatory factors potentially involved in its production, was developed. We showed that AFB1 inhibition in Aspergillus flavus following eugenol addition at 0.5 mM in a Malt Extract Agar (MEA) medium resulted in a complete inhibition of the expression of all but one gene of the AFB1 biosynthesis cluster. This transcriptomic effect followed a down-regulation of the complex composed by the two internal regulatory factors, AflR and AflS. This phenomenon was also influenced by an over-expression of veA and mtfA, two genes that are directly linked to AFB1 cluster regulation. PMID:27128940

Distinct Gene Expression Patterns between Nasal Mucosal Cells and Blood Collected from Allergic Rhinitis Sufferers.

PubMed

Watts, Annabelle M; West, Nicholas P; Cripps, Allan W; Smith, Pete K; Cox, Amanda J

2018-06-19

Investigations of gene expression in allergic rhinitis (AR) typically rely on invasive nasal biopsies (site of inflammation) or blood samples (systemic immunity) to obtain sufficient genetic material for analysis. New methodologies to circumvent the need for invasive sample collection offer promise to further the understanding of local immune mechanisms relevant in AR. A within-subject design was employed to compare immune gene expression profiles obtained from nasal washing/brushing and whole blood samples collected during peak pollen season. Twelve adults (age: 46.3 ± 12.3 years) with more than a 2-year history of AR and a confirmed grass pollen allergy participated in the study. Gene expression analysis was performed using a panel of 760 immune genes with the NanoString nCounter platform on nasal lavage/brushing cell lysates and compared to RNA extracted from blood. A total of 355 genes were significantly differentially expressed between sample types (9.87 to -9.71 log2 fold change). The top 3 genes significantly upregulated in nasal lysate samples were Mucin 1 (MUC1), Tight Junction Protein 1 (TJP1), and Lipocalin-2 (LCN2). The top 3 genes significantly upregulated in blood samples were cluster of differentiation 3e (CD3E), FYN Proto-Oncogene Src Family Tyrosine Kinase (FYN) and cluster of differentiation 3d (CD3D). Overall, the blood and nasal lavage samples showed vastly distinct gene expression profiles and functional gene pathways which reflect their anatomical and functional origins. Evaluating immune gene expression of the nasal mucosa in addition to blood samples may be beneficial in understanding AR pathophysiology and response to allergen challenge. © 2018 S. Karger AG, Basel.
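
The fold-change range quoted in the abstract above is on a log2 scale, so a value of 9.87 corresponds to roughly a 900-fold difference (2^9.87 ≈ 936). As a minimal sketch of how a within-subject log2 fold change could be computed for one gene, assuming already-normalized counts, the Python below averages the per-subject log2 ratios. The function name, the pseudocount and the example counts are hypothetical illustrations and are not taken from the study or from the NanoString software.

```python
import math
from statistics import mean


def paired_log2_fold_change(nasal, blood, pseudocount=1.0):
    """Mean per-subject log2(nasal / blood) for one gene.

    `nasal` and `blood` hold normalized counts for the same subjects in the
    same order. The pseudocount (an assumption of this sketch) avoids
    taking the log of zero.
    """
    if len(nasal) != len(blood):
        raise ValueError("paired samples must have the same length")
    ratios = [
        math.log2((n + pseudocount) / (b + pseudocount))
        for n, b in zip(nasal, blood)
    ]
    return mean(ratios)


# Hypothetical counts for one gene in two subjects:
# subject 1: 800 / 100 = 8-fold (log2 = 3); subject 2: 400 / 100 = 4-fold (log2 = 2)
lfc = paired_log2_fold_change([799.0, 399.0], [99.0, 99.0])
print(lfc)  # 2.5
```

A positive value means higher expression in the nasal sample, a negative value higher expression in blood.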
Genes with a spike expression are clustered in chromosome (sub)bands and spike (sub)bands have a powerful prognostic value in patients with multiple myeloma

PubMed Central

Kassambara, Alboukadel; Hose, Dirk; Moreaux, Jérôme; Walker, Brian A.; Protopopov, Alexei; Reme, Thierry; Pellestor, Franck; Pantesco, Véronique; Jauch, Anna; Morgan, Gareth; Goldschmidt, Hartmut; Klein, Bernard

2012-01-01

Background Genetic abnormalities are common in patients with multiple myeloma, and may deregulate gene products involved in tumor survival, proliferation, metabolism and drug resistance. In particular, translocations may result in a high expression of targeted genes (termed spike expression) in tumor cells. We identified spike genes in multiple myeloma cells of patients with newly-diagnosed myeloma and investigated their prognostic value. Design and Methods Genes with a spike expression in multiple myeloma cells were picked up using box plot probe set signal distribution and two selection filters. Results In a cohort of 206 newly diagnosed patients with multiple myeloma, 2587 genes/expressed sequence tags with a spike expression were identified. Some spike genes were associated with some transcription factors such as MAF or MMSET and with known recurrent translocations as expected. Spike genes were not associated with increased DNA copy number and for a majority of them, involved unknown mechanisms. Of spiked genes, 36.7% clustered significantly in 149 out of 862 documented chromosome (sub)bands, of which 53 had prognostic value (35 bad, 18 good). Their prognostic value was summarized with a spike band score that delineated 23.8% of patients with a poor median overall survival (27.4 months versus not reached, P<0.001) using the training cohort of 206 patients. The spike band score was independent of other gene expression profiling-based risk scores, t(4;14), or del17p in an independent validation cohort of 345 patients. Conclusions We present a new approach to identify spike genes and their relationship to patients’ survival. PMID:22102711
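
The abstract above says spike genes were selected from the box plot distribution of probe set signals plus two selection filters, without giving the exact rule. As a rough illustration only, the Python sketch below flags samples whose signal lies above a Tukey-style upper fence (Q3 + k × IQR). The fence multiplier `k`, the function name and the example signals are assumptions of this sketch, and the authors' two filters are not reproduced.

```python
from statistics import quantiles


def spike_samples(signals, k=3.0):
    """Return indices of samples whose signal exceeds Q3 + k * IQR.

    `k` is a hypothetical fence multiplier; the source does not state
    the threshold the authors used.
    """
    q1, _, q3 = quantiles(signals, n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    return [i for i, value in enumerate(signals) if value > fence]


# Hypothetical probe set signals across eight patients; the last one spikes.
signals = [10, 12, 11, 13, 12, 11, 10, 250]
print(spike_samples(signals))  # [7]
```

With these values Q1 = 10.75 and Q3 = 12.25, so the fence sits at 16.75 and only the last patient is flagged.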
Gastrointestinal Fibroblasts Have Specialized, Diverse Transcriptional Phenotypes: A Comprehensive Gene Expression Analysis of Human Fibroblasts

PubMed Central

Ishii, Genichiro; Aoyagi, Kazuhiko; Sasaki, Hiroki; Ochiai, Atsushi

2015-01-01

Background Fibroblasts are the principal stromal cells that exist in whole organs and play vital roles in many biological processes. Although the functional diversity of fibroblasts has been estimated, a comprehensive analysis of fibroblasts from the whole body has not been performed and their transcriptional diversity has not been sufficiently explored. The aim of this study was to elucidate the transcriptional diversity of human fibroblasts within the whole body. Methods Global gene expression analysis was performed on 63 human primary fibroblasts from 13 organs. Of these, 32 fibroblasts from gastrointestinal organs (gastrointestinal fibroblasts: GIFs) were obtained from a pair of 2 anatomical sites: the submucosal layer (submucosal fibroblasts: SMFs) and the subperitoneal layer (subperitoneal fibroblasts: SPFs). Using hierarchical clustering analysis, we elucidated identifiable subgroups of fibroblasts and analyzed the transcriptional character of each subgroup. Results In unsupervised clustering, 2 major clusters that separate GIFs and non-GIFs were observed. Organ- and anatomical site-dependent clusters within GIFs were also observed. The signature genes that discriminated GIFs from non-GIFs, SMFs from SPFs, and the fibroblasts of one organ from another organ consisted of genes associated with transcriptional regulation, signaling ligands, and extracellular matrix remodeling. Conclusions GIFs are characteristic fibroblasts with specific gene expressions from transcriptional regulation, signaling ligands, and extracellular matrix remodeling related genes. In addition, the anatomical site- and organ-dependent diversity of GIFs was also discovered. These features of GIFs contribute to their specific physiological function and homeostatic maintenance, and create a functional diversity of the gastrointestinal tract. PMID:26046848
Comparative analysis of gene expression profiles of OPN signaling pathway in four kinds of liver diseases.

PubMed

Wang, Gaiping; Chen, Shasha; Zhao, Congcong; Li, Xiaofang; Zhao, Weiming; Yang, Jing; Chang, Cuifang; Xu, Cunshuan

2016-09-01

To explore the relevance of the OPN signalling pathway to the occurrence and development of nonalcoholic fatty liver disease (NAFLD), liver cirrhosis (LC), hepatic cancer (HC) and acute hepatic failure (AHF) at the transcriptional level, the Rat Genome 230 2.0 Array was used to detect expression profiles of OPN signalling pathway-related genes in four kinds of liver diseases. The results showed that 23, 33, 59 and 74 genes were significantly changed in the above four kinds of liver diseases, respectively. H-clustering analysis showed that the expression profiles of OPN signalling-related genes were notably different in the four kinds of liver diseases. Subsequently, the 147 above-mentioned genes were categorized into four clusters by k-means according to the similarity of gene expression, and expression analysis systematic explorer (EASE) functional enrichment analysis revealed that OPN signalling pathway-related genes were involved in cell adhesion and migration, cell proliferation, apoptosis, stress and inflammatory reaction, etc. Finally, ingenuity pathway analysis (IPA) software was used to predict the functions of OPN signalling-related genes, and the results indicated that the activities of ROS production, cell adhesion and migration, and cell proliferation were remarkably increased, while those of apoptosis, stress and inflammatory reaction were reduced in the four kinds of liver diseases. In summary, the above physiological activities changed more obviously in LC, HC and AHF than in NAFLD.
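
The k-means step mentioned in the abstract above groups genes by the similarity of their expression profiles. The Python sketch below shows the general technique (Lloyd's algorithm) on toy data with four clusters. It is not the authors' implementation: the two-dimensional profiles and the choice of starting centroids are hypothetical, and real analyses normally use an established library with several random restarts.

```python
def kmeans(points, initial_centroids, iterations=20):
    """Lloyd's algorithm: assign each point to its nearest centroid
    (squared Euclidean distance), then move each centroid to the mean
    of the points assigned to it."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    labels = []
    for _ in range(iterations):
        labels = [
            min(
                range(len(centroids)),
                key=lambda c: sum(
                    (p - q) ** 2 for p, q in zip(point, centroids[c])
                ),
            )
            for point in points
        ]
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical expression profiles (two conditions per gene) for eight genes.
profiles = [[0, 0], [0, 1], [10, 0], [10, 1], [0, 10], [1, 10], [10, 10], [10, 11]]
# Starting centroids are picked by hand here to keep the example deterministic.
labels, centroids = kmeans(profiles, [profiles[0], profiles[2], profiles[4], profiles[6]])
print(labels)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

Each gene receives the index of its cluster, and genes with similar profiles share a label.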
Lineage-specific evolution of cnidarian Wnt ligands.

PubMed

Hensel, Katrin; Lotan, Tamar; Sanders, Steve M; Cartwright, Paulyn; Frank, Uri

2014-09-01

We have studied the evolution of Wnt genes in cnidarians and the expression pattern of all Wnt ligands in the hydrozoan Hydractinia echinata. Current views favor a scenario in which 12 Wnt sub-families were jointly inherited by cnidarians and bilaterians from their last common ancestor. Our phylogenetic analyses clustered all medusozoan genes in distinct, well-supported clades, but many orthologous relationships between medusozoan Wnts and anthozoan and bilaterian Wnt genes were poorly supported. Only seven anthozoan genes, Wnt2, Wnt4, Wnt5, Wnt6, Wnt10, Wnt11, and Wnt16, were recovered with strong support with bilaterian genes and of those, only the Wnt2, Wnt5, Wnt11, and Wnt16 clades also included medusozoan genes. Although medusozoan Wnt8 genes clustered with anthozoan and bilaterian genes, this was not well supported. In situ hybridization studies revealed poor conservation of expression patterns of putative Wnt orthologs within Cnidaria. In polyps, only Wnt1, Wnt3, and Wnt7 were expressed at the same position in the studied cnidarian models Hydra, Hydractinia, and Nematostella. Different expression patterns are consistent with divergent functions. Our data do not fully support previous assertions regarding Wnt gene homology, and suggest a more complex history of Wnt family genes than previously suggested. This includes high rates of sequence divergence and lineage-specific duplications of Wnt genes within medusozoans, followed by functional divergence over evolutionary time scales. © 2014 Wiley Periodicals, Inc.
Discovery of a Phosphonoacetic Acid Derived Natural Product by Pathway Refactoring.

PubMed

Freestone, Todd S; Ju, Kou-San; Wang, Bin; Zhao, Huimin

2017-02-17

The activation of silent natural product gene clusters is a synthetic biology problem of great interest. As the rate at which gene clusters are identified outpaces the discovery rate of new molecules, this unknown chemical space is rapidly growing, as too are the rewards for developing technologies to exploit it. One class of natural products that has been underrepresented is phosphonic acids, which have important medical and agricultural uses. Hundreds of phosphonic acid biosynthetic gene clusters have been identified encoding for unknown molecules. Although methods exist to elicit secondary metabolite gene clusters in native hosts, they require the strain to be amenable to genetic manipulation. One method to circumvent this is pathway refactoring, which we implemented in an effort to discover new phosphonic acids from a gene cluster from Streptomyces sp. strain NRRL F-525. By reengineering this cluster for expression in the production host Streptomyces lividans, utility of refactoring is demonstrated with the isolation of a novel phosphonic acid, O-phosphonoacetic acid serine, and the characterization of its biosynthesis. In addition, a new biosynthetic branch point is identified with a phosphonoacetaldehyde dehydrogenase, which was used to identify additional phosphonic acid gene clusters that share phosphonoacetic acid as an intermediate.
Identification of a novel prophage-like gene cluster actively expressed in both virulent and avirulent strains of Leptospira interrogans serovar Lai.

PubMed

Qin, Jin-Hong; Zhang, Qing; Zhang, Zhi-Ming; Zhong, Yi; Yang, Yang; Hu, Bao-Yu; Zhao, Guo-Ping; Guo, Xiao-Kui

2008-06-01

DNA microarray analysis was used to compare the differential gene expression profiles between Leptospira interrogans serovar Lai type strain 56601 and its corresponding attenuated strain IPAV. A 22-kb genomic island covering a cluster of 34 genes (i.e., genes LA0186 to LA0219) was actively expressed in both strains but concomitantly upregulated in strain 56601 in contrast to that of IPAV. Reverse transcription-PCR assays proved that the gene cluster comprised five transcripts. Gene annotation of this cluster revealed characteristics of a putative prophage-like remnant with at least 8 of 34 sequences encoding prophage-like proteins, of which the LA0195 protein is probably a putative prophage CI-like regulator. The transcription initiation activities of putative promoter-regulatory sequences of transcripts I, II, and III, all proximal to the LA0195 gene, were further analyzed in the Escherichia coli promoter probe vector pKK232-8 by assaying the reporter chloramphenicol acetyltransferase (CAT) activities. The strong promoter activities of both transcripts I and II indicated by the E. coli CAT assay were well correlated with the in vitro sequence-specific binding of the recombinant LA0195 protein to the corresponding promoter probes detected by the electrophoresis mobility shift assay. On the other hand, the promoter activity of transcript III was very low in E. coli and failed to show active binding to the LA0195 protein in vitro. These results suggested that the LA0195 protein is likely involved in the transcription of transcripts I and II. However, the identical complete DNA sequences of this prophage remnant from these two strains strongly suggests that possible regulatory factors or signal transduction systems residing outside of this region within the genome may be responsible for the differential expression profiling in these two strains.
Discovery and characterization of miRNA genes in Atlantic salmon (Salmo salar) by use of a deep sequencing approach

PubMed Central

2013-01-01

Background MicroRNAs (miRNAs) are an abundant class of endogenous small RNA molecules that downregulate gene expression at the posttranscriptional level. They play important roles in multiple biological processes by regulating genes that control developmental timing, growth, stem cell division and apoptosis by binding to the mRNA of target genes. Despite the position Atlantic salmon (Salmo salar) has as an economically important domesticated animal, there has been little research on miRNAs in this species. Knowledge about miRNAs and their target genes may be used to control health and to improve performance of economically important traits. However, before their biological function can be unravelled they must be identified and annotated. The aims of this study were to identify and characterize miRNA genes in Atlantic salmon by deep sequencing analysis of small RNA libraries from nine different tissues. Results A total of 180 distinct mature miRNAs belonging to 106 families of evolutionarily conserved miRNAs, and 13 distinct novel mature miRNAs were discovered and characterized. The mature miRNAs corresponded to 521 putative precursor sequences located at unique genome locations. About 40% of these precursors were part of gene clusters, and the majority of the Salmo salar gene clusters discovered were conserved across species. Comparison of expression levels in samples from different tissues applying DESeq indicated that there were tissue specific expression differences in three conserved and one novel miRNA. Ssa-miR 736 was detected in heart tissue only, while two other clustered miRNAs (ssa-miR 212 and 132) seem to be at a higher expression level in brain tissue. These observations correlate well with their expected functions as regulators of signal pathways in cardiac and neuronal cells, respectively. Ssa-miR 8163 is one of the novel miRNAs discovered and its function remains unknown. However, differential expression analysis using DESeq suggests that this miRNA is enriched in liver tissue and the precursor was mapped to intron 7 of the transferrin gene. Conclusions The identification and annotation of evolutionarily conserved and novel Salmo salar miRNAs as well as the characterization of miRNA gene clusters provide biological knowledge that will greatly facilitate further functional studies on miRNAs in this species. PMID:23865519
Conserved gene clusters in bacterial genomes provide further support for the primacy of RNA

NASA Technical Reports Server (NTRS)

Siefert, J. L.; Martin, K. A.; Abdi, F.; Widger, W. R.; Fox, G. E.

1997-01-01

Five complete bacterial genome sequences have been released to the scientific community. These include four (eu)Bacteria, Haemophilus influenzae, Mycoplasma genitalium, M. pneumoniae, and Synechocystis PCC 6803, as well as one Archaeon, Methanococcus jannaschii. Features of organization shared by these genomes are likely to have arisen very early in the history of the bacteria and thus can be expected to provide further insight into the nature of early ancestors. Results of a genome comparison of these five organisms confirm earlier observations that gene order is remarkably unpreserved. There are, nevertheless, at least 16 clusters of two or more genes whose order remains the same among the four (eu)Bacteria and these are presumed to reflect conserved elements of coordinated gene expression that require gene proximity. Eight of these gene orders are essentially conserved in the Archaea as well. Many of these clusters are known to be regulated by RNA-level mechanisms in Escherichia coli, which supports the earlier suggestion that this type of regulation of gene expression may have arisen very early. We conclude that although the last common ancestor may have had a DNA genome, it likely was preceded by progenotes with an RNA genome.
RNA-Seq Analysis of Developing Pecan (Carya illinoinensis) Embryos Reveals Parallel Expression Patterns among Allergen and Lipid Metabolism Genes.

PubMed

Mattison, Christopher P; Rai, Ruhi; Settlage, Robert E; Hinchliffe, Doug J; Madison, Crista; Bland, John M; Brashear, Suzanne; Graham, Charles J; Tarver, Matthew R; Florane, Christopher; Bechtel, Peter J

2017-02-22

The pecan nut is a nutrient-rich part of a healthy diet full of beneficial fatty acids and antioxidants, but can also cause allergic reactions in people suffering from food allergy to the nuts. The transcriptome of a developing pecan nut was characterized to identify the gene expression occurring during the process of nut development and to highlight those genes involved in fatty acid metabolism and those that commonly act as food allergens. Pecan samples were collected at several time points during the embryo development process including the water, gel, dough, and mature nut stages. Library preparation and sequencing were performed using Illumina-based mRNA HiSeq with RNA from four time points during the growing season during August and September 2012. Sequence analysis with Trinotate software following the Trinity protocol identified 133,000 unigenes with 52,267 named transcripts and 45,882 annotated genes. A total of 27,312 genes were defined by GO annotation. Gene expression clustering analysis identified 12 different gene expression profiles, each containing a number of genes. Three pecan seed storage proteins that commonly act as allergens, Car i 1, Car i 2, and Car i 4, were significantly up-regulated during the time course. Up-regulated fatty acid metabolism genes that were identified included acyl-[ACP] desaturase and omega-6 desaturase genes involved in oleic and linoleic acid metabolism. Notably, a few of the up-regulated acyl-[ACP] desaturase and omega-6 desaturase genes that were identified have expression patterns similar to the allergen genes based upon gene expression clustering and qPCR analysis. These findings suggest the possibility of coordinated accumulation of lipids and allergens during pecan nut embryogenesis.
Inference from clustering with application to gene-expression microarrays.

PubMed

Dougherty, Edward R; Barrera, Junior; Brun, Marcel; Kim, Seungchan; Cesar, Roberto M; Chen, Yidong; Bittner, Michael; Trent, Jeffrey M

2002-01-01

There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. 
A large amount of generated output is available over the web.
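The evaluation loop described in this abstract (model each process as a mean plus independent noise, generate points, cluster them, count the misclustered points) can be illustrated with a minimal sketch. This is not the authors' toolbox: the function names, the Gaussian noise model, the farthest-point initialization and the brute-force label matching are all assumptions made here for illustration, and only K-means is shown.

```python
import itertools
import random


def simulate(means, n_per_class, sigma, rng):
    """Draw points as class mean plus independent Gaussian noise (assumed noise model)."""
    points, labels = [], []
    for k, mean in enumerate(means):
        for _ in range(n_per_class):
            points.append([m + rng.gauss(0.0, sigma) for m in mean])
            labels.append(k)
    return points, labels


def sqdist(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=50):
    """Lloyd's K-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    assign = [0] * len(points)
    for _ in range(n_iter):
        assign = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def clustering_error(true_labels, assign, k):
    """Number of misclustered points under the best cluster-to-class relabelling."""
    best = len(true_labels)
    for perm in itertools.permutations(range(k)):
        errors = sum(1 for t, a in zip(true_labels, assign) if perm[a] != t)
        best = min(best, errors)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical expression patterns over four conditions.
    means = [[0.0, 0.0, 0.0, 0.0], [2.0, 2.0, -2.0, -2.0]]
    for sigma in (0.5, 1.0, 2.0):
        points, labels = simulate(means, 30, sigma, rng)
        print(sigma, clustering_error(labels, kmeans(points, 2), 2))
```

Repeating the loop over several noise levels, as in the demo at the bottom, gives a rough picture of how the error grows with process variance. The permutation search in `clustering_error` is only practical for a small number of clusters.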
Identification of specific gene expression profiles in fibroblasts derived from middle ear cholesteatoma.

PubMed

Yoshikawa, Mamoru; Kojima, Hiromi; Wada, Kota; Tsukidate, Toshiharu; Okada, Naoko; Saito, Hirohisa; Moriyama, Hiroshi

2006-07-01

To investigate the role of fibroblasts in the pathogenesis of cholesteatoma. Tissue specimens were obtained from our patients. Middle ear cholesteatoma-derived fibroblasts (MECFs) and postauricular skin-derived fibroblasts (SFs) as controls were then cultured for a few weeks. These fibroblasts were stimulated with interleukin (IL) 1alpha and/or IL-1beta before gene expression assays. We used the human genome U133A probe array (GeneChip) and real-time polymerase chain reaction to examine and compare the gene expression profiles of the MECFs and SFs. Six patients who had undergone tympanoplasty. The IL-1alpha-regulated genes were classified into 4 distinct clusters on the basis of profiles differentially regulated by SF and MECF using a hierarchical clustering analysis. The messenger RNA expressions of LARC (liver and activation-regulated chemokine), GMCSF (granulocyte-macrophage colony-stimulating factor), epiregulin, ICAM1 (intercellular adhesion molecule 1), and TGFA (transforming growth factor alpha) were more strongly up-regulated by IL-1alpha and/or IL-1beta in MECF than in SF, suggesting that these fibroblasts derived from different tissues retained their typical gene expression profiles. Fibroblasts may play a role in hyperkeratosis of middle ear cholesteatoma by releasing molecules involved in inflammation and epidermal growth. These fibroblasts may retain tissue-specific characteristics presumably controlled by epigenetic mechanisms.
Identification and handling of artifactual gene expression profiles emerging in microarray hybridization experiments

PubMed Central

Brodsky, Leonid; Leontovich, Andrei; Shtutman, Michael; Feinstein, Elena

2004-01-01

Mathematical methods of analysis of microarray hybridizations deal with gene expression profiles as elementary units. However, some of these profiles do not reflect a biologically relevant transcriptional response, but rather stem from technical artifacts. Here, we describe two technically independent but rationally interconnected methods for identification of such artifactual profiles. Our diagnostics are based on detection of deviations from uniformity, which is assumed as the main underlying principle of microarray design. Method 1 is based on detection of non-uniformity of microarray distribution of printed genes that are clustered based on the similarity of their expression profiles. Method 2 is based on evaluation of the presence of gene-specific microarray spots within the slides’ areas characterized by an abnormal concentration of low/high differential expression values, which we define as ‘patterns of differentials’. Applying two novel algorithms, for nested clustering (method 1) and for pattern detection (method 2), we can make a dual estimation of the profile’s quality for almost every printed gene. Genes with artifactual profiles detected by method 1 may then be removed from further analysis. Suspicious differential expression values detected by method 2 may be either removed or weighted according to the probabilities of patterns that cover them, thus diminishing their input in any further data analysis. PMID:14999086
An Overview of Hox Genes in Lophotrochozoa: Evolution and Functionality

PubMed Central

Barucca, Marco; Canapa, Adriana; Biscotti, Maria Assunta

2016-01-01

Hox genes are regulators of animal embryonic development. Changes in the number and sequence of Hox genes as well as in their expression patterns have been related to the evolution of the body plan. Lophotrochozoa is a clade of Protostomia comprising several phyla that show a wide morphological diversity. Although the works summarized in this review emphasize the fragmentary nature of the data available regarding the presence and expression of Hox genes, they also offer interesting insight into the evolution of the Hox cluster and the role played by Hox genes in several phyla. However, the number of genes involved in the cluster of the lophotrochozoan ancestor is still a question of debate. The data presented here suggest that at least nine genes were present while two other genes, Lox4 and Post-2, may either have been present in the ancestor or may have arisen as a result of duplication in the Brachiopoda-Mollusca-Annelida lineage. Spatial and temporal collinearity is a feature of Hox gene expression which was probably present in the ancestor of deuterostomes and protostomes. However, in Lophotrochozoa, it has been detected in only a few species belonging to Annelida and Mollusca. PMID:29615580
The Fdb3 transcription factor of the Fusarium Detoxification of Benzoxazolinone gene cluster is required for MBOA but not BOA degradation in Fusarium pseudograminearum.

PubMed

Kettle, Andrew J; Carere, Jason; Batley, Jacqueline; Manners, John M; Kazan, Kemal; Gardiner, Donald M

2016-03-01

A number of cereals produce the benzoxazolinone class of phytoalexins. Fusarium species pathogenic towards these hosts can typically degrade these compounds via an aminophenol intermediate, and the ability to do so is encoded by a group of genes found in the Fusarium Detoxification of Benzoxazolinone (FDB) cluster. A zinc finger transcription factor encoded by one of the FDB cluster genes (FDB3) has been proposed to regulate the expression of other genes in the cluster and hence is potentially involved in benzoxazolinone degradation. Herein we show that Fdb3 is essential for the ability of Fusarium pseudograminearum to efficiently detoxify the predominant wheat benzoxazolinone, 6-methoxy-benzoxazolin-2-one (MBOA), but not benzoxazolin-2-one (BOA). Furthermore, additional genes thought to be part of the FDB gene cluster, based upon transcriptional response to benzoxazolinones, are regulated by Fdb3. However, deletion mutants for these latter genes remain capable of benzoxazolinone degradation, suggesting that they are not essential for this process. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
Identification of suitable genes contributes to lung adenocarcinoma clustering by multiple meta-analysis methods.

PubMed

Yang, Ze-Hui; Zheng, Rui; Gao, Yuan; Zhang, Qiang

2016-09-01

With the widespread application of high-throughput technology, numerous meta-analysis methods have been proposed for differential expression profiling across multiple studies. We identified the suitable differentially expressed (DE) genes that contributed to lung adenocarcinoma (ADC) clustering based on seven popular meta-analysis methods. Seven microarray expression profiles of ADC and normal controls were extracted from the ArrayExpress database. Bioconductor was used for preliminary data preprocessing. Then, DE genes across multiple studies were identified. Hierarchical clustering was applied to compare the classification performance for microarray data samples. The classification efficiency was compared based on accuracy, sensitivity and specificity. Across seven datasets, 573 ADC cases and 222 normal controls were collected. After filtering out unexpressed and noninformative genes, 3688 genes remained for further analysis. The classification efficiency analysis showed that DE genes identified by the sum of ranks method separated ADC from normal controls with the best accuracy, sensitivity and specificity of 0.953, 0.969 and 0.932, respectively. The gene set with the highest classification accuracy mainly participated in the regulation of response to external stimulus (P = 7.97E-04), cyclic nucleotide-mediated signaling (P = 0.01), regulation of cell morphogenesis (P = 0.01) and regulation of cell proliferation (P = 0.01). Evaluating the classification efficiency of DE genes identified by different meta-analysis methods provides a new perspective on choosing a suitable method for a given application. Different meta-analysis methods perform differently, so their relative strengths should be considered together when selecting a method for a particular study. © 2015 John Wiley & Sons Ltd.
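As a rough illustration of two quantities named in this abstract, the sketch below combines per-study gene rankings by summing ranks and computes accuracy, sensitivity and specificity from confusion counts. It is a simplified sketch, not the authors' pipeline: the function names, the use of p-values as the ranking statistic, the alphabetical tie-breaking and the toy numbers are all assumptions.

```python
def sum_of_ranks(pvalues_by_study):
    """Sum each gene's within-study rank (1 = smallest p-value) across studies.

    pvalues_by_study: list of dicts mapping gene name to p-value, one per study.
    Only genes present in every study are scored; ties are broken by gene name.
    """
    genes = set.intersection(*(set(study) for study in pvalues_by_study))
    totals = {gene: 0 for gene in genes}
    for study in pvalues_by_study:
        ordered = sorted(genes, key=lambda g: (study[g], g))
        for rank, gene in enumerate(ordered, start=1):
            totals[gene] += rank
    return totals


def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # fraction of true cases recovered
        "specificity": tn / (tn + fp),  # fraction of true controls recovered
    }


if __name__ == "__main__":
    # Hypothetical p-values for three genes in two studies.
    studies = [
        {"A": 0.001, "B": 0.20, "C": 0.04},
        {"A": 0.010, "B": 0.50, "C": 0.03},
    ]
    print(sum_of_ranks(studies))
    # Hypothetical counts: 100 cases and 50 controls after clustering.
    print(classification_metrics(tp=90, fn=10, tn=40, fp=10))
```

A small rank sum marks a gene that ranks near the top in every study, which is the intuition behind using it to pick consistently DE genes.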
Engineered human skin substitutes undergo large-scale genomic reprogramming and normal skin-like maturation after transplantation to athymic mice.

PubMed

Klingenberg, Jennifer M; McFarland, Kevin L; Friedman, Aaron J; Boyce, Steven T; Aronow, Bruce J; Supp, Dorothy M

2010-02-01

Bioengineered skin substitutes can facilitate wound closure in severely burned patients, but deficiencies limit their outcomes compared with native skin autografts. To identify gene programs associated with their in vivo capabilities and limitations, we extended previous gene expression profile analyses to now compare engineered skin after in vivo grafting with both in vitro maturation and normal human skin. Cultured skin substitutes were grafted on full-thickness wounds in athymic mice, and biopsy samples for microarray analyses were collected at multiple in vitro and in vivo time points. Over 10,000 transcripts exhibited large-scale expression pattern differences during in vitro and in vivo maturation. Using hierarchical clustering, 11 different expression profile clusters were partitioned on the basis of differential sample type and temporal stage-specific activation or repression. Analyses show that the wound environment exerts a massive influence on gene expression in skin substitutes. For example, in vivo-healed skin substitutes gained the expression of many native skin-expressed genes, including those associated with epidermal barrier and multiple categories of cell-cell and cell-basement membrane adhesion. In contrast, immunological, trichogenic, and endothelial gene programs were largely lacking. These analyses suggest important areas for guiding further improvement of engineered skin for both increased homology with native skin and enhanced wound healing.
Genes associated with thermosensitive genic male sterility in rice identified by comparative expression profiling.

PubMed

Pan, Yufang; Li, Qiaofeng; Wang, Zhizheng; Wang, Yang; Ma, Rui; Zhu, Lili; He, Guangcun; Chen, Rongzhi

2014-12-16

Thermosensitive genic male sterile (TGMS) lines and photoperiod-sensitive genic male sterile (PGMS) lines have been successfully used in hybridization to improve rice yields. However, the molecular mechanisms underlying male sterility transitions in most PGMS/TGMS rice lines are unclear. In the recently developed TGMS-Co27 line, the male sterility is based on co-suppression of a UDP-glucose pyrophosphorylase gene (Ugp1), but further study is needed to fully elucidate the molecular mechanisms involved. Microarray-based transcriptome profiling of TGMS-Co27 and wild-type Hejiang 19 (H1493) plants grown at high and low temperatures revealed that 15462 probe sets representing 8303 genes were differentially expressed in the two lines, under the two conditions, or both. Environmental factors strongly affected global gene expression. Some genes important for pollen development were strongly repressed in TGMS-Co27 at high temperature. More significantly, series-cluster analysis of differentially expressed genes (DEGs) between TGMS-Co27 plants grown under the two conditions showed that low temperature induced the expression of a gene cluster. This cluster was found to be essential for sterility transition. It includes many meiosis stage-related genes that are probably important for thermosensitive male sterility in TGMS-Co27, inter alia: Arg/Ser-rich domain (RS)-containing zinc finger proteins, polypyrimidine tract-binding proteins (PTBs), DEAD/DEAH box RNA helicases, ZOS (C2H2 zinc finger proteins of Oryza sativa), at least one polyadenylate-binding protein and some other RNA recognition motif (RRM) domain-containing proteins involved in post-transcriptional processes, eukaryotic initiation factor 5B (eIF5B), ribosomal proteins (L37, L1p/L10e, L27 and L24), aminoacyl-tRNA synthetases (ARSs), eukaryotic elongation factor Tu (eEF-Tu) and a peptide chain release factor protein involved in translation. 
The differential expression of 12 DEGs that are important for pollen development, low temperature responses or TGMS was validated by quantitative RT-PCR (qRT-PCR). Temperature strongly affects global gene expression and may be the common regulator of fertility in PGMS/TGMS rice lines. The identified expression changes reflect perturbations in the transcriptomic regulation of pollen development networks in TGMS-Co27. Findings from this and previous studies indicate that sets of genes involved in post-transcriptional and translation processes are involved in thermosensitive male sterility transitions in TGMS-Co27.
Engineering of EPA/DHA omega-3 fatty acid production by Lactococcus lactis subsp. cremoris MG1363.

PubMed

Amiri-Jami, Mitra; Lapointe, Gisele; Griffiths, Mansel W

2014-04-01

Eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) have been shown to be of major importance in human health. Therefore, these essential polyunsaturated fatty acids have received considerable attention in both human and farm animal nutrition. Currently, fish and fish oils are the main dietary sources of EPA/DHA. To generate sustainable novel sources for EPA and DHA, the 35-kb EPA/DHA synthesis gene cluster was isolated from a marine bacterium, Shewanella baltica MAC1. To streamline the introduction of the genes into food-grade microorganisms such as lactic acid bacteria, unnecessary genes located upstream and downstream of the EPA/DHA gene cluster were deleted. Recombinant Escherichia coli harboring the 20-kb gene cluster produced 3.5- to 6.1-fold more EPA than those carrying the 35-kb DNA fragment coding for EPA/DHA synthesis. The 20-kb EPA/DHA gene cluster was cloned into a modified broad-host-range low copy number vector, pIL252m (4.7 kb, Ery) and expressed in Lactococcus lactis subsp. cremoris MG1363. Recombinant L. lactis produced DHA (1.35 ± 0.5 mg g⁻¹ cell dry weight) and EPA (0.12 ± 0.04 mg g⁻¹ cell dry weight). This is believed to be the first successful cloning and expression of an EPA/DHA synthesis gene cluster in lactic acid bacteria. Our findings advance the future use of EPA/DHA-producing lactic acid bacteria in such applications as dairy starters, silage adjuncts, and animal feed supplements.
Investigating a multigene prognostic assay based on significant pathways for Luminal A breast cancer through gene expression profile analysis.

PubMed

Gao, Haiyan; Yang, Mei; Zhang, Xiaolan

2018-04-01

The present study aimed to investigate potential recurrence-risk biomarkers based on significant pathways for Luminal A breast cancer through gene expression profile analysis. Initially, the gene expression profiles of Luminal A breast cancer patients were downloaded from The Cancer Genome Atlas database. The differentially expressed genes (DEGs) were identified using the limma package and hierarchical clustering analysis was conducted for the DEGs. In addition, the functional pathways were screened using Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses and rank ratio calculation. The multigene prognostic assay was developed based on the statistically significant pathways, and its prognostic function was tested on the training set and verified using the gene expression data and survival data of Luminal A breast cancer patients downloaded from the Gene Expression Omnibus. A total of 300 DEGs were identified between good and poor outcome groups, including 176 upregulated genes and 124 downregulated genes. Hierarchical clustering analysis verified that the DEGs can effectively distinguish Luminal A samples with different prognoses. There were 9 pathways screened as significant pathways and a total of 18 DEGs involved in these 9 pathways were identified as prognostic biomarkers. According to the survival analysis and receiver operating characteristic curve, the obtained 18-gene prognostic assay exhibited good prognostic function with high sensitivity and specificity to both the training and test samples. In conclusion, the 18-gene prognostic assay including the key genes, transcription factor 7-like 2, anterior parietal cortex and lymphocyte enhancer factor-1 may provide a new method for predicting outcomes and may be conducive to the promotion of precision medicine for Luminal A breast cancer.

Comparative genome sequencing reveals genomic signature of extreme desiccation tolerance in the anhydrobiotic midge

PubMed Central

Gusev, Oleg; Suetsugu, Yoshitaka; Cornette, Richard; Kawashima, Takeshi; Logacheva, Maria D.; Kondrashov, Alexey S.; Penin, Aleksey A.; Hatanaka, Rie; Kikuta, Shingo; Shimura, Sachiko; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Shagimardanova, Elena; Alexeev, Dmitry; Govorun, Vadim; Wisecaver, Jennifer; Mikheyev, Alexander; Koyanagi, Ryo; Fujie, Manabu; Nishiyama, Tomoaki; Shigenobu, Shuji; Shibata, Tomoko F.; Golygina, Veronika; Hasebe, Mitsuyasu; Okuda, Takashi; Satoh, Nori; Kikawada, Takahiro

2014-01-01

Anhydrobiosis represents an extreme example of tolerance adaptation to water loss, where an organism can survive in an ametabolic state until water returns. Here we report the first comparative analysis examining the genomic background of extreme desiccation tolerance, which is exclusively found in larvae of the only anhydrobiotic insect, Polypedilum vanderplanki. We compare the genomes of P. vanderplanki and a congeneric desiccation-sensitive midge P. nubifer. We determine that the genome of the anhydrobiotic species specifically contains clusters of multi-copy genes with products that act as molecular shields. In addition, the genome possesses several groups of genes with high similarity to known protective proteins. However, these genes are located in distinct paralogous clusters in the genome apart from the classical orthologues of the corresponding genes shared by both chironomids and other insects. The transcripts of these clustered paralogues contribute to a large majority of the mRNA pool in the desiccating larvae and most likely define successful anhydrobiosis. Comparison of expression patterns of orthologues between two chironomid species provides evidence for the existence of desiccation-specific gene expression systems in P. vanderplanki. PMID:25216354
Participation of the arcRACME protein in self-activation of the arc operon located in the arginine catabolism mobile element in pandemic clone USA300.

PubMed

Rozo, Zayda Lorena Corredor; Márquez-Ortiz, Ricaurte Alejandro; Castro, Betsy Esperanza; Gómez, Natasha Vanegas; Escobar-Pérez, Javier

2017-07-01

Staphylococcus aureus pandemic clone USA300 has, in addition to its constitutive arginine catabolism (arc) gene cluster, an arginine catabolism mobile element (ACME) carrying another such cluster, which gives this clone advantages in colonisation and infection. Gene arcR, which encodes an oxygen-sensitive transcriptional regulator, is inside ACME and downstream of the constitutive arc gene cluster, and this situation may have an impact on its activation. Different relative expression behaviours are proven here for arcRACME and the arcACME operon compared to the constitutive ones. We also show that the artificially expressed recombinant ArcRACME protein binds to the promoter region of the arcACME operon; this mechanism can be related to a positive feedback model, which may be responsible for increased anaerobic survival of the USA300 clone during infection-related processes.
The human RHOX gene cluster: target genes and functional analysis of gene variants in infertile men.

PubMed

Borgmann, Jennifer; Tüttelmann, Frank; Dworniczak, Bernd; Röpke, Albrecht; Song, Hye-Won; Kliesch, Sabine; Wilkinson, Miles F; Laurentino, Sandra; Gromoll, Jörg

2016-11-15

The X-linked reproductive homeobox (RHOX) gene cluster encodes transcription factors preferentially expressed in reproductive tissues. This gene cluster has important roles in male fertility based on phenotypic defects of Rhox-mutant mice and the finding that aberrant RHOX promoter methylation is strongly associated with abnormal human sperm parameters. However, little is known about the molecular mechanism of RHOX function in humans. Using gene expression profiling, we identified genes regulated by members of the human RHOX gene cluster. Some genes were uniquely regulated by RHOXF1 or RHOXF2/2B, while others were regulated by both of these transcription factors. Several of these regulated genes encode proteins involved in processes relevant to spermatogenesis; e.g. stress protection and cell survival. One of the target genes of RHOXF2/2B is RHOXF1, suggesting cross-regulation to enhance transcriptional responses. The potential role of RHOX in human infertility was addressed by sequencing all RHOX exons in a group of 250 patients with severe oligozoospermia. This revealed two mutations in RHOXF1 (c.515G > A and c.522C > T) and four in RHOXF2/2B (-73C > G, c.202G > A, c.411C > T and c.679G > A), of which only one (c.202G > A) was found in a control group of men with normal sperm concentration. Functional analysis demonstrated that c.202G > A and c.679G > A significantly impaired the ability of RHOXF2/2B to regulate downstream genes. Molecular modelling suggested that these mutations alter RHOXF2/F2B protein conformation. By combining clinical data with in vitro functional analysis, we demonstrate how the X-linked RHOX gene cluster may function in normal human spermatogenesis and we provide evidence that it is impaired in human male fertility.
Gene Cluster Encoding Cholate Catabolism in Rhodococcus spp.

PubMed Central

Wilbrink, Maarten H.; Casabon, Israël; Stewart, Gordon R.; Liu, Jie; van der Geize, Robert; Eltis, Lindsay D.

2012-01-01

Bile acids are highly abundant steroids with important functions in vertebrate digestion. Their catabolism by bacteria is an important component of the carbon cycle, contributes to gut ecology, and has potential commercial applications. We found that Rhodococcus jostii RHA1 grows well on cholate, as well as on its conjugates, taurocholate and glycocholate. The transcriptome of RHA1 growing on cholate revealed 39 genes upregulated on cholate, occurring in a single gene cluster. Reverse transcriptase quantitative PCR confirmed that selected genes in the cluster were upregulated 10-fold on cholate versus on cholesterol. One of these genes, kshA3, encoding a putative 3-ketosteroid-9α-hydroxylase, was deleted and found essential for growth on cholate. Two coenzyme A (CoA) synthetases encoded in the cluster, CasG and CasI, were heterologously expressed. CasG was shown to transform cholate to cholyl-CoA, thus initiating side chain degradation. CasI was shown to form CoA derivatives of steroids with isopropanoyl side chains, likely occurring as degradation intermediates. Orthologous gene clusters were identified in all available Rhodococcus genomes, as well as that of Thermomonospora curvata. Moreover, Rhodococcus equi 103S, Rhodococcus ruber Chol-4 and Rhodococcus erythropolis SQ1 each grew on cholate. In contrast, several mycolic acid bacteria lacking the gene cluster were unable to grow on cholate. Our results demonstrate that the above-mentioned gene cluster encodes cholate catabolism and is distinct from a more widely occurring gene cluster encoding cholesterol catabolism. PMID:23024343
Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters.

PubMed

Hensman, James; Lawrence, Neil D; Rattray, Magnus

2013-08-20

Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in Python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
Analysis of bHLH coding genes using gene co-expression network approach.

PubMed

Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

2016-07-01

Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These could be used to identify modules of highly connected genes showing variation in a co-expression network. In this study, a co-expression network-based approach was used for analyzing the genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were used to create and analyze a gene co-expression network. The analysis provided a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation of gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes showing multiple connections with other genes in the same cluster were identified. Seed genes were found to vary across different time periods of stress. These seed genes may be utilized further as marker genes for developing stress-tolerant plant species.
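The abstract does not spell out the rank-based construction, so the following is only one plausible reading, offered as a minimal sketch: correlate every pair of genes, rank each gene's partners by correlation, and keep an edge when two genes appear in each other's top-k list. The mutual-rank rule, the gene names and the expression values are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_based_network(expr, top_k=1):
    """Return edges between genes that rank each other within their top_k partners.

    expr maps a gene name to its list of expression values.
    """
    genes = list(expr)
    top = {}
    for g in genes:
        partners = sorted((h for h in genes if h != g),
                          key=lambda h: pearson(expr[g], expr[h]),
                          reverse=True)
        top[g] = set(partners[:top_k])
    return {frozenset((g, h)) for g in genes for h in top[g] if g in top[h]}

# Hypothetical profiles across four stress time points.
expr = {
    "bHLH_a": [1.0, 2.0, 3.0, 4.0],
    "bHLH_b": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.5, 2.0],
}
edges = rank_based_network(expr, top_k=1)
```

Working with ranks rather than a fixed correlation cutoff keeps the number of edges per gene bounded, which is one reason rank-based constructions are often described as robust.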
Rapid construction of a Bacterial Artificial Chromosomal (BAC) expression vector using designer DNA fragments.

PubMed

Chen, Chao; Zhao, Xinqing; Jin, Yingyu; Zhao, Zongbao Kent; Suh, Joo-Won

2014-11-01

Bacterial artificial chromosomal (BAC) vectors are increasingly being used in cloning large DNA fragments containing complex biosynthetic pathways to facilitate heterologous production of microbial metabolites for drug development. To express inserted genes using Streptomyces species as the production hosts, an integration expression cassette is required to be inserted into the BAC vector, which includes genetic elements encoding a phage-specific attachment site, an integrase, an origin of transfer, a selection marker and a promoter. Due to the large sizes of DNA inserted into the BAC vectors, it is normally inefficient and time-consuming to assemble these fragments by routine PCR amplifications and restriction-ligations. Here we present a rapid method to insert fragments to construct BAC-based expression vectors. A DNA fragment of about 130 bp was designed, which contains upstream and downstream homologous sequences of both BAC vector and pIB139 plasmid carrying the whole integration expression cassette. In-Fusion cloning was performed using the designer DNA fragment to modify pIB139, followed by λ-RED-mediated recombination to obtain the BAC-based expression vector. We demonstrated the effectiveness of this method by rapid construction of a BAC-based expression vector with an insert of about 120 kb that contains the entire gene cluster for biosynthesis of immunosuppressant FK506. The empty BAC-based expression vector constructed in this study can be conveniently used for construction of BAC libraries using either microbial pure culture or environmental DNA, and the selected BAC clones can be directly used for heterologous expression. Alternatively, if a BAC library has already been constructed using a commercial BAC vector, the selected BAC vectors can be manipulated using the method described here to get the BAC-based expression vectors with desired gene clusters for heterologous expression. 
The rapid construction of a BAC-based expression vector facilitates heterologous expression of large gene clusters for drug discovery. Copyright © 2014 Elsevier Inc. All rights reserved.
Growth promotion of the opportunistic human pathogen, Staphylococcus lugdunensis, by heme, hemoglobin, and coculture with Staphylococcus aureus

PubMed Central

Brozyna, Jeremy R; Sheldon, Jessica R; Heinrichs, David E

2014-01-01

Staphylococcus lugdunensis is both a commensal of humans and an opportunistic pathogen. Little is currently known about the molecular mechanisms underpinning the virulence of this bacterium. Here, we demonstrate that in contrast to S. aureus, S. lugdunensis makes neither staphyloferrin A (SA) nor staphyloferrin B (SB) in response to iron deprivation, owing to the absence of the SB gene cluster, and a large deletion in the SA biosynthetic gene cluster. As a result, the species grows poorly in serum-containing media, and this defect was complemented by introduction of the S. aureus SA gene cluster into S. lugdunensis. S. lugdunensis expresses the HtsABC and SirABC transporters for SA and SB, respectively; the latter gene set is found within the isd (heme acquisition) gene cluster. An isd deletion strain was significantly debilitated for iron acquisition from both heme and hemoglobin, and was also incapable of utilizing ferric-SB as an iron source, while an hts mutant could not grow on ferric-SA as an iron source. In iron-restricted coculture experiments, S. aureus significantly enhanced the growth of S. lugdunensis, in a manner dependent on staphyloferrin production by S. aureus, and the expression of the cognate transporters by S. lugdunensis. PMID:24515974
Genomic and Functional Analyses of the 2-Aminophenol Catabolic Pathway and Partial Conversion of Its Substrate into Picolinic Acid in Burkholderia xenovorans LB400

PubMed Central

Agulló, Loreine; González, Myriam; Seeger, Michael

2013-01-01

2-aminophenol (2-AP) is a toxic nitrogen-containing aromatic pollutant. Burkholderia xenovorans LB400 possesses an amn gene cluster that encodes the 2-AP catabolic pathway. In this report, the functionality of the 2-aminophenol pathway of B. xenovorans strain LB400 was analyzed. The amnRJBACDFEHG cluster located on chromosome 1 encodes the enzymes for the degradation of 2-aminophenol. The absence of habA and habB genes in the LB400 genome correlates with its inability to grow on nitrobenzene. RT-PCR analyses in strain LB400 showed the co-expression of amnJB, amnBAC, amnACD, amnDFE and amnEHG genes, suggesting that the amn cluster is an operon. RT-qPCR showed that the amnB gene expression was highly induced by 2-AP, whereas a basal constitutive expression was observed in glucose, indicating that these amn genes are regulated. We propose that the predicted MarR-type transcriptional regulator encoded by the amnR gene acts as a repressor of the amn gene cluster using a MarR-type regulatory binding sequence. This report showed that LB400 resting cells degrade 2-AP completely. The amn gene cluster from strain LB400 is highly identical to the amn gene cluster from P. knackmussii strain B13, which could not grow on 2-AP. However, we demonstrate that B. xenovorans LB400 is able to grow using 2-AP as sole nitrogen source and glucose as sole carbon source. An amnBA− mutant of strain LB400 was unable to degrade 2-AP or to grow with 2-AP as nitrogen source and glucose as carbon source. This study showed that during LB400 growth on 2-AP this substrate was partially converted into picolinic acid (PA), a well-known antibiotic. The addition of PA at lag or mid-exponential phase inhibited LB400 growth. The MIC of PA for strain LB400 is 2 mM. Overall, these results demonstrate that B. xenovorans strain LB400 possesses a functional 2-AP catabolic central pathway, which could lead to the production of picolinic acid. PMID:24124510
Differential Expression of Anthocyanin Biosynthetic Genes and Transcription Factor PcMYB10 in Pears (Pyrus communis L.)

PubMed Central

Li, Xi-Hong; Wu, Mao-Yu; Wang, Ai-Li; Jiang, Yu-Qian; Jiang, Yun-Hong

2012-01-01

Anthocyanin biosynthesis in various plants is affected by environmental conditions and controlled by the transcription level of the corresponding genes. In pears (Pyrus communis cv. ‘Wujiuxiang’), anthocyanin biosynthesis is significantly induced during low temperature storage compared with that at room temperature. We further examined the transcriptional levels of anthocyanin biosynthetic genes in ‘Wujiuxiang’ pears during developmental ripening and temperature-induced storage. The expression of genes that encode flavanone 3-hydroxylase, dihydroflavonol 4-reductase, anthocyanidin synthase, UDP-glucose: flavonoid 3-O-glucosyltransferase, and R2R3 MYB transcription factor (PcMYB10) was strongly positively correlated with anthocyanin accumulation in ‘Wujiuxiang’ pears in response to both developmental and cold-temperature induction. Hierarchical clustering analysis revealed the expression patterns of the set of target genes, of which PcMYB10 and most anthocyanin biosynthetic genes were related to the same cluster. The present work may help explore the molecular mechanism that regulates anthocyanin biosynthesis and its response to abiotic stress at the transcriptional level in plants. PMID:23029391
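To make the hierarchical clustering step concrete, here is a minimal sketch of average-linkage agglomerative clustering on a correlation distance. The abstract does not state which linkage or distance was used, so both are assumptions; the gene labels echo names from the abstract, but the numeric profiles and the reference gene are invented for illustration only.

```python
import math

def corr_distance(x, y):
    """1 - Pearson correlation, so co-varying profiles are close together."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)

def average_linkage(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        # Average pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(corr_distance(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        index_pairs = [(i, j) for i in range(len(clusters))
                       for j in range(i + 1, len(clusters))]
        i, j = min(index_pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Made-up relative expression at four storage time points.
profiles = {
    "PcMYB10": [1.0, 2.0, 4.0, 8.0],
    "ANS": [1.0, 3.0, 5.0, 9.0],
    "UFGT": [2.0, 3.0, 6.0, 10.0],
    "reference_gene": [5.0, 4.0, 3.0, 3.0],
}
result = average_linkage(profiles, n_clusters=2)
```

With these invented profiles the three rising genes end up in one cluster and the declining reference gene in another, mirroring the kind of grouping the abstract reports for PcMYB10 and the biosynthetic genes.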
Integrating microarray analysis and the soybean genome to understand the soybeans iron deficiency response

PubMed Central

2009-01-01

Background Soybeans grown in the upper Midwestern United States often suffer from iron deficiency chlorosis, which results in yield loss at the end of the season. To better understand the effect of iron availability on soybean yield, we identified genes in two near isogenic lines with changes in expression patterns when plants were grown in iron sufficient and iron deficient conditions. Results Transcriptional profiles of soybean (Glycine max, L. Merr) near isogenic lines Clark (PI548553, iron efficient) and IsoClark (PI547430, iron inefficient) grown under Fe-sufficient and Fe-limited conditions were analyzed and compared using the Affymetrix® GeneChip® Soybean Genome Array. There were 835 candidate genes in the Clark (PI548553) genotype and 200 candidate genes in the IsoClark (PI547430) genotype putatively involved in soybean's iron stress response. Of these candidate genes, fifty-eight genes in the Clark genotype were identified with a genetic location within known iron efficiency QTL and 21 in the IsoClark genotype. The arrays also identified 170 single feature polymorphisms (SFPs) specific to either Clark or IsoClark. A sliding window analysis of the microarray data and the 7X genome assembly coupled with an iterative model of the data showed the candidate genes are clustered in the genome. An analysis of 5' untranslated regions in the promoter of candidate genes identified 11 conserved motifs in 248 differentially expressed genes, all from the Clark genotype, representing 129 clusters identified earlier, confirming the cluster analysis results. Conclusion These analyses have identified the first genes with expression patterns that are affected by iron stress and are located within QTL specific to iron deficiency stress. The genetic location and promoter motif analysis results support the hypothesis that the differentially expressed genes are co-regulated. 
The combined results of all analyses lead us to postulate iron inefficiency in soybean is a result of a mutation in a transcription factor(s), which controls the expression of genes required in inducing an iron stress response. PMID:19678937
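The sliding window idea in this abstract can be sketched as counting candidate genes in overlapping intervals along a chromosome and flagging windows that hold unusually many. The window size, step, density cutoff and gene positions below are hypothetical, and the iterative model mentioned in the abstract is not reproduced.

```python
def sliding_window_counts(positions, chrom_length, window, step):
    """Count genes whose start position falls in each window [start, start + window)."""
    counts = []
    start = 0
    while start < chrom_length:
        end = start + window
        counts.append((start, sum(start <= p < end for p in positions)))
        start += step
    return counts

# Hypothetical start positions (bp) of differentially expressed genes.
positions = [120, 150, 180, 900, 2400, 2450, 2480, 2490]
counts = sliding_window_counts(positions, chrom_length=3000, window=500, step=250)

# Windows holding at least three candidate genes are treated as clusters here.
dense = [start for start, n in counts if n >= 3]
```

In practice the cutoff would come from a null model of gene density rather than a fixed count; the fixed threshold is used only to keep the sketch short.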
Evolution of coding and non-coding genes in HOX clusters of a marsupial.

PubMed

Yu, Hongshi; Lindsay, James; Feng, Zhi-Ping; Frankenberg, Stephen; Hu, Yanqiu; Carone, Dawn; Shaw, Geoff; Pask, Andrew J; O'Neill, Rachel; Papenfuss, Anthony T; Renfree, Marilyn B

2012-06-18

The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predate the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.
Evolution of coding and non-coding genes in HOX clusters of a marsupial

PubMed Central

2012-01-01

Background The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predate the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial. PMID:22708672
Differential Retention of Gene Functions in a Secondary Metabolite Cluster.

PubMed

Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W

2017-08-01

In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
Medicago truncatula shows distinct patterns of mycorrhiza-related gene expression after inoculation with three different arbuscular mycorrhizal fungi.

PubMed

Feddermann, Nadja; Boller, Thomas; Salzer, Peter; Elfstrand, Sara; Wiemken, Andres; Elfstrand, Malin

2008-02-01

Different arbuscular mycorrhizal fungi (AMF) alter growth and nutrition of a given plant differently. Plant gene expression patterns in response to fungal colonization show a certain overlap when colonized by fungi of the Glomeraceae. However, little is known of plant responses to fungi of different fungal taxa, e.g. the Gigasporaceae. We therefore compared the impact of colonization by three taxonomically different AMF species (Glomus intraradices, Glomus mosseae and Scutellospora castanea) on Medicago truncatula at the physiological and transcriptional level using quantitative-PCR. Each AMF developed a species-typical colonization pattern, with a colonization degree of 60% for G. intraradices and 30% for G. mosseae. Both species developed appressoria, intraradical hyphae, arbuscules and vesicles. S. castanea showed a colonization degree of 10% and developed appressoria, intraradical hyphae, arbuscules and arbusculate coils. All AMF enhanced the plant biomass accumulation and nutritional status although not in correlation with the colonization degree. The expression of 10 mycorrhiza-specific or mycorrhiza-associated plant genes could be separated into two clusters. The first cluster, containing arbuscule-induced genes, was highly induced in interactions with G. intraradices and G. mosseae but also slightly induced by S. castanea. The second cluster of genes contained genes that were induced primarily by S. castanea. In conclusion, genes that respond to colonization by fungi of the genus Glomus also respond to Scutellospora. However, there is also a group of genes that is significantly induced only by Scutellospora and not by Glomus species in this study. Our data indicate that genes may be differentially regulated in response to the different AM fungi.
Comparative Analysis of Tocopherol Biosynthesis Genes and Its Transcriptional Regulation in Soybean Seeds.

PubMed

T, Vinutha; Bansal, Navita; Kumari, Khushboo; Prashat G, Rama; Sreevathsa, Rohini; Krishnan, Veda; Kumari, Sweta; Dahuja, Anil; Lal, S K; Sachdev, Archana; Praveen, Shelly

2017-12-20

Tocopherols comprise four isoforms (α, β, γ, and δ), and their biosynthesis involves three pathways, the methylerythritol 4-phosphate (MEP), shikimate (SK) and tocopherol-core pathways, regulated by 25 enzymes. To understand the pathway regulatory mechanism at the transcriptional level, expression profiling of tocopherol-biosynthesis genes was carried out in two soybean genotypes. The results showed significant differential expression of 5 genes: 1-deoxy-d-xylulose-5-P-reductoisomerase (DXR) and geranyl geranyl reductase (GGDR) from the MEP pathway, arogenate dehydrogenase (TyrA) and tyrosine aminotransferase (TAT) from the SK pathway, and γ-tocopherol methyl transferase 3 (γ-TMT3) from the tocopherol-core pathway. Expression data were further analyzed against total tocopherol (T-toc) and α-tocopherol (α-toc) content by coregulation network and gene clustering approaches. The results showed the weakest and the strongest association of γ-TMT3/tocopherol cyclase (TC) and DXR/DXS, respectively, with gene clusters of tocopherol biosynthesis, suggesting a specific role of γ-TMT3/TC in determining tocopherol accumulation and an intricate role of the DXR/DXS genes in coordinating precursor pathways toward tocopherol biosynthesis in soybean seeds. Thus, the present study provides insight into the major role of these genes in regulating tocopherol synthesis in soybean seeds.
Female Drosophila melanogaster gene expression and mate choice: the X chromosome harbours candidate genes underlying sexual isolation.

PubMed

Bailey, Richard I; Innocenti, Paolo; Morrow, Edward H; Friberg, Urban; Qvarnström, Anna

2011-02-28

The evolution of female choice mechanisms favouring males of their own kind is considered a crucial step during the early stages of speciation. However, although the genomics of mate choice may influence both the likelihood and speed of speciation, the identity and location of genes underlying assortative mating remain largely unknown. We used mate choice experiments and gene expression analysis of female Drosophila melanogaster to examine three key components influencing speciation. We show that the 1,498 genes in Zimbabwean female D. melanogaster whose expression levels differ when mating with more (Zimbabwean) versus less (Cosmopolitan strain) preferred males include many with high expression in the central nervous system and ovaries, are disproportionately X-linked and form a number of clusters with low recombination distance. Significant involvement of the brain and ovaries is consistent with the action of a combination of pre- and postcopulatory female choice mechanisms, while sex linkage and clustering of genes lead to high potential evolutionary rate and sheltering against the homogenizing effects of gene exchange between populations. Taken together our results imply favourable genomic conditions for the evolution of reproductive isolation through mate choice in Zimbabwean D. melanogaster and suggest that mate choice may, in general, act as an even more important engine of speciation than previously realized.
The WRKY transcription factor family and senescence in switchgrass.

PubMed

Rinerson, Charles I; Scully, Erin D; Palmer, Nathan A; Donze-Reiner, Teresa; Rabara, Roel C; Tripathi, Prateek; Shen, Qingxi J; Sattler, Scott E; Rohila, Jai S; Sarath, Gautam; Rushton, Paul J

2015-11-09

Early aerial senescence in switchgrass (Panicum virgatum) can significantly limit biomass yields. WRKY transcription factors that can regulate senescence could be used to reprogram senescence and enhance biomass yields. All potential WRKY genes present in the version 1.0 of the switchgrass genome were identified and curated using manual and bioinformatic methods. Expression profiles of WRKY genes in switchgrass flag leaf RNA-Seq datasets were analyzed using clustering and network analyses tools to identify both WRKY and WRKY-associated gene co-expression networks during leaf development and senescence onset. We identified 240 switchgrass WRKY genes including members of the RW5 and RW6 families of resistance proteins. Weighted gene co-expression network analysis of the flag leaf transcriptomes across development readily separated clusters of co-expressed genes into thirteen modules. A visualization highlighted separation of modules associated with the early and senescence-onset phases of flag leaf growth. The senescence-associated module contained 3000 genes including 23 WRKYs. Putative promoter regions of senescence-associated WRKY genes contained several cis-element-like sequences suggestive of responsiveness to both senescence and stress signaling pathways. A phylogenetic comparison of senescence-associated WRKY genes from switchgrass flag leaf with senescence-associated WRKY genes from other plants revealed notable hotspots in Group I, IIb, and IIe of the phylogenetic tree. We have identified and named 240 WRKY genes in the switchgrass genome. Twenty three of these genes show elevated mRNA levels during the onset of flag leaf senescence. Eleven of the WRKY genes were found in hotspots of related senescence-associated genes from multiple species and thus represent promising targets for future switchgrass genetic improvement. Overall, individual WRKY gene expression profiles could be readily linked to developmental stages of flag leaves.
A method to identify differential expression profiles of time-course gene data with Fourier transformation

PubMed Central

2013-01-01

Background Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis. Results This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization. The key elements of the proposed methodology are: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles. Conclusions Using this method, we identified a set of cell-cycle-regulated time-course yeast genes.
The proposed method is general and can be potentially used to identify genes which have the same patterns or biological processes, and help facing the present and forthcoming challenges of data analysis in functional genomics. PMID:24134721
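The first two elements of this procedure, Fourier representation and coefficient-based screening, can be sketched as follows. This is a simplified illustration under stated assumptions: profiles are evenly sampled, and genes are screened with a fixed threshold on the energy of the non-constant Fourier coefficients. The paper's actual test statistic, its autocorrelation handling and its FDR control are not reproduced, and the gene names, values and threshold are hypothetical.

```python
import cmath

def dft(profile):
    """Discrete Fourier transform of an evenly sampled profile, scaled by 1/n."""
    n = len(profile)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(profile)) / n
        for k in range(n)
    ]

def periodic_energy(profile):
    """Sum of squared magnitudes of all coefficients except the constant term."""
    coeffs = dft(profile)
    return sum(abs(c) ** 2 for c in coeffs[1:])

# Made-up expression profiles over eight time points.
profiles = {
    "flat_gene": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    "cycling_gene": [5.0, 6.0, 5.0, 4.0, 5.0, 6.0, 5.0, 4.0],
}

# Hypothetical cutoff: genes below it are treated as not varying over time.
threshold = 0.1
kept = [gene for gene, p in profiles.items() if periodic_energy(p) > threshold]
```

The genes that survive screening would then be clustered on their Fourier coefficients rather than on the raw time points, which corresponds to step (iii) of the abstract.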
Secondary metabolite gene expression and interplay of bacterial functions in a tropical freshwater cyanobacterial bloom.

PubMed

Penn, Kevin; Wang, Jia; Fernando, Samodha C; Thompson, Janelle R

2014-09-01

Cyanobacterial harmful algal blooms (cyanoHABs) appear to be increasing in frequency on a global scale. The Cyanobacteria in blooms can produce toxic secondary metabolites that make freshwater dangerous for drinking and recreation. To characterize microbial activities in a cyanoHAB, transcripts from a eutrophic freshwater reservoir in Singapore were sequenced for six samples collected over one day-night period. Transcripts from the Cyanobacterium Microcystis dominated all samples and were accompanied by at least 533 genera primarily from the Cyanobacteria, Proteobacteria, Bacteroidetes and Actinobacteria. Within the Microcystis population, abundant transcripts were from genes for buoyancy, photosynthesis and synthesis of the toxin microviridin, suggesting that these are necessary for competitive dominance in the Reservoir. During the day, Microcystis transcripts were enriched in photosynthesis and energy metabolism while at night enriched pathways included DNA replication and repair and toxin biosynthesis. Microcystis was the dominant source of transcripts from polyketide and non-ribosomal peptide synthase (PKS and NRPS, respectively) gene clusters. Unexpectedly, expression of all PKS/NRPS gene clusters, including for the toxins microcystin and aeruginosin, occurred throughout the day-night cycle. The most highly expressed PKS/NRPS gene cluster from Microcystis is not associated with any known product. The four most abundant phyla in the reservoir were enriched in different functions, including photosynthesis (Cyanobacteria), breakdown of complex organic molecules (Proteobacteria), glycan metabolism (Bacteroidetes) and breakdown of plant carbohydrates, such as cellobiose (Actinobacteria). These results provide the first estimate of secondary metabolite gene expression, functional partitioning and functional interplay in a freshwater cyanoHAB.

Transcriptional Changes in the Transition from Vegetative Cells to Asexual Development in the Model Fungus Aspergillus nidulans

PubMed Central

Garzia, Aitor; Etxebeste, Oier; Rodríguez-Romero, Julio; Fischer, Reinhard; Espeso, Eduardo A.

2013-01-01

Morphogenesis encompasses programmed changes in gene expression that lead to the development of specialized cell types. In the model fungus Aspergillus nidulans, asexual development involves the formation of characteristic cell types, collectively known as the conidiophore. With the aim of determining the transcriptional changes that occur upon induction of asexual development, we have applied massive mRNA sequencing to compare the expression pattern of 19-h-old submerged vegetative cells (hyphae) with that of similar hyphae after exposure to the air for 5 h. We found that the expression of 2,222 (20.3%) of the predicted 10,943 A. nidulans transcripts was significantly modified after air exposure, 2,035 being downregulated and 187 upregulated. The activation during this transition of genes that belong specifically to the asexual developmental pathway was confirmed. Another remarkable quantitative change occurred in the expression of genes involved in carbon or nitrogen primary metabolism. Genes participating in polar growth or sexual development were transcriptionally repressed, as were those belonging to the HogA/SakA stress response mitogen-activated protein (MAP) kinase pathway. We also identified significant expression changes in several genes purportedly involved in redox balance, transmembrane transport, secondary metabolite production, or transcriptional regulation, mainly binuclear-zinc cluster transcription factors. Genes coding for these four activities were usually grouped in metabolic clusters, which may bring regulatory implications for the induction of asexual development. These results provide a blueprint for further stage-specific gene expression studies during conidiophore development. PMID:23264642
Transcriptome analysis identifies genes involved in ethanol response of Saccharomyces cerevisiae in Agave tequilana juice.

PubMed

Ramirez-Córdova, Jesús; Drnevich, Jenny; Madrigal-Pulido, Jaime Alberto; Arrizon, Javier; Allen, Kirk; Martínez-Velázquez, Moisés; Alvarez-Maya, Ikuri

2012-08-01

During ethanol fermentation, yeast cells are exposed to stress due to the accumulation of ethanol, cell growth is altered and the output of the target product is reduced. For Agave beverages, like tequila, no reports have been published on the global gene expression under ethanol stress. In this work, we used microarray analysis to identify Saccharomyces cerevisiae genes involved in the ethanol response. Gene expression of a tequila yeast strain of S. cerevisiae (AR5) was explored by comparing global gene expression with that of laboratory strain S288C, both after ethanol exposure. Additionally, we used two different culture conditions, cells grown in Agave tequilana juice as a natural fermentation media or grown in yeast-extract peptone dextrose as artificial media. Of the 6368 S. cerevisiae genes in the microarray, 657 genes were identified that had different expression responses to ethanol stress due to strain and/or media. A cluster of 28 genes was found over-expressed specifically in the AR5 tequila strain that could be involved in the adaptation to tequila yeast fermentation, 14 of which are unknown such as yor343c, ylr162w, ygr182c, ymr265c, yer053c-a or ydr415c. These could be the most suitable genes for transforming tequila yeast to increase ethanol tolerance in the tequila fermentation process. Other genes involved in response to stress (RFC4, TSA1, MLH1, PAU3, RAD53) or transport (CYB2, TIP20, QCR9) were expressed in the same cluster. Unknown genes could be good candidates for the development of recombinant yeasts with ethanol tolerance for use in industrial tequila fermentation.
Two Virus-Induced MicroRNAs Known Only from Teleost Fishes Are Orthologues of MicroRNAs Involved in Cell Cycle Control in Humans

PubMed Central

Schyth, Brian Dall; Bela-ong, Dennis Berbulla; Jalali, Seyed Amir Hossein; Kristensen, Lasse Bøgelund Juel; Einer-Jensen, Katja; Pedersen, Finn Skou; Lorenzen, Niels

2015-01-01

MicroRNAs (miRNAs) are ~22 base pair-long non-coding RNAs which regulate gene expression in the cytoplasm of eukaryotic cells by binding to specific target regions in mRNAs to mediate transcriptional blocking or mRNA cleavage. Through their fundamental roles in cellular pathways, gene regulation mediated by miRNAs has been shown to be involved in almost all biological phenomena, including development, metabolism, cell cycle, tumor formation, and host-pathogen interactions. To address the latter in a primitive vertebrate host, we here used an array platform to analyze the miRNA response in rainbow trout (Oncorhynchus mykiss) following inoculation with the virulent fish rhabdovirus Viral hemorrhagic septicaemia virus. Two clustered miRNAs, miR-462 and miR-731 (herein referred to as miR-462 cluster), described only in teleost fishes, were found to be strongly upregulated, indicating their involvement in fish-virus interactions. We searched for homologues of the two teleost miRNAs in other vertebrate species and investigated whether findings related to ours have been reported for these homologues. Gene synteny analysis along with gene sequence conservation suggested that the teleost fish miR-462 and miR-731 had evolved from the ancestral miR-191 and miR-425 (herein called miR-191 cluster), respectively. Whereas the miR-462 cluster locus is found between two protein-coding genes (intergenic) in teleost fish genomes, the miR-191 cluster locus is found within an intron of a protein-coding gene (intragenic) in the human genome. Interferon (IFN)-inducible and immune-related promoter elements found upstream of the teleost miR-462 cluster locus suggested roles in immune responses to viral pathogens in fish, while in humans, the miR-191 cluster is functionally associated with cell cycle regulation. Stimulation of fish cell cultures with the IFN inducer poly I:C accordingly upregulated the expression of miR-462 and miR-731, while no stimulatory effect on miR-191 and miR-425 expression was observed in human cell lines. Despite high sequence conservation, evolution has thus resulted in different regulation and presumably also different functional roles of these orthologous miRNA clusters in different vertebrate lineages. PMID:26207374
TU-CD-BRB-12: Radiogenomics of MRI-Guided Prostate Cancer Biopsy Habitats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stoyanova, R; Lynne, C; Abraham, S

2015-06-15

Purpose: Diagnostic prostate biopsies are subject to sampling bias. We hypothesize that quantitative imaging with multiparametric (MP)-MRI can more accurately direct targeted biopsies to index lesions associated with highest risk clinical and genomic features. Methods: Regionally distinct prostate habitats were delineated on MP-MRI (T2-weighted, perfusion and diffusion imaging). Directed biopsies were performed on 17 habitats from 6 patients using MRI-ultrasound fusion. Biopsy location was characterized with 52 radiographic features. Transcriptome-wide analysis of 1.4 million RNA probes was performed on RNA from each habitat. Genomic features with insignificant expression values (<0.25) and interquartile range <0.5 were filtered, leaving a total of 212 genes. Correlation between imaging features, genes and a 22-feature genomic classifier (GC), developed as a prognostic assay for metastasis after radical prostatectomy, was investigated. Results: High quality genomic data was derived from 17 (100%) biopsies. Using the 212 ‘unbiased’ genes, the samples clustered by patient origin in unsupervised analysis. When only prostate cancer related genomic features were used, hierarchical clustering revealed samples clustered by needle-biopsy Gleason score (GS). Similarly, principal component analysis of the imaging features found that the primary source of variance segregated the samples into high (≥7) and low (6) GS. Pearson’s correlation analysis of genes with significant expression showed two main patterns of gene expression clustering prostate peripheral and transitional zone MRI features. Two-way hierarchical clustering of GC with radiomics features resulted in the expected groupings of high and low expressed genes in this metastasis signature. Conclusions: MP-MRI-targeted diagnostic biopsies can potentially improve risk stratification by directing pathological and genomic analysis to clinically significant index lesions. As determinant lesions are more reliably identified, targeting with radiotherapy should improve outcome. This is the first demonstration of a link between quantitative imaging features (radiomics) with genomic features in MRI-directed prostate biopsies. The research was supported by NIH-NCI R01 CA 189295 and R01 CA 189295; E Davicioni is partial owner of GenomeDx Biosciences, Inc. M Takhar, N Erho, L Lam, C Buerki and E Davicioni are current employees at GenomeDx Biosciences, Inc.
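The feature-filtering step described in this abstract (an expression threshold of 0.25 and an interquartile-range threshold of 0.5) can be illustrated with a minimal sketch. The abstract does not say how expression is summarized per feature or whether a feature must fail one or both criteria to be removed; the sketch assumes the median is used and that failing either criterion removes the feature. The function name, gene names and values are hypothetical and are not data from the study.

```python
from statistics import median, quantiles

def keep_feature(values, min_expression=0.25, min_iqr=0.5):
    """Return True if a feature passes both filters.

    Assumptions (not stated in the abstract): expression is summarized by
    the median across samples, and failing either criterion drops the feature.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values) >= min_expression and (q3 - q1) >= min_iqr

# Hypothetical expression values across five biopsy samples.
features = {
    "gene_a": [0.1, 0.2, 0.15, 0.1, 0.2],  # low expression: dropped
    "gene_b": [1.0, 1.1, 1.0, 1.05, 1.1],  # expressed but nearly constant: dropped
    "gene_c": [0.5, 1.5, 2.5, 3.5, 4.5],   # expressed and variable: kept
}

kept = [name for name, values in features.items() if keep_feature(values)]
print(kept)
```

Filtering on variability before clustering is a common way to reduce noise, since features that barely change across samples contribute little to sample-to-sample distances.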
Mutation of the RDR1 gene caused genome-wide changes in gene expression, regional variation in small RNA clusters and localized alteration in DNA methylation in rice.

PubMed

Wang, Ningning; Zhang, Di; Wang, Zhenhui; Xun, Hongwei; Ma, Jian; Wang, Hui; Huang, Wei; Liu, Ying; Lin, Xiuyun; Li, Ning; Ou, Xiufang; Zhang, Chunyu; Wang, Ming-Bo; Liu, Bao

2014-06-30

Endogenous small (sm) RNAs (primarily si- and miRNAs) are important trans/cis-acting regulators involved in diverse cellular functions. In plants, the RNA-dependent RNA polymerases (RDRs) are essential for smRNA biogenesis. It has been established that RDR2 is involved in the 24 nt siRNA-dependent RNA-directed DNA methylation (RdDM) pathway. Recent studies have suggested that RDR1 is involved in a second RdDM pathway that relies mostly on 21 nt smRNAs and functions to silence a subset of genomic loci that are usually refractory to the normal RdDM pathway in Arabidopsis. Whether and to what extent the homologs of RDR1 may have similar functions in other plants remained unknown. We characterized a loss-of-function mutant (Osrdr1) of the OsRDR1 gene in rice (Oryza sativa L.) derived from a retrotransposon Tos17 insertion. Microarray analysis identified 1,175 differentially expressed genes (5.2% of all expressed genes in the shoot-tip tissue of rice) between Osrdr1 and WT, of which 896 and 279 genes were up- and down-regulated, respectively, in Osrdr1. smRNA sequencing revealed regional alterations in smRNA clusters across the rice genome. Some of the regions with altered smRNA clusters were associated with changes in DNA methylation. In addition, altered expression of several miRNAs was detected in Osrdr1, and at least some of which were associated with altered expression of predicted miRNA target genes. Despite these changes, no phenotypic difference was identified in Osrdr1 relative to WT under normal condition; however, ephemeral phenotypic fluctuations occurred under some abiotic stress conditions. Our results showed that OsRDR1 plays a role in regulating a substantial number of endogenous genes with diverse functions in rice through smRNA-mediated pathways involving DNA methylation, and which participates in abiotic stress response.
Gene selection and cancer type classification of diffuse large-B-cell lymphoma using a bivariate mixture model for two-species data.

PubMed

Su, Yuhua; Nielsen, Dahlia; Zhu, Lei; Richards, Kristy; Suter, Steven; Breen, Matthew; Motsinger-Reif, Alison; Osborne, Jason

2013-01-05

A bivariate mixture model utilizing information across two species was proposed to solve the fundamental problem of identifying differentially expressed genes in microarray experiments. The model utility was illustrated using a dog and human lymphoma data set prepared by a group of scientists in the College of Veterinary Medicine at North Carolina State University. A small number of genes were identified as being differentially expressed in both species and the human genes in this cluster serve as a good predictor for classifying diffuse large-B-cell lymphoma (DLBCL) patients into two subgroups, the germinal center B-cell-like diffuse large B-cell lymphoma and the activated B-cell-like diffuse large B-cell lymphoma. The number of human genes that were observed to be significantly differentially expressed (21) from the two-species analysis was very small compared to the number of human genes (190) identified with only one-species analysis (human data). The genes may be clinically relevant/important, as this small set achieved low misclassification rates of DLBCL subtypes. Additionally, the two subgroups defined by this cluster of human genes had significantly different survival functions, indicating that the stratification based on gene-expression profiling using the proposed mixture model provided improved insight into the clinical differences between the two cancer subtypes.
Epigenetic repression of HOXB cluster in oral cancer cell lines.

PubMed

Xavier, Flávia Caló Aquino; Destro, Maria Fernanda de Souza Setubal; Duarte, Carina Magalhães Esteves; Nunes, Fabio Daumas

2014-08-01

Aberrant DNA methylation is a fundamental transcriptional control mechanism in carcinogenesis. The expression of homeobox genes is usually controlled by an epigenetic mechanism, such as the methylation of CpG islands in the promoter region. The aim of this study was to describe the differential methylation pattern of HOX genes in oral squamous cell carcinoma (OSCC) cell lines and transcript status in a group of hypermethylated and hypomethylated genes. Quantitative analysis of DNA methylation was performed on two OSCC cell lines (SCC4 and SCC9) using Human Homeobox Genes EpiTect Methyl qPCR Arrays, which allowed fast, precise methylation detection of 24 HOX-specific genes without bisulfite conversion. Methylation greater than 50% was detected in HOXA11, HOXA6, HOXA7, HOXA9, HOXB1, HOXB2, HOXB3, HOXB4, HOXB5, HOXB6, HOXC8 and HOXD10. Both cell lines demonstrated similar hypermethylation status for eight HOX genes. A similar pattern of promoter hypermethylation and hypomethylation was demonstrated for the HOXB cluster and HOXA cluster, respectively. Moreover, the hypermethylation profile of the HOXB cluster, especially HOXB4, was correlated with decreased transcript expression, which was restored following treatment with 5-aza-2'-deoxycytidine. The homeobox methylation profile in OSCC cell lines is consistent with an epigenetic biomarker. Copyright © 2014 Elsevier Ltd. All rights reserved.
Spatial enhancer clustering and regulation of enhancer-proximal genes by cohesin

PubMed Central

Ing-Simmons, Elizabeth; Seitan, Vlad C.; Faure, Andre J.; Flicek, Paul; Carroll, Thomas; Dekker, Job; Fisher, Amanda G.; Lenhard, Boris

2015-01-01

In addition to mediating sister chromatid cohesion during the cell cycle, the cohesin complex associates with CTCF and with active gene regulatory elements to form long-range interactions between its binding sites. Genome-wide chromosome conformation capture has shown that cohesin's main role in interphase genome organization is in mediating interactions within architectural chromosome compartments, rather than specifying compartments per se. However, it remains unclear how cohesin-mediated interactions contribute to the regulation of gene expression. We have found that the binding of CTCF and cohesin is highly enriched at enhancers and in particular at enhancer arrays or “super-enhancers” in mouse thymocytes. Using local and global chromosome conformation capture, we demonstrate that enhancer elements associate not just in linear sequence, but also in 3D, and that spatial enhancer clustering is facilitated by cohesin. The conditional deletion of cohesin from noncycling thymocytes preserved enhancer position, H3K27ac, H3K4me1, and enhancer transcription, but weakened interactions between enhancers. Interestingly, ∼50% of deregulated genes reside in the vicinity of enhancer elements, suggesting that cohesin regulates gene expression through spatial clustering of enhancer elements. We propose a model for cohesin-dependent gene regulation in which spatial clustering of enhancer elements acts as a unified mechanism for both enhancer-promoter “connections” and “insulation.” PMID:25677180
Transcriptome analysis of salinity stress responses in common wheat using a 22k oligo-DNA microarray.

PubMed

Kawaura, Kanako; Mochida, Keiichi; Yamazaki, Yukiko; Ogihara, Yasunari

2006-04-01

In this study, we constructed a 22k wheat oligo-DNA microarray. A total of 148,676 expressed sequence tags of common wheat were collected from the database of the Wheat Genomics Consortium of Japan. These were grouped into 34,064 contigs, which were then used to design an oligonucleotide DNA microarray. Following a multistep selection of the sense strand, 21,939 60-mer oligo-DNA probes were selected for attachment on the microarray slide. This 22k oligo-DNA microarray was used to examine the transcriptional response of wheat to salt stress. More than 95% of the probes gave reproducible hybridization signals when targeted with RNAs extracted from salt-treated wheat shoots and roots. With the microarray, we identified 1,811 genes whose expressions changed more than 2-fold in response to salt. These included genes known to mediate response to salt, as well as unknown genes, and they were classified into 12 major groups by hierarchical clustering. These gene expression patterns were also confirmed by real-time reverse transcription-PCR. Many of the genes with unknown function were clustered together with genes known to be involved in response to salt stress. Thus, analysis of gene expression patterns combined with gene ontology should help identify the function of the unknown genes. Also, functional analysis of these wheat genes should provide new insight into the response to salt stress. Finally, these results indicate that the 22k oligo-DNA microarray is a reliable method for monitoring global gene expression patterns in wheat.
Correlation of mRNA and protein levels: Cell type-specific gene expression of cluster designation antigens in the prostate

PubMed Central

Pascal, Laura E; True, Lawrence D; Campbell, David S; Deutsch, Eric W; Risk, Michael; Coleman, Ilsa M; Eichner, Lillian J; Nelson, Peter S; Liu, Alvin Y

2008-01-01

Background: Expression levels of mRNA and protein by cell types exhibit a range of correlations for different genes. In this study, we compared levels of mRNA abundance for several cluster designation (CD) genes determined by gene arrays using magnetic sorted and laser-capture microdissected human prostate cells with levels of expression of the respective CD proteins determined by immunohistochemical staining in the major cell types of the prostate – basal epithelial, luminal epithelial, stromal fibromuscular, and endothelial – and for prostate precursor/stem cells and prostate carcinoma cells. Immunohistochemical stains of prostate tissues from more than 50 patients were scored for informative CD antigen expression and compared with cell-type specific transcriptomes. Results: Concordance between gene and protein expression findings based on 'present' vs. 'absent' calls ranged from 46 to 68%. Correlation of expression levels was poor to moderate (Pearson correlations ranged from 0 to 0.63). Divergence between the two data types was most frequently seen for genes whose array signals exceeded background (> 50) but lacked immunoreactivity by immunostaining. This could be due to multiple factors, e.g. low levels of protein expression, technological sensitivities, sample processing, probe set definition or anatomical origin of tissue and actual biological differences between transcript and protein abundance. Conclusion: Agreement between these two very different methodologies has great implications for their respective use in both molecular studies and clinical trials employing molecular biomarkers. PMID:18501003
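The two agreement measures reported in this abstract, concordance of 'present' vs. 'absent' calls and Pearson correlation of expression levels, can be sketched as follows. The background cutoff of 50 for array signals is taken from the abstract; the immunostaining scale, the rule that any non-zero score counts as 'present', and all values are assumptions made for illustration only.

```python
from math import sqrt

def concordance(calls_a, calls_b):
    """Fraction of genes whose present/absent calls agree between two assays."""
    agree = sum(a == b for a, b in zip(calls_a, calls_b))
    return agree / len(calls_a)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for five CD genes in one cell type.
array_signal = [30, 120, 400, 75, 10]  # mRNA array signal
ihc_score = [0, 2, 3, 0, 0]            # assumed 0-3 immunostaining score

mrna_present = [signal > 50 for signal in array_signal]  # cutoff from the abstract
protein_present = [score > 0 for score in ihc_score]     # assumed rule

print(concordance(mrna_present, protein_present))
print(round(pearson(array_signal, ihc_score), 2))
```

The fourth gene in this toy data shows the divergence pattern the abstract highlights: an array signal above background with no immunoreactivity.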
Dysregulation and functional roles of miR-183-96-182 cluster in cancer cell proliferation, invasion and metastasis

PubMed Central

Ma, Yi; Liang, A-Juan; Fan, Yu-Ping; Huang, Yi-Ran; Zhao, Xiao-Ming; Sun, Yun; Chen, Xiang-Feng

2016-01-01

Previous studies have reported aberrant expression of the miR-183-96-182 cluster in a variety of tumors, which indicates its diagnostic or prognostic value. However, a key characteristic of the miR-183-96-182 cluster is its varied expression levels and pleomorphic functional roles in different tumors or under different conditions. In most tumor types, the cluster is highly expressed and promotes tumorigenesis, cancer progression and metastasis; yet tumor suppressive effects have also been reported in some tumors. In the present study, we discuss the upstream regulators and the downstream target genes of the miR-183-96-182 cluster, and highlight the dysregulation and functional roles of this cluster in various tumor cells. Newer insights summarized in this review will help readers understand the different facets of the miR-183-96-182 cluster in cancer development and progression. PMID:27081087
Identification of the acclimation genes in transcriptomic responses to heat stress of White Pekin duck.

PubMed

Kim, Jun-Mo; Lim, Kyu-Sang; Byun, Mijeong; Lee, Kyung-Tai; Yang, Young-Rok; Park, Mina; Lim, Dajeong; Chai, Han-Ha; Bang, Han-Tae; Hwangbo, Jong; Choi, Yang-Ho; Cho, Yong-Min; Park, Jong-Eun

2017-11-01

White Pekin duck is an important meat resource in the livestock industries. However, the temperature increase due to global warming has become a serious environmental factor in duck production, because of hyperthermia. Therefore, identifying the regulated genes and understanding the molecular mechanism for adaptation to the warmer environment will provide insightful information on the acclimation system of ducks. This study examined transcriptomic responses to heat stress treatments (3 and 6 h at 35 °C) and control (C, 25 °C) using RNA-sequencing analysis of genes from the breast muscle tissue. Based on three distinct differentially expressed gene (DEG) sets (3H/C, 6H/C, and 6H/3H), the expression patterns of significant DEGs (absolute log2 fold change > 1.0 and false discovery rate < 0.05) were clustered into three responsive gene groups divided into upregulated and downregulated genes. Next, we analyzed the clusters that showed relatively higher expression levels in 3H/C and lower levels in 6H/C with much lower or opposite levels in 6H/3H; we referred to these clusters as the adaptable responsive gene group. These genes were significantly enriched in the ErbB signaling pathway, neuroactive ligand-receptor interaction and type II diabetes mellitus in the KEGG pathways (P < 0.01). From the functional enrichment analysis and significantly regulated genes observed in the enriched pathways, we think that the adaptable responsive genes are responsible for the acclimation mechanism of ducks and suggest that the regulation of phosphoinositide 3-kinase genes including PIK3R6, PIK3R5, and PIK3C2B has an important relationship with the mechanisms of adaptation to heat stress in ducks.
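The significance criteria quoted in this abstract can be illustrated with a short sketch that labels each gene in one comparison as upregulated, downregulated or not significant. It assumes the "absolute log2" threshold refers to the log2 fold change; the function name, gene names and values are hypothetical and do not come from the study.

```python
def classify_deg(log2_fc, fdr, min_abs_log2_fc=1.0, max_fdr=0.05):
    """Label one gene using the two cutoffs quoted in the abstract.

    Assumption: the first cutoff applies to the absolute log2 fold change.
    """
    if fdr >= max_fdr or abs(log2_fc) <= min_abs_log2_fc:
        return "not_significant"
    return "up" if log2_fc > 0 else "down"

# Hypothetical (log2 fold change, FDR) pairs for a single comparison such as 3H/C.
results = {
    "gene_1": (2.3, 0.001),   # large increase, significant
    "gene_2": (-1.8, 0.020),  # large decrease, significant
    "gene_3": (0.4, 0.001),   # significant but small effect
    "gene_4": (3.1, 0.300),   # large effect but not significant
}

labels = {gene: classify_deg(fc, fdr) for gene, (fc, fdr) in results.items()}
print(labels)
```

Applying the same function to each of the three comparisons (3H/C, 6H/C, 6H/3H) would give a per-gene pattern of labels that could then be grouped into response clusters.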
Identification of hub subnetwork based on topological features of genes in breast cancer

PubMed Central

ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO

2015-01-01

The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data and analyzed to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape, and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that cluster1, as well as the cell cycle and oocyte meiosis pathways, were significant subnetworks in the analysis of degree and other centralities, and that hub nodes were mostly distributed within them. The most important hub nodes, with top-ranked centrality, were also similar to the common genes from the intersection of the above three subnetworks, which was viewed as a hub subnetwork that is more reproducible than individual critical genes selected without network information. This hub subnetwork was attributed to the same biological process, which is essential to cell growth and death. This increased the accuracy of identifying gene interactions that took place within the same functional process and was potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623
MicroRNA MiR-17 retards tissue growth and represses fibronectin expression.

PubMed

Shan, Sze Wan; Lee, Daniel Y; Deng, Zhaoqun; Shatseva, Tatiana; Jeyapalan, Zina; Du, William W; Zhang, Yaou; Xuan, Jim W; Yee, Siu-Pok; Siragam, Vinayakumar; Yang, Burton B

2009-08-01

MicroRNAs (miRNAs) are single-stranded regulatory RNAs, frequently expressed as clusters. Previous studies have demonstrated that the six-miRNA cluster miR-17~92 has important roles in tissue development and cancers. However, the precise role of each miRNA in the cluster is unknown. Here we show that overexpression of miR-17 results in decreased cell adhesion, migration and proliferation. Transgenic mice overexpressing miR-17 showed overall growth retardation, smaller organs and greatly reduced haematopoietic cell lineages. We found that fibronectin and the fibronectin type-III domain containing 3A (FNDC3A) are two targets that have their expression repressed by miR-17, both in vitro and in transgenic mice. Several lines of evidence support the notion that miR-17 causes cellular defects through its repression of fibronectin expression. Our single miRNA expression assay may be evolved to allow the manipulation of individual miRNA functions in vitro and in vivo. We anticipate that this could serve as a model for studying gene regulation by miRNAs in the development of gene therapy.
De novo transcriptome profiling of cold-stressed siliques during pod filling stages in Indian mustard (Brassica juncea L.)

PubMed Central

Sinha, Somya; Raxwal, Vivek K.; Joshi, Bharat; Jagannath, Arun; Katiyar-Agarwal, Surekha; Goel, Shailendra; Kumar, Amar; Agarwal, Manu

2015-01-01

Low temperature is a major abiotic stress that impedes plant growth and development. Brassica juncea is an economically important oil seed crop and is sensitive to freezing stress during pod filling subsequently leading to abortion of seeds. To understand the cold stress mediated global perturbations in gene expression, whole transcriptome of B. juncea siliques that were exposed to sub-optimal temperature was sequenced. Manually self-pollinated siliques at different stages of development were subjected to either short (6 h) or long (12 h) durations of chilling stress followed by construction of RNA-seq libraries and deep sequencing using Illumina's NGS platform. De-novo assembly of B. juncea transcriptome resulted in 133,641 transcripts, whose combined length was 117 Mb and N50 value was 1428 bp. We identified 13,342 differentially regulated transcripts by pair-wise comparison of 18 transcriptome libraries. Hierarchical clustering along with Spearman correlation analysis identified that the differentially expressed genes segregated in two major clusters representing early (5–15 DAP) and late stages (20–30 DAP) of silique development. Further analysis led to the discovery of sub-clusters having similar patterns of gene expression. Two of the sub-clusters (one each from the early and late stages) comprised of genes that were inducible by both the durations of cold stress. Comparison of transcripts from these clusters led to identification of 283 transcripts that were commonly induced by cold stress, and were referred to as “core cold-inducible” transcripts. Additionally, we found that 689 and 100 transcripts were specifically up-regulated by cold stress in early and late stages, respectively. We further explored the expression patterns of gene families encoding for transcription factors (TFs), transcription regulators (TRs) and kinases, and found that cold stress induced protein kinases only during early silique development. 
We validated the digital gene expression profiles of selected transcripts by qPCR and found a high degree of concordance between the two analyses. To our knowledge this is the first report of transcriptome sequencing of cold-stressed B. juncea siliques. The data generated in this study would be a valuable resource for not only understanding the cold stress signaling pathway but also for introducing cold hardiness in B. juncea. PMID:26579175
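The Spearman correlation analysis mentioned in this abstract can be sketched as a rank-based correlation between expression profiles, converted to a distance that a hierarchical clustering routine could consume. The library names and expression values below are hypothetical, and the use of 1 minus rho as the distance is an assumption rather than a detail given in the abstract.

```python
from math import sqrt

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Hypothetical expression of four transcripts in three libraries.
libraries = {
    "early_control": [5.0, 9.0, 1.0, 7.0],
    "early_cold": [6.0, 8.0, 2.0, 9.0],
    "late_control": [9.0, 1.0, 8.0, 2.0],
}

names = list(libraries)
distance = {
    (a, b): 1.0 - spearman(libraries[a], libraries[b])
    for a in names
    for b in names
}
for pair, value in distance.items():
    print(pair, round(value, 2))
```

Because Spearman's rho depends only on ranks, it is less sensitive than Pearson correlation to the skewed count distributions typical of RNA-seq data.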
Global transcriptome analysis of the C57BL/6J mouse testis by SAGE: evidence for nonrandom gene order.

PubMed

Divina, Petr; Vlcek, Cestmír; Strnad, Petr; Paces, Václav; Forejt, Jirí

2005-03-05

We generated the gene expression profile of the total testis from the adult C57BL/6J male mice using serial analysis of gene expression (SAGE). Two high-quality SAGE libraries containing a total of 76 854 tags were constructed. An extensive bioinformatic analysis and comparison of SAGE transcriptomes of the total testis, testicular somatic cells and other mouse tissues was performed and the theory of male-biased gene accumulation on the X chromosome was tested. We sorted out 829 genes predominantly expressed from the germinal part and 944 genes from the somatic part of the testis. The genes preferentially and specifically expressed in total testis and testicular somatic cells were identified by comparing the testis SAGE transcriptomes to the available transcriptomes of seven non-testis tissues. We uncovered chromosomal clusters of adjacent genes with preferential expression in total testis and testicular somatic cells by a genome-wide search and found that the clusters encompassed a significantly higher number of genes than expected by chance. We observed a significant 3.2-fold enrichment of the proportion of X-linked genes specific for testicular somatic cells, while the proportions of X-linked genes specific for total testis and for other tissues were comparable. In contrast to the tissue-specific genes, an under-representation of X-linked genes in the total testis transcriptome but not in the transcriptomes of testicular somatic cells and other tissues was detected. Our results provide new evidence in favor of the theory of male-biased genes accumulation on the X chromosome in testicular somatic cells and indicate the opposite action of the meiotic X-inactivation in testicular germ cells.
Global transcriptome analysis of the C57BL/6J mouse testis by SAGE: evidence for nonrandom gene order

PubMed Central

Divina, Petr; Vlček, Čestmír; Strnad, Petr; Pačes, Václav; Forejt, Jiří

2005-01-01

Background We generated the gene expression profile of the total testis from the adult C57BL/6J male mice using serial analysis of gene expression (SAGE). Two high-quality SAGE libraries containing a total of 76 854 tags were constructed. An extensive bioinformatic analysis and comparison of SAGE transcriptomes of the total testis, testicular somatic cells and other mouse tissues was performed and the theory of male-biased gene accumulation on the X chromosome was tested. Results We sorted out 829 genes predominantly expressed from the germinal part and 944 genes from the somatic part of the testis. The genes preferentially and specifically expressed in total testis and testicular somatic cells were identified by comparing the testis SAGE transcriptomes to the available transcriptomes of seven non-testis tissues. We uncovered chromosomal clusters of adjacent genes with preferential expression in total testis and testicular somatic cells by a genome-wide search and found that the clusters encompassed a significantly higher number of genes than expected by chance. We observed a significant 3.2-fold enrichment of the proportion of X-linked genes specific for testicular somatic cells, while the proportions of X-linked genes specific for total testis and for other tissues were comparable. In contrast to the tissue-specific genes, an under-representation of X-linked genes in the total testis transcriptome but not in the transcriptomes of testicular somatic cells and other tissues was detected. Conclusion Our results provide new evidence in favor of the theory of male-biased genes accumulation on the X chromosome in testicular somatic cells and indicate the opposite action of the meiotic X-inactivation in testicular germ cells. PMID:15748293
Secondary metabolism in Fusarium fujikuroi: strategies to unravel the function of biosynthetic pathways.

PubMed

Janevska, Slavica; Tudzynski, Bettina

2018-01-01

The fungus Fusarium fujikuroi causes bakanae disease of rice due to its ability to produce the plant hormones, the gibberellins. The fungus is also known for producing harmful mycotoxins (e.g., fusaric acid and fusarins) and pigments (e.g., bikaverin and fusarubins). However, for a long time, most of these well-known products could not be linked to biosynthetic gene clusters. Recent genome sequencing has revealed altogether 47 putative gene clusters. Most of them were orphan clusters for which the encoded natural product(s) were unknown. In this review, we describe the current status of our research on identification and functional characterizations of novel secondary metabolite gene clusters. We present several examples where linking known metabolites to the respective biosynthetic genes has been achieved and describe recent strategies and methods to access new natural products, e.g., by genetic manipulation of pathway-specific or global transcription factors. In addition, we demonstrate that deletion and over-expression of histone-modifying genes is a powerful tool to activate silent gene clusters and to discover their products.
Temporal gene expression profiling of the rat knee joint capsule during immobilization-induced joint contractures.

PubMed

Wong, Kayleigh; Sun, Fangui; Trudel, Guy; Sebastiani, Paola; Laneuville, Odette

2015-05-26

Contractures of the knee joint cause disability and handicap. Recovering range of motion is recognized by arthritic patients as their preference for improved health outcome secondary only to pain management. Clinical and experimental studies provide evidence that the posterior knee capsule prevents the knee from achieving full extension. This study was undertaken to investigate the dynamic changes of the joint capsule transcriptome during the progression of knee joint contractures induced by immobilization. We performed a microarray analysis of genes expressed in the posterior knee joint capsule following induction of a flexion contracture by rigidly immobilizing the rat knee joint over a time-course of 16 weeks. Fold changes of expression values were measured and co-expressed genes were identified by clustering based on time-series analysis. Genes associated with immobilization were further analyzed to reveal pathways and biological significance and validated by immunohistochemistry on sagittal sections of knee joints. Changes in expression with a minimum of 1.5 fold changes were dominated by a decrease in expression for 7732 probe sets occurring at week 8 while the expression of 2251 probe sets increased. Clusters of genes with similar profiles of expression included a total of 162 genes displaying at least a 2 fold change compared to week 1. Functional analysis revealed ontology categories corresponding to triglyceride metabolism, extracellular matrix and muscle contraction. The altered expression of selected genes involved in the triglyceride biosynthesis pathway; AGPAT-9, and of the genes P4HB and HSP47, both involved in collagen synthesis, was confirmed by immunohistochemistry. Gene expression in the knee joint capsule was sensitive to joint immobility and provided insights into molecular mechanisms relevant to the pathophysiology of knee flexion contractures. 
Capsule responses to immobilization were dynamic and characterized by modulation of at least three reaction pathways: downregulation of triglyceride biosynthesis, alteration of extracellular matrix degradation and muscle contraction gene expression. The posterior knee capsule may deploy tissue-specific patterns of mRNA regulatory responses to immobilization. The identification of altered expression of genes and biochemical pathways in the joint capsule provides potential targets for the therapy of knee flexion contractures.
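The fold-change filter mentioned in this abstract can be illustrated with a minimal Python sketch. The probe names, the intensity values, and the helper names `fold_change` and `classify_probes` are hypothetical; only the 1.5-fold threshold and the idea of comparing a later time point against a baseline come from the abstract, and the authors' actual pipeline is not described here.

```python
def fold_change(baseline, value):
    """Signed fold change: +x for an x-fold increase, -x for an x-fold decrease."""
    if baseline <= 0 or value <= 0:
        raise ValueError("expression values must be positive")
    ratio = value / baseline
    return ratio if ratio >= 1 else -1 / ratio


def classify_probes(baseline, later, threshold=1.5):
    """Split probe IDs into increased / decreased lists at the given threshold."""
    increased, decreased = [], []
    for probe, base in baseline.items():
        fc = fold_change(base, later[probe])
        if fc >= threshold:
            increased.append(probe)
        elif fc <= -threshold:
            decreased.append(probe)
    return increased, decreased


# Hypothetical linear-scale intensities (not data from the study).
week1 = {"probe_a": 100.0, "probe_b": 80.0, "probe_c": 50.0}
week8 = {"probe_a": 40.0, "probe_b": 90.0, "probe_c": 125.0}

up, down = classify_probes(week1, week8)
print("increased:", up)
print("decreased:", down)
```

A signed fold change is used so that increases and decreases are treated symmetrically; a log2 ratio would serve the same purpose.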
Bioinformatics, interaction network analysis, and neural networks to characterize gene expression of radicular cyst and periapical granuloma.

PubMed

Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena

2015-06-01

Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used to build the interaction map among the identified genes, scored by a significance measure named the weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complementary method for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
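The leader gene step (k-means on a single score per gene, then taking the highest cluster) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' validated algorithm: the gene names and scores are invented, the choice of k = 3 is arbitrary because the abstract does not give the number of clusters, and `kmeans_1d` and `leader_genes` are hypothetical helper names.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's k-means for scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def leader_genes(scores, k=3):
    """Return the genes assigned to the cluster with the largest centroid."""
    genes = list(scores)
    centroids, labels = kmeans_1d([scores[g] for g in genes], k)
    top = max(range(k), key=lambda c: centroids[c])
    return sorted(g for g, lab in zip(genes, labels) if lab == top)


# Hypothetical weighted-number-of-links scores (not data from the study).
wnl = {"gene_a": 9.5, "gene_b": 9.1, "gene_c": 4.2, "gene_d": 3.9,
       "gene_e": 0.6, "gene_f": 0.4, "gene_g": 0.2}
print(leader_genes(wnl))
```

The deterministic initialization keeps the sketch reproducible; library implementations typically use randomized starts with several restarts instead.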

Exploring root symbiotic programs in the model legume Medicago truncatula using EST analysis.

PubMed

Journet, Etienne-Pascal; van Tuinen, Diederik; Gouzy, Jérome; Crespeau, Hervé; Carreau, Véronique; Farmer, Mary-Jo; Niebel, Andreas; Schiex, Thomas; Jaillon, Olivier; Chatagnier, Odile; Godiard, Laurence; Micheli, Fabienne; Kahn, Daniel; Gianinazzi-Pearson, Vivienne; Gamas, Pascal

2002-12-15

We report on a large-scale expressed sequence tag (EST) sequencing and analysis program aimed at characterizing the sets of genes expressed in roots of the model legume Medicago truncatula during interactions with either of two microsymbionts, the nitrogen-fixing bacterium Sinorhizobium meliloti or the arbuscular mycorrhizal fungus Glomus intraradices. We have designed specific tools for in silico analysis of EST data, in relation to chimeric cDNA detection, EST clustering, encoded protein prediction, and detection of differential expression. Our 21 473 5'- and 3'-ESTs could be grouped into 6359 EST clusters, corresponding to distinct virtual genes, along with 52 498 other M. truncatula ESTs available in the dbEST (NCBI) database that were recruited in the process. These clusters were manually annotated, using a specifically developed annotation interface. Analysis of EST cluster distribution in various M. truncatula cDNA libraries, supported by a refined R test to evaluate statistical significance and by 'electronic northern' representation, enabled us to identify a large number of novel genes predicted to be up- or down-regulated during either symbiotic root interaction. These in silico analyses provide a first global view of the genetic programs for root symbioses in M. truncatula. A searchable database has been built and can be accessed through a public interface.
Heterologous expression of the avirulence gene ACE1 from the fungal rice pathogen Magnaporthe oryzae

PubMed Central

Song, Zhongshu; Bakeer, Walid; Marshall, James W.; Yakasai, Ahmed A.; Khalid, Rozida Mohd; Collemare, Jerome; Skellam, Elizabeth; Tharreau, Didier; Lebrun, Marc-Henri; Lazarus, Colin M.; Bailey, Andrew M.; Simpson, Thomas J.

2015-01-01

The ACE1 and RAP1 genes from the avirulence signalling gene cluster of the rice blast fungus Magnaporthe oryzae were expressed in Aspergillus oryzae and M. oryzae itself. Expression of ACE1 alone produced a polyenyl pyrone (magnaporthepyrone), which is regioselectively epoxidised and hydrolysed to give different diols, 6 and 7, in the two host organisms. Analysis of the three introns present in ACE1 determined that A. oryzae does not process intron 2 correctly, while M. oryzae processes all introns correctly in both appressoria and mycelia. Co-expression of ACE1 and RAP1 in A. oryzae produced an amide 8 which is similar to the PKS-NRPS derived backbone of the cytochalasans. Biological testing on rice leaves showed that neither the diols 6 and 7 nor amide 8 was responsible for the observed ACE1 mediated avirulence; however, gene cluster analysis suggests that the true avirulence signalling compound may be a tyrosine-derived cytochalasan compound. PMID:29142718
Unsupervised clustering of gene expression data points at hypoxia as possible trigger for metabolic syndrome.

PubMed

Ptitsyn, Andrey; Hulver, Matthew; Cefalu, William; York, David; Smith, Steven R

2006-12-19

Classification of large volumes of data produced in a microarray experiment allows for the extraction of important clues as to the nature of a disease. Using the multi-dimensional unsupervised FOREL (FORmal ELement) algorithm, we have re-analyzed three public datasets of skeletal muscle gene expression in connection with insulin resistance and type 2 diabetes (DM2). Our analysis revealed the major line of variation between expression profiles of normal, insulin resistant, and diabetic skeletal muscle. A cluster of the most "metabolically sound" samples occupied one end of this line. The distance along this line coincided with the classic markers of diabetes risk, namely obesity and insulin resistance, but did not follow the accepted clinical diagnosis of DM2 as defined by the presence or absence of hyperglycemia. Genes implicated in this expression pattern are those controlling skeletal muscle fiber type and glycolytic metabolism. Additionally, myoglobin and hemoglobin were upregulated and ribosomal genes deregulated in insulin resistant patients. Our findings are concordant with the changes seen in skeletal muscle with altitude hypoxia. This suggests that hypoxia and a shift to glycolytic metabolism may also drive insulin resistance.
THE EFFECT OF PROPRANOLOL ON GENE EXPRESSION DURING THE BLOOD ALCOHOL CYCLE OF RATS FED ETHANOL INTRAGASTRICALLY

PubMed Central

Li, Jun; Bardag-Gorce, F; Joan, Oliva; French, BA; Dedes, J; French, SW

2010-01-01

Propranolol, a beta adrenergic blocker, prevents the blood alcohol (BAL) cycle in rats fed ethanol intragastrically at a constant rate by preventing the cyclic changes in the metabolic rate caused by fluctuating levels of norepinephrine released into the blood. The change in the rate of metabolism changes the rate of alcohol elimination in the blood, which causes the BAL to cycle. Microarray analysis of the livers from the rats fed ethanol and propranolol showed similar changes in clusters of functionally related gene expressions. The controls and the trough of the cycle differed dramatically from the cluster pattern seen in the rats at the peaks of the blood alcohol cycle. The changes in gene expression induced by ethanol were similar to those seen when propranolol was fed without ethanol, especially the changes in the kinases and phosphatases; Toll-like receptor signaling and cytokine-cytokine receptor interaction were also changed. The changes in gene expression caused by ethanol and propranolol feeding are alike, probably because both drugs induce β adrenergic receptor desensitization. PMID:19925788
Identification of Loci and Functional Characterization of Trichothecene Biosynthesis Genes in Filamentous Fungi of the Genus Trichoderma

PubMed Central

Cardoza, R. E.; Malmierca, M. G.; Hermosa, M. R.; Alexander, N. J.; McCormick, S. P.; Proctor, R. H.; Tijerino, A. M.; Rumbero, A.; Monte, E.; Gutiérrez, S.

2011-01-01

Trichothecenes are mycotoxins produced by Trichoderma, Fusarium, and at least four other genera in the fungal order Hypocreales. Fusarium has a trichothecene biosynthetic gene (TRI) cluster that encodes transport and regulatory proteins as well as most enzymes required for the formation of the mycotoxins. However, little is known about trichothecene biosynthesis in the other genera. Here, we identify and characterize TRI gene orthologues (tri) in Trichoderma arundinaceum and Trichoderma brevicompactum. Our results indicate that both Trichoderma species have a tri cluster that consists of orthologues of seven genes present in the Fusarium TRI cluster. Organization of genes in the cluster is the same in the two Trichoderma species but differs from the organization in Fusarium. Sequence and functional analysis revealed that the gene (tri5) responsible for the first committed step in trichothecene biosynthesis is located outside the cluster in both Trichoderma species rather than inside the cluster as it is in Fusarium. Heterologous expression analysis revealed that two T. arundinaceum cluster genes (tri4 and tri11) differ in function from their Fusarium orthologues. The Tatri4-encoded enzyme catalyzes only three of the four oxygenation reactions catalyzed by the orthologous enzyme in Fusarium. The Tatri11-encoded enzyme catalyzes a completely different reaction (trichothecene C-4 hydroxylation) than the Fusarium orthologue (trichothecene C-15 hydroxylation). The results of this study indicate that although some characteristics of the tri/TRI cluster have been conserved during evolution of Trichoderma and Fusarium, the cluster has undergone marked changes, including gene loss and/or gain, gene rearrangement, and divergence of gene function. PMID:21642405
Overexpression of miR-183/-96/-182 triggers neuronal cell fate in Human Retinal Pigment Epithelial (hRPE) cells in culture.

PubMed

Davari, Maliheh; Soheili, Zahra-Soheila; Samiei, Shahram; Sharifi, Zohreh; Pirmardan, Ehsan Ranaei

2017-01-29

The miR-183 cluster, composed of the miR-183/-96/-182 genes, is highly expressed in the adult retina, particularly in photoreceptors. It is involved in the development, maturation and normal function of the neuroretina. Ectopic overexpression of miR-183/-96/-182 genes was performed to assess reprogramming of hRPE cells. The genes were amplified from genomic DNA and cloned independently or in tandem configuration into the pAAV.MCS vector. hRPE cells were then transfected with the recombinant constructs. Real-Time PCR was performed to measure the expression levels of miR-183/-96/-182 and that of several retina-specific neuronal genes such as OTX2, NRL, PDC and DCT. The transfected cells also were immunocytochemically examined for retina-specific neuronal markers, including Rhodopsin, red opsin, CRX, Thy1, CD73, recoverin and PKCα, to determine the cellular fate of the transfected hRPE cells. Data showed that upon miR-183/-96/-182 overexpression in hRPE cultures, the expression of neuronal genes including OTX2, NRL, PDC and DCT was also upregulated. Moreover, miR-183 cluster-treated hRPE cells were immunoreactive for neuronal markers such as Rhodopsin, red opsin, CRX and Thy1. Both transcriptional and translational upregulation of neuronal genes in miR-183 cluster-treated hRPE cells suggests that in vitro overexpression of the miR-183 cluster could trigger reprogramming of hRPE cells to a retinal neuron fate. Copyright © 2016 Elsevier Inc. All rights reserved.
LacR Is a Repressor of lacABCD and LacT Is an Activator of lacTFEG, Constituting the lac Gene Cluster in Streptococcus pneumoniae

PubMed Central

Afzal, Muhammad; Shafeeq, Sulman

2014-01-01

Comparison of the transcriptome of Streptococcus pneumoniae strain D39 grown in the presence of either lactose or galactose with that of the strain grown in the presence of glucose revealed the elevated expression of various genes and operons, including the lac gene cluster, which is organized into two operons, i.e., lac operon I (lacABCD) and lac operon II (lacTFEG). Deletion of the DeoR family transcriptional regulator lacR that is present downstream of the lac gene cluster revealed elevated expression of lac operon I even in the absence of lactose. This suggests a function of LacR as a transcriptional repressor of lac operon I, which encodes enzymes involved in the phosphorylated tagatose pathway in the absence of lactose or galactose. Deletion of lacR did not affect the expression of lac operon II, which encodes a lactose-specific phosphotransferase. This finding was further confirmed by β-galactosidase assays with PlacA-lacZ and PlacT-lacZ in the presence of either lactose or glucose as the sole carbon source in the medium. This suggests the involvement of another transcriptional regulator in the regulation of lac operon II, which is the BglG-family transcriptional antiterminator LacT. We demonstrate the role of LacT as a transcriptional activator of lac operon II in the presence of lactose and CcpA-independent regulation of the lac gene cluster in S. pneumoniae. PMID:24951784
Deciphering the Cryptic Genome: Genome-wide Analyses of the Rice Pathogen Fusarium fujikuroi Reveal Complex Regulation of Secondary Metabolism and Novel Metabolites

PubMed Central

Studt, Lena; Niehaus, Eva-Maria; Espino, Jose J.; Huß, Kathleen; Michielse, Caroline B.; Albermann, Sabine; Wagner, Dominik; Bergner, Sonja V.; Connolly, Lanelle R.; Fischer, Andreas; Reuter, Gunter; Kleigrewe, Karin; Bald, Till; Wingfield, Brenda D.; Ophir, Ron; Freeman, Stanley; Hippler, Michael; Smith, Kristina M.; Brown, Daren W.; Proctor, Robert H.; Münsterkötter, Martin; Freitag, Michael; Humpf, Hans-Ulrich; Güldener, Ulrich; Tudzynski, Bettina

2013-01-01

The fungus Fusarium fujikuroi causes “bakanae” disease of rice due to its ability to produce gibberellins (GAs), but it is also known for producing harmful mycotoxins. However, the genetic capacity for the whole arsenal of natural compounds and their role in the fungus' interaction with rice remained unknown. Here, we present a high-quality genome sequence of F. fujikuroi that was assembled into 12 scaffolds corresponding to the 12 chromosomes described for the fungus. We used the genome sequence along with ChIP-seq, transcriptome, proteome, and HPLC-FTMS-based metabolome analyses to identify the potential secondary metabolite biosynthetic gene clusters and to examine their regulation in response to nitrogen availability and plant signals. The results indicate that expression of most but not all gene clusters correlates with proteome and ChIP-seq data. Comparison of the F. fujikuroi genome to those of six other fusaria revealed that only a small number of gene clusters are conserved among these species, thus providing new insights into the divergence of secondary metabolism in the genus Fusarium. Notably, GA biosynthetic genes are present in some related species, but GA biosynthesis is limited to F. fujikuroi, suggesting that this provides a selective advantage during infection of the preferred host plant rice. Among the genome sequences analyzed, one cluster that includes a polyketide synthase gene (PKS19) and another that includes a non-ribosomal peptide synthetase gene (NRPS31) are unique to F. fujikuroi. The metabolites derived from these clusters were identified by HPLC-FTMS-based analyses of engineered F. fujikuroi strains overexpressing cluster genes. In planta expression studies suggest a specific role for the PKS19-derived product during rice infection. Thus, our results indicate that combined comparative genomics and genome-wide experimental analyses identified novel genes and secondary metabolites that contribute to the evolutionary success of F. fujikuroi as a rice pathogen. PMID:23825955
Genome-wide DNA methylation analysis reveals estrogen-mediated epigenetic repression of metallothionein-1 gene cluster in breast cancer.

PubMed

Jadhav, Rohit R; Ye, Zhenqing; Huang, Rui-Lan; Liu, Joseph; Hsu, Pei-Yin; Huang, Yi-Wen; Rangel, Leticia B; Lai, Hung-Cheng; Roa, Juan Carlos; Kirma, Nameer B; Huang, Tim Hui-Ming; Jin, Victor X

2015-01-01

Recent genome-wide analysis has shown that DNA methylation spans long stretches of chromosome regions consisting of clusters of contiguous CpG islands or gene families. Hypermethylation of various gene clusters has been reported in many types of cancer. In this study, we conducted methyl-binding domain capture (MBDCap) sequencing (MBD-seq) analysis on a breast cancer cohort consisting of 77 patients and 10 normal controls, as well as a panel of 38 breast cancer cell lines. Bioinformatics analysis determined seven gene clusters with a significant difference in overall survival (OS) and further revealed a distinct feature that the conservation of a large gene cluster (approximately 70 kb), metallothionein-1 (MT1), among 45 species is much lower than the average of all RefSeq genes. Furthermore, we found that DNA methylation is an important epigenetic regulator contributing to gene repression of the MT1 gene cluster in both ERα positive (ERα+) and ERα negative (ERα-) breast tumors. In silico analysis revealed much lower gene expression of this cluster in The Cancer Genome Atlas (TCGA) cohort for ERα+ tumors. To further investigate the role of estrogen, we conducted 17β-estradiol (E2) and demethylating agent 5-aza-2'-deoxycytidine (DAC) treatment in various breast cancer cell types. Cell proliferation and invasion assays suggested MT1F and MT1M may play an anti-oncogenic role in breast cancer. Our data suggest that DNA methylation in large contiguous gene clusters can be a potential prognostic marker of breast cancer. Further investigation of these clusters revealed that estrogen mediates epigenetic repression of the MT1 cluster in ERα+ breast cancer cell lines. In all, our studies identify thousands of breast tumor hypermethylated regions for the first time, in particular, discovering seven large contiguous hypermethylated gene clusters.
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species.

PubMed

Nepal, Madhav P; Andersen, Ethan J; Neupane, Surendra; Benson, Benjamin V

2017-09-30

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence.
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species

PubMed Central

Andersen, Ethan J.; Neupane, Surendra; Benson, Benjamin V.

2017-01-01

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence. PMID:28973974
Expression profiling of chickpea genes differentially regulated during a resistance response to Ascochyta rabiei.

PubMed

Coram, Tristan E; Pang, Edwin C K

2006-11-01

Using microarray technology and a set of chickpea (Cicer arietinum L.) unigenes, grasspea (Lathyrus sativus L.) expressed sequence tags (ESTs) and lentil (Lens culinaris Med.) resistance gene analogues, the ascochyta blight (Ascochyta rabiei (Pass.) L.) resistance response was studied in four chickpea genotypes, including resistant, moderately resistant, susceptible and wild relative (Cicer echinospermum L.) genotypes. The experimental system minimized environmental effects and was conducted in reference design, in which samples from mock-inoculated controls acted as reference against post-inoculation samples. Robust data quality was achieved through the use of three biological replicates (including a dye swap), the inclusion of negative controls and strict selection criteria for differentially expressed genes, including a fold change cut-off determined by self-self hybridizations, Student's t-test and multiple testing correction (P < 0.05). Microarray observations were also validated by quantitative reverse transcriptase-polymerase chain reaction (RT-PCR). The time course expression patterns of 756 microarray features resulted in the differential expression of 97 genes in at least one genotype at one time point. k-means clustering grouped the genes into clusters of similar observations for each genotype, and comparisons between A. rabiei-resistant and A. rabiei-susceptible genotypes revealed potential gene 'signatures' predictive of effective A. rabiei resistance. These genes included several pathogenesis-related proteins, SNAKIN2 antimicrobial peptide, proline-rich protein, disease resistance response protein DRRG49-C, environmental stress-inducible protein, leucine-zipper protein, polymorphic antigen membrane protein, Ca-binding protein and several unknown proteins. The potential involvement of these genes and their pathways of induction are discussed. 
This study represents the first large-scale gene expression profiling in chickpea, and future work will focus on the functional validation of the genes of interest.
Gene expression profiling in rat kidney after intratracheal exposure to cadmium-doped nanoparticles

NASA Astrophysics Data System (ADS)

Coccini, Teresa; Roda, Elisa; Fabbri, Marco; Sacco, Maria Grazia; Gribaldo, Laura; Manzo, Luigi

2012-08-01

While nephrotoxicity of cadmium is well documented, very limited information exists on renal effects of exposure to cadmium-containing nanomaterials. In this work, "omics" methodologies have been used to assess the action of cadmium-containing silica nanoparticles (Cd-SiNPs) in the kidney of Sprague-Dawley rats exposed intratracheally. Groups of animals received a single dose of Cd-SiNPs (1 mg/rat), CdCl2 (400 μg/rat) or 0.1 ml saline (control). Renal gene expression was evaluated 7 and 30 days post exposure by DNA microarray technology using the Agilent Whole Rat Genome Microarray 4x44K. Gene modulating effects were observed in kidney at both time periods after treatment with Cd-SiNPs, with 139 and 153 differentially expressed genes at post-exposure days 7 and 30, respectively. Renal gene expression changes were also observed in the kidney of CdCl2-treated rats, with a total of 253 and 70 probes modulated at 7 and 30 days, respectively. Analysis of renal gene expression profiles at day 7 indicated, in both Cd-SiNP and CdCl2 groups, downregulation of several cluster genes linked to immune function, oxidative stress, and inflammation processes. Differing from day 7, the majority of cluster gene categories modified by nanoparticles in kidney 30 days after dosing were genes implicated in cell regulation and apoptosis. Modest renal gene expression changes were observed at day 30 in rats treated with CdCl2. These results indicate that kidney may be a susceptible target for subtle long-lasting molecular alterations produced by cadmium nanoparticles locally instilled in the lung.
Response of Human Skin to Aesthetic Scarification

PubMed Central

Gabriel, Vincent A.; McClellan, Elizabeth A.; Scheuermann, Richard H.

2014-01-01

This study was undertaken to investigate changes in RNA expression in previously healthy adult human skin following thermal injury induced by contact with hot metal that was undertaken as part of aesthetic scarification, a body modification practice. Subjects were recruited to have pre-injury skin and serial wound biopsies performed. 4 mm punch biopsies were taken prior to branding and 1 hour, 1 week, and 1, 2 and 3 months post injury. RNA was extracted and quality assured prior to the use of a whole-genome based bead array platform to describe expression changes in the samples using the pre-injury skin as a comparator. Analysis of the array data was performed using k-means clustering and a hypergeometric probability distribution without replacement, and corrections for multiple comparisons were applied. Confirmatory q-PCR was performed. Using a k of 10, several clusters of genes were shown to co-cluster based on Gene Ontology classification with probabilities unlikely to occur by chance alone. Of particular interest were clusters relating to cell cycle, proteinaceous extracellular matrix and keratinization. Given the consistent expression changes at one week following injury in the cell cycle cluster, there is an opportunity to intervene early following burn injury to influence scar development. PMID:24582755
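The hypergeometric test for over-representation of a Gene Ontology category within a cluster can be sketched in Python using only the standard library. The counts below are invented for illustration, `hypergeom_tail` and `bonferroni` are hypothetical helper names, and the Bonferroni adjustment is an assumption: the abstract states that corrections for multiple comparisons were made but does not name the method.

```python
from math import comb


def hypergeom_tail(population, annotated, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` carry the category."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, x) * comb(population - annotated, drawn - x)
        for x in range(hits, upper + 1)
    )
    return favourable / total


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


# Hypothetical counts: 1000 genes on the array, 50 annotated to a category,
# and a cluster of 40 genes that contains 10 of the annotated genes.
p = hypergeom_tail(population=1000, annotated=50, drawn=40, hits=10)
print(f"raw p = {p:.3g}")
print("adjusted:", bonferroni([p, 0.02, 0.5]))
```

Exact integer binomial coefficients avoid floating-point underflow for small tail probabilities; for large arrays a statistics library would usually be preferred.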
Identification of an unusual type II thioesterase in the dithiolopyrrolone antibiotics biosynthetic pathway.

PubMed

Zhai, Ying; Bai, Silei; Liu, Jingjing; Yang, Liyuan; Han, Li; Huang, Xueshi; He, Jing

2016-04-22

Dithiolopyrrolone group antibiotics characterized by an electronically unique dithiolopyrrolone heterobicyclic core are known for their antibacterial, antifungal, insecticidal and antitumor activities. Recently the biosynthetic gene clusters for two dithiolopyrrolone compounds, holomycin and thiomarinol, have been identified respectively in different bacterial species. Here, we report a novel dithiolopyrrolone biosynthetic gene cluster (aut) isolated from Streptomyces thioluteus DSM 40027 which produces two pyrrothine derivatives, aureothricin and thiolutin. By comparison with other characterized dithiolopyrrolone clusters, eight genes in the aut cluster were verified to be responsible for the assembly of dithiolopyrrolone core. The aut cluster was further confirmed by heterologous expression and in-frame gene deletion experiments. Intriguingly, we found that the heterogenetic thioesterase HlmK derived from the holomycin (hlm) gene cluster in Streptomyces clavuligerus significantly improved heterologous biosynthesis of dithiolopyrrolones in Streptomyces albus through coexpression with the aut cluster. In the previous studies, HlmK was considered invalid because it has a Ser to Gly point mutation within the canonical Ser-His-Asp catalytic triad of thioesterases. However, gene inactivation and complementation experiments in our study unequivocally demonstrated that HlmK is an active distinctive type II thioesterase that plays a beneficial role in dithiolopyrrolone biosynthesis. Copyright © 2016 Elsevier Inc. All rights reserved.
Post-genome research on the biosynthesis of ergot alkaloids.

PubMed

Li, Shu-Ming; Unsöld, Inge A

2006-10-01

Genome sequencing provides new opportunities and challenges for identifying genes for the biosynthesis of secondary metabolites. A putative biosynthetic gene cluster of fumigaclavine C, an ergot alkaloid of the clavine type, was identified in the genome sequence of Aspergillus fumigatus by a bioinformatic approach. This cluster spans 22 kb of genomic DNA and comprises at least 11 open reading frames (ORFs). Seven of them are orthologous to genes from the biosynthetic gene cluster of ergot alkaloids in Claviceps purpurea. Experimental evidence of the identified cluster was provided by heterologous expression and biochemical characterization of two ORFs, FgaPT1 and FgaPT2, in the cluster of A. fumigatus, which show remarkable similarities to dimethylallyltryptophan synthase from C. purpurea and function as prenyltransferases. FgaPT2 converts L-tryptophan to dimethylallyltryptophan and thereby catalyzes the first step of ergot alkaloid biosynthesis, whilst FgaPT1 catalyzes the last step of the fumigaclavine C biosynthesis, i.e., the prenylation of fumigaclavine A at C-2 position of the indole nucleus. In addition to information obtained from the gene cluster of ergot alkaloids from C. purpurea, the identification of the biosynthetic gene cluster of fumigaclavine C in A. fumigatus opens an alternative way to study the biosynthesis of ergot alkaloids in fungi.
Molecular events of apical bud formation in white spruce, Picea glauca.

PubMed

El Kayal, Walid; Allen, Carmen C G; Ju, Chelsea J-T; Adams, Eri; King-Jones, Susanne; Zaharia, L Irina; Abrams, Suzanne R; Cooke, Janice E K

2011-03-01

Bud formation is an adaptive trait that temperate forest trees have acquired to facilitate seasonal synchronization. We have characterized transcriptome-level changes that occur during bud formation of white spruce [Picea glauca (Moench) Voss], a primarily determinate species in which preformed stem units contained within the apical bud constitute most of next season's growth. Microarray analysis identified 4460 differentially expressed sequences in shoot tips during short day-induced bud formation. Cluster analysis revealed distinct temporal patterns of expression, and functional classification of genes in these clusters implied molecular processes that coincide with anatomical changes occurring in the developing bud. Comparing expression profiles in developing buds under long day and short day conditions identified possible photoperiod-responsive genes that may not be essential for bud development. Several genes putatively associated with hormone signalling were identified, and hormone quantification revealed distinct profiles for abscisic acid (ABA), cytokinins, auxin and their metabolites that can be related to morphological changes to the bud. Comparison of gene expression profiles during bud formation in different tissues revealed 108 genes that are differentially expressed only in developing buds and show greater transcript abundance in developing buds than other tissues. These findings provide a temporal roadmap of bud formation in white spruce. © 2011 Blackwell Publishing Ltd.
Annotation of Ehux ESTs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuo, Alan; Grigoriev, Igor

2009-06-12

22 percent of ESTs do not align with scaffolds. EST Pipeline assembles 17126 consensi from the nonaligned ESTs. Annotation Pipeline predicts 8564 ORFs on the consensi. Domain analysis of ORFs reveals missing genes. Cluster analysis reveals missing genes. Expression analysis reveals potential strain specific genes.
Pre-Bilaterian Origins of the Hox Cluster and the Hox Code: Evidence from the Sea Anemone, Nematostella vectensis

PubMed Central

Ryan, Joseph F.; Mazza, Maureen E.; Pang, Kevin; Matus, David Q.; Baxevanis, Andreas D.; Martindale, Mark Q.; Finnerty, John R.

2007-01-01

Background Hox genes were critical to many morphological innovations of bilaterian animals. However, early Hox evolution remains obscure. Phylogenetic, developmental, and genomic analyses on the cnidarian sea anemone Nematostella vectensis challenge recent claims that the Hox code is a bilaterian invention and that no “true” Hox genes exist in the phylum Cnidaria. Methodology/Principal Findings Phylogenetic analyses of 18 Hox-related genes from Nematostella identify putative Hox1, Hox2, and Hox9+ genes. Statistical comparisons among competing hypotheses bolster these findings, including an explicit consideration of the gene losses implied by alternate topologies. In situ hybridization studies of 20 Hox-related genes reveal that multiple Hox genes are expressed in distinct regions along the primary body axis, supporting the existence of a pre-bilaterian Hox code. Additionally, several Hox genes are expressed in nested domains along the secondary body axis, suggesting a role in “dorsoventral” patterning. Conclusions/Significance A cluster of anterior and posterior Hox genes, as well as a ParaHox cluster of genes, evolved prior to the cnidarian-bilaterian split. There is evidence to suggest that these clusters were formed from a series of tandem gene duplication events and played a role in patterning both the primary and secondary body axes in a bilaterally symmetrical common ancestor. Cnidarians and bilaterians shared a common ancestor some 570 to 700 million years ago, and as such, are derived from a common body plan. Our work reveals several conserved genetic components that are found in both of these diverse lineages. This finding is consistent with the hypothesis that a set of developmental rules established in the common ancestor of cnidarians and bilaterians is still at work today. PMID:17252055
Active and Repressive Chromatin Are Interspersed without Spreading in an Imprinted Gene Cluster in the Mammalian Genome

PubMed Central

Regha, Kakkad; Sloane, Mathew A.; Huang, Ru; Pauler, Florian M.; Warczok, Katarzyna E.; Melikant, Balázs; Radolf, Martin; Martens, Joost H.A.; Schotta, Gunnar; Jenuwein, Thomas; Barlow, Denise P.

2010-01-01

SUMMARY The Igf2r imprinted cluster is an epigenetic silencing model in which expression of a ncRNA silences multiple genes in cis. Here, we map a 250 kb region in mouse embryonic fibroblast cells to show that histone modifications associated with expressed and silent genes are mutually exclusive and localized to discrete regions. Expressed genes were modified at promoter regions by H3K4me3 + H3K4me2 + H3K9Ac and on putative regulatory elements flanking active promoters by H3K4me2 + H3K9Ac. Silent genes showed two types of nonoverlapping profile. One type spread over large domains of tissue-specific silent genes and contained H3K27me3 alone. A second type formed localized foci on silent imprinted gene promoters and a nonexpressed pseudogene and contained H3K9me3 + H4K20me3 ± HP1. Thus, mammalian chromosome arms contain active chromatin interspersed with repressive chromatin resembling the type of heterochromatin previously considered a feature of centromeres, telomeres, and the inactive X chromosome. PMID:17679087

Effects of inorganic nitrogen sources on the production of PP-V [(10Z)-12-carboxyl-monascorubramine] and the expression of the nitrate assimilation gene cluster by Penicillium sp. AZ.

PubMed

Arai, Teppei; Umemura, Sara; Ota, Tamaki; Ogihara, Jun; Kato, Jun; Kasumi, Takafumi

2012-01-01

A fungal strain, Penicillium sp. AZ, produced the azaphilone Monascus pigment homolog when cultured in a medium composed of soluble starch, ammonium nitrate, yeast extract, and citrate buffer, pH 5.0. One of the typical features of violet pigment PP-V [(10Z)-12-carboxyl-monascorubramine] is that pyranoid oxygen is replaced with nitrogen. In this study, we found that ammonia and nitrate nitrogen are available for PP-V biosynthesis, and that ammonia nitrogen was much more effective than nitrate nitrogen. Further, we isolated the nitrate assimilation gene cluster (niaD, niiA, and crnA) and analyzed the expression of these genes. The expression levels of all these genes increased with sodium nitrate addition to the culture medium. The results obtained here strongly suggest that Penicillium sp. AZ produced PP-V using nitrate in the form of ammonium reduced from nitrate through a bioprocess assimilatory reaction.
A remarkably stable TipE gene cluster: evolution of insect Para sodium channel auxiliary subunits

PubMed Central

2011-01-01

Background First identified in fruit flies with temperature-sensitive paralysis phenotypes, the Drosophila melanogaster TipE locus encodes four voltage-gated sodium (NaV) channel auxiliary subunits. This cluster of TipE-like genes on chromosome 3L, and a fifth family member on chromosome 3R, are important for the optional expression and functionality of the Para NaV channel but appear quite distinct from auxiliary subunits in vertebrates. Here, we exploited available arthropod genomic resources to trace the origin of TipE-like genes by mapping their evolutionary histories and examining their genomic architectures. Results We identified a remarkably conserved synteny block of TipE-like orthologues with well-maintained local gene arrangements from 21 insect species. Homologues in the water flea, Daphnia pulex, suggest an ancestral pancrustacean repertoire of four TipE-like genes; a subsequent gene duplication may have generated functional redundancy allowing gene losses in the silk moth and mosquitoes. Intronic nesting of the insect TipE gene cluster probably occurred following the divergence from crustaceans, but in the flour beetle and silk moth genomes the clusters apparently escaped from nesting. Across Pancrustacea, TipE gene family members have experienced intronic nesting, escape from nesting, retrotransposition, translocation, and gene loss events while generally maintaining their local gene neighbourhoods. D. melanogaster TipE-like genes exhibit coordinated spatial and temporal regulation of expression distinct from their host gene but well-correlated with their regulatory target, the Para NaV channel, suggesting that functional constraints may preserve the TipE gene cluster. We identified homology between TipE-like NaV channel regulators and vertebrate Slo-beta auxiliary subunits of big-conductance calcium-activated potassium (BKCa) channels, which suggests that ion channel regulatory partners have evolved distinct lineage-specific characteristics. 
Conclusions TipE-like genes form a remarkably conserved genomic cluster across all examined insect genomes. This study reveals likely structural and functional constraints on the genomic evolution of insect TipE gene family members maintained in synteny over hundreds of millions of years of evolution. The likely common origin of these NaV channel regulators with BKCa auxiliary subunits highlights the evolutionary plasticity of ion channel regulatory mechanisms. PMID:22098672
Differential expression of cysteine desulfurases in soybean

PubMed Central

2011-01-01

Background Iron-sulfur [Fe-S] clusters are prosthetic groups required to sustain fundamental life processes including electron transfer, metabolic reactions, sensing, signaling, gene regulation and stabilization of protein structures. In plants, the biogenesis of Fe-S protein is compartmentalized and adapted to specific needs of the cell. Many environmental factors affect plant development and limit productivity and geographical distribution. The impact of these limiting factors is particularly relevant for major crops, such as soybean, which has worldwide economic importance. Results Here we analyze the transcriptional profile of the soybean cysteine desulfurases NFS1, NFS2 and ISD11 genes, involved in the biogenesis of [Fe-S] clusters, by quantitative RT-PCR. NFS1, ISD11 and NFS2, encoding two mitochondrial proteins and one plastid-located protein, respectively, are duplicated and showed distinct transcript levels depending on tissue and stress response. NFS1 and ISD11 are highly expressed in roots, whereas NFS2 showed no differential expression in tissues. Cold-treated plants showed a decrease in NFS2 and ISD11 transcript levels in roots, and an increased expression of NFS1 and ISD11 genes in leaves. Plants treated with salicylic acid exhibited increased NFS1 transcript levels in roots but lower levels in leaves. In silico analysis of promoter regions indicated the presence of different cis-elements in cysteine desulfurase genes, in good agreement with differential expression of each locus. Our data also showed that increased transcript levels of the mitochondrial genes NFS1 and ISD11 are associated with higher activities of aldehyde oxidase and xanthine dehydrogenase, two cytosolic Fe-S proteins. Conclusions Our results suggest a relationship between gene expression pattern, biochemical effects, and transcription factor binding sites in promoter regions of cysteine desulfurase genes. Moreover, the data show proportionality between NFS1 and ISD11 gene expression. PMID:22099069
Nipbl and mediator cooperatively regulate gene expression to control limb development.

PubMed

Muto, Akihiko; Ikeda, Shingo; Lopez-Burks, Martha E; Kikuchi, Yutaka; Calof, Anne L; Lander, Arthur D; Schilling, Thomas F

2014-09-01

Haploinsufficiency for Nipbl, a cohesin loading protein, causes Cornelia de Lange Syndrome (CdLS), the most common "cohesinopathy". It has been proposed that the effects of Nipbl-haploinsufficiency result from disruption of long-range communication between DNA elements. Here we use zebrafish and mouse models of CdLS to examine how transcriptional changes caused by Nipbl deficiency give rise to limb defects, a common condition in individuals with CdLS. In the zebrafish pectoral fin (forelimb), knockdown of Nipbl expression led to size reductions and patterning defects that were preceded by dysregulated expression of key early limb development genes, including fgfs, shha, hand2 and multiple hox genes. In limb buds of Nipbl-haploinsufficient mice, transcriptome analysis revealed many similar gene expression changes, as well as altered expression of additional classes of genes that play roles in limb development. In both species, the pattern of dysregulation of hox-gene expression depended on genomic location within the Hox clusters. In view of studies suggesting that Nipbl colocalizes with the mediator complex, which facilitates enhancer-promoter communication, we also examined zebrafish deficient for the Med12 Mediator subunit, and found they resembled Nipbl-deficient fish in both morphology and gene expression. Moreover, combined partial reduction of both Nipbl and Med12 had a strongly synergistic effect, consistent with both molecules acting in a common pathway. In addition, three-dimensional fluorescent in situ hybridization revealed that Nipbl and Med12 are required to bring regions containing long-range enhancers into close proximity with the zebrafish hoxda cluster. These data demonstrate a crucial role for Nipbl in limb development, and support the view that its actions on multiple gene pathways result from its influence, together with Mediator, on regulation of long-range chromosomal interactions.
Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

PubMed Central

Bushel, Pierre R; Wolfinger, Russell D; Gibson, Greg

2007-01-01

Background Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. Results We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples based on simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function with the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for histopathology categorical values in order to measure dissimilarity of the samples. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types: one from a heart disease study and another from acetaminophen (an analgesic) exposure in rat liver that causes centrilobular necrosis.
Conclusion The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group and the heart disease samples into two groups (sick and buff denoting samples having pain type representative of angina and non-angina respectively) with an accuracy of 79%. This is on par with, or better than, the assignment accuracy of the heart disease samples by several well-known and successful clustering algorithms. Following modk-prototypes clustering of the acetaminophen-exposed samples, informative genes from the cluster prototypes were identified that are descriptive of, and phenotypically anchored to, levels of necrosis of the centrilobular region of the rat liver. The biological processes cell growth and/or maintenance, amine metabolism, and stress response were shown to discern between no and moderate levels of acetaminophen-induced centrilobular necrosis. The use of well-known and traditional measurements directly in the clustering provides some guarantee that the resulting clusters will be meaningfully interpretable. PMID:17408499
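The dissimilarity measure and cluster prototype described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' modk-prototypes implementation: the sample layout (the keys "expr", "chem" and "path"), the weights and the toy values are all assumptions made for illustration. It only shows the combination the abstract names: squared Euclidean distance for numeric data, simple matching for categorical data, one weight per data domain, and a prototype built from means and modes.

```python
from collections import Counter
from statistics import mean


def mixed_dissimilarity(a, b, weights):
    """Weighted dissimilarity between two samples with mixed feature types.

    Each sample is a dict with three hypothetical keys:
      "expr" - numeric microarray values
      "chem" - numeric clinical chemistry values
      "path" - categorical histopathology calls
    """
    # Squared Euclidean distance for the two numeric domains.
    d_expr = sum((x - y) ** 2 for x, y in zip(a["expr"], b["expr"]))
    d_chem = sum((x - y) ** 2 for x, y in zip(a["chem"], b["chem"]))
    # Simple matching for categorical values: count the mismatches.
    d_path = sum(x != y for x, y in zip(a["path"], b["path"]))
    return (weights["expr"] * d_expr
            + weights["chem"] * d_chem
            + weights["path"] * d_path)


def prototype(samples):
    """Cluster prototype: mean of numeric features, mode of categorical ones."""
    def column_means(key):
        return [mean(col) for col in zip(*(s[key] for s in samples))]

    def column_modes(key):
        return [Counter(col).most_common(1)[0][0]
                for col in zip(*(s[key] for s in samples))]

    return {"expr": column_means("expr"),
            "chem": column_means("chem"),
            "path": column_modes("path")}


# Toy samples (invented values, not data from the study).
s1 = {"expr": [1.0, 2.0], "chem": [0.5], "path": ["necrosis", "mild"]}
s2 = {"expr": [2.0, 4.0], "chem": [1.5], "path": ["necrosis", "none"]}
w = {"expr": 1.0, "chem": 2.0, "path": 3.0}

d = mixed_dissimilarity(s1, s2, w)
proto = prototype([s1, s2, s1])
print(d, proto)
```

Raising one domain's weight relative to the others makes that domain dominate the grouping, which is the role the abstract assigns to the separate weighting terms.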
Epigenetic transgenerational inheritance of somatic transcriptomes and epigenetic control regions

PubMed Central

2012-01-01

Background Environmentally induced epigenetic transgenerational inheritance of adult onset disease involves a variety of phenotypic changes, suggesting a general alteration in genome activity. Results Investigation of different tissue transcriptomes in male and female F3 generation vinclozolin versus control lineage rats demonstrated all tissues examined had transgenerational transcriptomes. The microarrays from 11 different tissues were compared with a gene bionetwork analysis. Although each tissue transgenerational transcriptome was unique, common cellular pathways and processes were identified between the tissues. A cluster analysis identified gene modules with coordinated gene expression and each had unique gene networks regulating tissue-specific gene expression and function. A large number of statistically significant over-represented clusters of genes were identified in the genome for both males and females. These gene clusters ranged from 2-5 megabases in size, and a number of them corresponded to the epimutations previously identified in sperm that transmit the epigenetic transgenerational inheritance of disease phenotypes. Conclusions Combined observations demonstrate that all tissues derived from the epigenetically altered germ line develop transgenerational transcriptomes unique to the tissue, but common epigenetic control regions in the genome may coordinately regulate these tissue-specific transcriptomes. This systems biology approach provides insight into the molecular mechanisms involved in the epigenetic transgenerational inheritance of a variety of adult onset disease phenotypes. PMID:23034163
Diversity amongst trigeminal neurons revealed by high throughput single cell sequencing

PubMed Central

Nguyen, Minh Q.; Wu, Youmei; Bonilla, Lauren S.; von Buchholtz, Lars J.

2017-01-01

The trigeminal ganglion contains somatosensory neurons that detect a range of thermal, mechanical and chemical cues and innervate unique sensory compartments in the head and neck including the eyes, nose, mouth, meninges and vibrissae. We used single-cell sequencing and in situ hybridization to examine the cellular diversity of the trigeminal ganglion in mice, defining thirteen clusters of neurons. We show that clusters are well conserved in dorsal root ganglia suggesting they represent distinct functional classes of somatosensory neurons and not specialization associated with their sensory targets. Notably, functionally important genes (e.g. the mechanosensory channel Piezo2 and the capsaicin gated ion channel Trpv1) segregate into multiple clusters and often are expressed in subsets of cells within a cluster. Therefore, the 13 genetically-defined classes are likely to be physiologically heterogeneous rather than highly parallel (i.e., redundant) lines of sensory input. Our analysis harnesses the power of single-cell sequencing to provide a unique platform for in silico expression profiling that complements other approaches linking gene-expression with function and exposes unexpected diversity in the somatosensory system. PMID:28957441
The transcriptome of a complete episode of acute otitis media.

PubMed

Hernandez, Michelle; Leichtle, Anke; Pak, Kwang; Webster, Nicholas J; Wasserman, Stephen I; Ryan, Allen F

2015-04-03

Otitis media is the most common disease of childhood, and represents an important health challenge to the 10-15% of children who experience chronic/recurrent middle ear infections. The middle ear undergoes extensive modifications during otitis media, potentially involving changes in the expression of many genes. Expression profiling offers an opportunity to discover novel genes and pathways involved in this common childhood disease. The middle ears of 320 WBxB6 F1 hybrid mice were inoculated with non-typeable Haemophilus influenzae (NTHi) or PBS (sham control). Two independent samples were generated for each time point and condition, from initiation of infection to resolution. RNA was profiled on Affymetrix mouse 430 2.0 whole-genome microarrays. Approximately 8% of the sampled transcripts defined the signature of acute NTHi-induced otitis media across time. Hierarchical clustering of signal intensities revealed several temporal gene clusters. Network and pathway enrichment analysis of these clusters identified sets of genes involved in activation of the innate immune response, negative regulation of immune response, changes in epithelial and stromal cell markers, and the recruitment/function of neutrophils and macrophages. We also identified key transcriptional regulators related to events in otitis media, which likely determine the expression of these gene clusters. A list of otitis media susceptibility genes, derived from genome-wide association and candidate gene studies, was significantly enriched during the early induction phase and the middle re-modeling phase of otitis but not in the resolution phase. Our results further indicate that positive versus negative regulation of inflammatory processes occur with highly similar kinetics during otitis media, underscoring the importance of anti-inflammatory responses in controlling pathogenesis. 
The results characterize the global gene response during otitis media and identify key signaling and transcription factor networks that control the defense of the middle ear against infection. These networks deserve further attention, as dysregulated immune defense and inflammatory responses may contribute to recurrent or chronic otitis in children.
Glutamic acid promotes monacolin K production and monacolin K biosynthetic gene cluster expression in Monascus.

PubMed

Zhang, Chan; Liang, Jian; Yang, Le; Chai, Shiyuan; Zhang, Chenxi; Sun, Baoguo; Wang, Chengtao

2017-12-01

This study investigated the effects of glutamic acid on production of monacolin K and expression of the monacolin K biosynthetic gene cluster. When Monascus M1 was grown in glutamic medium instead of in the original medium, monacolin K production increased from 48.4 to 215.4 mg l⁻¹, an increase of 3.5 times. Glutamic acid enhanced monacolin K production by upregulating the expression of mokB-mokI; on day 8, the expression level of mokA tended to decrease, as measured by reverse transcription-polymerase chain reaction. Our findings demonstrated that mokA was not a key gene responsible for the quantity of monacolin K production in the presence of glutamic acid. Observation of Monascus mycelium morphology using a scanning electron microscope showed glutamic acid significantly increased the content of Monascus mycelium, altered the permeability of Monascus mycelium, enhanced secretion of monacolin K from the cell, and reduced the monacolin K content in Monascus mycelium, thereby enhancing monacolin K production.
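The two yields reported in this abstract can be compared in two ways, and the short Python calculation below shows both. The yields are taken from the abstract; the variable names are illustrative. The "3.5 times" figure matches the relative increase, while the ratio of final to initial yield is about 4.45.

```python
before = 48.4   # mg/l in the original medium (from the abstract)
after = 215.4   # mg/l in the glutamic acid medium (from the abstract)

# Ratio of final to initial yield.
fold_change = after / before
# Increase expressed as a multiple of the initial yield.
relative_increase = (after - before) / before

print(round(fold_change, 2), round(relative_increase, 2))
```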
Regulation of gene expression in plasmid ColE1: delayed expression of the kil gene.

PubMed Central

Zhang, S P; Yan, L F; Zubay, G

1988-01-01

cea, imm, and kil are a cluster of three functionally related genes of the plasmid ColE1. The cea and kil genes are in the same inducible operon, with transcription being initiated from a promoter adjacent to the cea gene. The imm gene is located between the cea and kil genes, but it is transcribed in the opposite direction. Complementary interaction between the imm mRNA and the anti-imm sequences in the middle of the cea-kil transcript causes a pronounced delay in expression of the kil gene when the cea-kil operon is induced. A segment in the overlapping region between the cea and imm genes causes delayed expression of the kil gene in the absence of imm gene transcription. This delay effect increases the yields of colicin synthesized in induced cells. PMID:3142845
Analysis of large-scale gene expression data.

PubMed

Sherlock, G

2000-04-01

The advent of cDNA and oligonucleotide microarray technologies has led to a paradigm shift in biological investigation, such that the bottleneck in research is shifting from data generation to data analysis. Hierarchical clustering, divisive clustering, self-organizing maps and k-means clustering have all been recently used to make sense of this mass of data.
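Of the methods listed in this abstract, hierarchical clustering is simple enough to sketch in a few lines of standard-library Python. The version below is an illustrative agglomerative clustering with average linkage and Euclidean distance; the linkage rule, the distance measure and the toy profiles are all assumptions rather than anything the review prescribes. Real analyses would normally use an established library.

```python
from itertools import combinations
from math import dist


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering of expression profiles (lists of floats).

    Starts with one cluster per gene and repeatedly merges the two clusters
    with the smallest average pairwise Euclidean distance.
    Returns a sorted list of clusters, each a sorted list of gene indices.
    """
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(profiles))]

    def linkage(c1, c2):
        # Average distance over all cross-cluster pairs of genes.
        total = sum(dist(profiles[i], profiles[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Toy profiles: rows are genes, columns are conditions (invented values).
profiles = [
    [0.1, 0.2, 0.1],
    [0.0, 0.3, 0.2],
    [2.0, 2.1, 1.9],
    [2.2, 2.0, 2.1],
    [-1.5, -1.4, -1.6],
]
groups = average_linkage(profiles, 3)
print(groups)
```

This naive version recomputes every linkage at each step, so it is only suitable for small examples.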
An integrated approach to reconstructing genome-scale transcriptional regulatory networks

DOE PAGES

Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; ...

2015-02-27

Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating comparative genomics of closely related organisms with gene expression data to assemble large-scale TRN models with high-quality predictions.
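The benchmark metric named in this abstract, the area under the precision-recall curve, can be estimated in several ways. The Python sketch below uses average precision, one common step-wise estimate, and should not be read as the authors' evaluation code. The edge format (pairs of regulator and target names), the scores and the tiny benchmark are hypothetical.

```python
def average_precision(scored_edges, true_edges):
    """Average precision of ranked predictions against a benchmark.

    scored_edges: list of (edge, confidence score) pairs (hypothetical format)
    true_edges:   set of edges in the benchmark network
    Average precision is a common step-wise estimate of the area under
    the precision-recall curve.
    """
    ranked = sorted(scored_edges, key=lambda pair: pair[1], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (edge, _score) in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            # Precision at each rank where recall increases.
            precision_sum += hits / rank
    return precision_sum / len(true_edges) if true_edges else 0.0


# Hypothetical benchmark interactions and scored predictions.
benchmark = {("tfA", "g1"), ("tfA", "g2"), ("tfB", "g3")}
predictions = [
    (("tfA", "g1"), 0.9),
    (("tfB", "g9"), 0.8),
    (("tfA", "g2"), 0.7),
    (("tfB", "g3"), 0.2),
]
ap = average_precision(predictions, benchmark)
print(ap)
```

A method that ranks true interactions ahead of false ones scores closer to 1, which is the sense in which a larger area indicates better inference.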
Microarray identifies ADAM family members as key responders to TGF-beta1 in alveolar epithelial cells.

PubMed

Keating, Dominic T; Sadlier, Denise M; Patricelli, Andrea; Smith, Sinead M; Walls, Dermot; Egan, Jim J; Doran, Peter P

2006-09-01

The molecular mechanisms of Idiopathic Pulmonary Fibrosis (IPF) remain elusive. Transforming Growth Factor beta 1 (TGF-beta1) is a key effector cytokine in the development of lung fibrosis. We used microarray and computational biology strategies to identify genes whose expression is significantly altered in alveolar epithelial cells (A549) in response to TGF-beta1, IL-4 and IL-13 and Epstein Barr virus. A549 cells were exposed to 10 ng/ml TGF-beta1, IL-4 and IL-13 at serial time points. Total RNA was used for hybridisation to Affymetrix Human Genome U133A microarrays. Each in vitro time-point was studied in duplicate and an average RMA value computed. Expression data for each time point was compared to control and a signal log ratio of 0.6 or greater taken to identify significant differential regulation. Using normalised RMA values and unsupervised Average Linkage Hierarchical Cluster Analysis, a list of 312 extracellular matrix (ECM) proteins or modulators of matrix turnover was curated via Onto-Compare and Gene-Ontology (GO) databases for baited cluster analysis of ECM associated genes. Interrogation of the dataset using ontological classification focused cluster analysis revealed coordinate differential expression of a large cohort of extracellular matrix associated genes. Of this grouping, members of the ADAM (A disintegrin and Metalloproteinase domain containing) family of genes were differentially expressed. ADAM gene expression was also identified in EBV infected A549 cells as well as IL-13 and IL-4 stimulated cells. We probed pathologenomic activities (activation and functional activity) of ADAM19 and ADAMTS9 using siRNA and collagen assays. Knockdown of these genes resulted in diminished production of collagen in A549 cells exposed to TGF-beta1, suggesting a potential role for these molecules in ECM accumulation in IPF.
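The filtering step described in this abstract (average the duplicates, compare to control, keep genes with a signal log ratio of 0.6 or greater) can be sketched in Python as follows. Two assumptions are made here and are not stated in the abstract: the RMA values are treated as log2-scale, so the signal log ratio is a simple difference, and the cut-off is applied to the magnitude so that down-regulated genes are also flagged. Gene names and values are placeholders.

```python
from statistics import mean

SLR_THRESHOLD = 0.6  # cut-off stated in the abstract


def signal_log_ratio(treated_replicates, control_value):
    """Difference between averaged treated and control log2 signals."""
    return mean(treated_replicates) - control_value


def is_differential(slr, threshold=SLR_THRESHOLD):
    # Assumption: the cut-off applies to the magnitude, so that both
    # up- and down-regulated genes are flagged.
    return abs(slr) >= threshold


# Hypothetical log2 RMA values: gene -> (duplicate treated values, control value)
data = {
    "geneA": ([9.0, 9.5], 8.25),   # SLR = 1.0
    "geneB": ([7.25, 7.25], 7.0),  # SLR = 0.25
    "geneC": ([5.0, 5.5], 6.25),   # SLR = -1.0
}
flagged = sorted(g for g, (treated, control) in data.items()
                 if is_differential(signal_log_ratio(treated, control)))
print(flagged)
```

On a log2 scale, a signal log ratio of 0.6 corresponds to roughly a 1.5-fold change in expression.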
Clustered metallothionein genes are co-regulated in rice and ectopic expression of OsMT1e-P confers multiple abiotic stress tolerance in tobacco via ROS scavenging

PubMed Central

2012-01-01

Background Metallothioneins (MT) are low molecular weight, cysteine rich metal binding proteins, found across genera and species, but their function(s) in abiotic stress tolerance are not well documented. Results We have characterized a rice MT gene, OsMT1e-P, isolated from a subtractive library generated from a stressed salinity tolerant rice genotype, Pokkali. Bioinformatics analysis of the rice genome sequence revealed that this gene belongs to a multigenic family, which consists of 13 genes with 15 protein products. OsMT1e-P is located on chromosome XI, away from the majority of other type I genes that are clustered on chromosome XII. Various members of this MT gene cluster showed a tight co-regulation pattern under several abiotic stresses. Sequence analysis revealed the presence of conserved cysteine residues in OsMT1e-P protein. Salinity stress was found to regulate the transcript abundance of OsMT1e-P in a developmental and organ specific manner. Using transgenic approach, we found a positive correlation between ectopic expression of OsMT1e-P and stress tolerance. Our experiments further suggest ROS scavenging to be the possible mechanism for multiple stress tolerance conferred by OsMT1e-P. Conclusion We present an overview of MTs, describing their gene structure, genome localization and expression patterns under salinity and development in rice. We have found that ectopic expression of OsMT1e-P enhances tolerance towards multiple abiotic stresses in transgenic tobacco and the resultant plants could survive and set viable seeds under saline conditions. Taken together, the experiments presented here have indicated that ectopic expression of OsMT1e-P protects against oxidative stress primarily through efficient scavenging of reactive oxygen species. PMID:22780875
Clustered metallothionein genes are co-regulated in rice and ectopic expression of OsMT1e-P confers multiple abiotic stress tolerance in tobacco via ROS scavenging.

PubMed

Kumar, Gautam; Kushwaha, Hemant Ritturaj; Panjabi-Sabharwal, Vaishali; Kumari, Sumita; Joshi, Rohit; Karan, Ratna; Mittal, Shweta; Pareek, Sneh L Singla; Pareek, Ashwani

2012-07-10

Metallothioneins (MT) are low molecular weight, cysteine rich metal binding proteins, found across genera and species, but their function(s) in abiotic stress tolerance are not well documented. We have characterized a rice MT gene, OsMT1e-P, isolated from a subtractive library generated from a stressed salinity tolerant rice genotype, Pokkali. Bioinformatics analysis of the rice genome sequence revealed that this gene belongs to a multigenic family, which consists of 13 genes with 15 protein products. OsMT1e-P is located on chromosome XI, away from the majority of other type I genes that are clustered on chromosome XII. Various members of this MT gene cluster showed a tight co-regulation pattern under several abiotic stresses. Sequence analysis revealed the presence of conserved cysteine residues in OsMT1e-P protein. Salinity stress was found to regulate the transcript abundance of OsMT1e-P in a developmental and organ specific manner. Using transgenic approach, we found a positive correlation between ectopic expression of OsMT1e-P and stress tolerance. Our experiments further suggest ROS scavenging to be the possible mechanism for multiple stress tolerance conferred by OsMT1e-P. We present an overview of MTs, describing their gene structure, genome localization and expression patterns under salinity and development in rice. We have found that ectopic expression of OsMT1e-P enhances tolerance towards multiple abiotic stresses in transgenic tobacco and the resultant plants could survive and set viable seeds under saline conditions. Taken together, the experiments presented here have indicated that ectopic expression of OsMT1e-P protects against oxidative stress primarily through efficient scavenging of reactive oxygen species.
[Differentially expressed genes of cell signal transduction associated with benzene poisoning by cDNA microarray].

PubMed

Wang, Hong; Bi, Yongyi; Tao, Ning; Wang, Chunhong

2005-08-01

To detect the differential expression of cell signal transduction genes associated with benzene poisoning, and to explore the pathogenic mechanisms of blood system damage induced by benzene. The peripheral white blood cell gene expression profiles of 7 benzene poisoning patients, including one with aplastic anemia, were determined by cDNA microarray. Seven chips from normal workers served as controls. Cluster analysis of the gene expression profiles was performed. Among the 4265 target genes, 176 genes associated with cell signal transduction were differentially expressed. Thirty-five up-regulated genes, including PTPRC, STAT4, and IFITM1, were found in at least 6 of the microarrays; 45 down-regulated genes, including ARHB, PPP3CB, and CDC37, were found in at least 5 of the microarrays. cDNA microarray technology is an effective technique for screening the differentially expressed genes of cell signal transduction. Disordered cell signal transduction may play a role in the pathogenic mechanism of benzene poisoning.
Gene expression analysis of hypersensitivity to mosquito bite, chronic active EBV infection and NK/T-lymphoma/leukemia.

PubMed

Washio, Kana; Oka, Takashi; Abdalkader, Lamia; Muraoka, Michiko; Shimada, Akira; Oda, Megumi; Sato, Hiaki; Takata, Katsuyoshi; Kagami, Yoshitoyo; Shimizu, Norio; Kato, Seiichi; Kimura, Hiroshi; Nishizaki, Kazunori; Yoshino, Tadashi; Tsukahara, Hirokazu

2017-11-01

The human herpes virus, Epstein-Barr virus (EBV), is a known oncogenic virus and plays important roles in life-threatening T/NK-cell lymphoproliferative disorders (T/NK-cell LPD) such as hypersensitivity to mosquito bite (HMB), chronic active EBV infection (CAEBV), and NK/T-cell lymphoma/leukemia. During the clinical courses of HMB and CAEBV, patients frequently develop malignant lymphomas and the diseases passively progress sequentially. In the present study, gene expression of CD16 (-) CD56 (+) -, EBV (+) HMB, CAEBV, NK-lymphoma, and NK-leukemia cell lines, which were established from patients, was analyzed using oligonucleotide microarrays and compared to that of CD56 bright CD16 dim/- NK cells from healthy donors. Principal components analysis showed that CAEBV and NK-lymphoma cells were relatively closely located, indicating that they had similar expression profiles. Unsupervised hierarchical clustering analyses of microarray data and gene ontology analysis revealed specific gene clusters and identified several candidate genes responsible for disease that can be used to discriminate each category of NK-LPD and NK-cell lymphoma/leukemia.
Gene Expression Analysis Reveals New Possible Mechanisms of Vancomycin-Induced Nephrotoxicity and Identifies Gene Markers Candidates

PubMed Central

Dieterich, Christine; Puey, Angela; Lyn, Sylvia; Swezey, Robert; Furimsky, Anna; Fairchild, David; Mirsalis, Jon C.; Ng, Hanna H.

2009-01-01

Vancomycin, one of few effective treatments against methicillin-resistant Staphylococcus aureus, is nephrotoxic. The goals of this study were to (1) gain insights into molecular mechanisms of nephrotoxicity at the genomic level, (2) evaluate gene markers of vancomycin-induced kidney injury, and (3) compare gene expression responses after iv and ip administration. Groups of six female BALB/c mice were treated with seven daily iv or ip doses of vancomycin (50, 200, and 400 mg/kg) or saline, and sacrificed on day 8. Clinical chemistry and histopathology demonstrated kidney injury at 400 mg/kg only. Hierarchical clustering analysis revealed that kidney gene expression profiles of all mice treated at 400 mg/kg clustered with those of mice administered 200 mg/kg iv. Transcriptional profiling might thus be more sensitive than current clinical markers for detecting kidney damage, though the profiles can differ with the route of administration. Analysis of transcripts whose expression was changed by at least twofold compared with vehicle saline after high iv and ip doses of vancomycin suggested the possibility of oxidative stress and mitochondrial damage in vancomycin-induced toxicity. In addition, our data showed changes in expression of several transcripts from the complement and inflammatory pathways. Such expression changes were confirmed by relative real-time reverse transcription–polymerase chain reaction. Finally, our results further substantiate the use of gene markers of kidney toxicity such as KIM-1/Havcr1, as indicators of renal injury. PMID:18930951
Gene expression analysis reveals new possible mechanisms of vancomycin-induced nephrotoxicity and identifies gene markers candidates.

PubMed

Dieterich, Christine; Puey, Angela; Lin, Sylvia; Lyn, Sylvia; Swezey, Robert; Furimsky, Anna; Fairchild, David; Mirsalis, Jon C; Ng, Hanna H

2009-01-01

Vancomycin, one of few effective treatments against methicillin-resistant Staphylococcus aureus, is nephrotoxic. The goals of this study were to (1) gain insights into molecular mechanisms of nephrotoxicity at the genomic level, (2) evaluate gene markers of vancomycin-induced kidney injury, and (3) compare gene expression responses after iv and ip administration. Groups of six female BALB/c mice were treated with seven daily iv or ip doses of vancomycin (50, 200, and 400 mg/kg) or saline, and sacrificed on day 8. Clinical chemistry and histopathology demonstrated kidney injury at 400 mg/kg only. Hierarchical clustering analysis revealed that kidney gene expression profiles of all mice treated at 400 mg/kg clustered with those of mice administered 200 mg/kg iv. Transcriptional profiling might thus be more sensitive than current clinical markers for detecting kidney damage, though the profiles can differ with the route of administration. Analysis of transcripts whose expression was changed by at least twofold compared with vehicle saline after high iv and ip doses of vancomycin suggested the possibility of oxidative stress and mitochondrial damage in vancomycin-induced toxicity. In addition, our data showed changes in expression of several transcripts from the complement and inflammatory pathways. Such expression changes were confirmed by relative real-time reverse transcription-polymerase chain reaction. Finally, our results further substantiate the use of gene markers of kidney toxicity such as KIM-1/Havcr1, as indicators of renal injury.
A cryptic pigment biosynthetic pathway uncovered by heterologous expression is essential for conidial development in Pestalotiopsis fici.

PubMed

Zhang, Peng; Wang, Xiuna; Fan, Aili; Zheng, Yanjing; Liu, Xingzhong; Wang, Shihua; Zou, Huixi; Oakley, Berl R; Keller, Nancy P; Yin, Wen-Bing

2017-08-01

Spore pigmentation is very common in the fungal kingdom. The best studied pigment in fungi is melanin, which coats the surface of single-cell spores. Which pigments function in fungal species with multicellular conidia, and how they function, is poorly understood. Here, we identified and deleted a polyketide synthase (PKS) gene PfmaE and showed that it is essential for multicellular conidial pigmentation and development in a plant endophytic fungus, Pestalotiopsis fici. To further characterize the melanin pathway, we utilized an advanced Aspergillus nidulans heterologous system for the expression of the PKS PfmaE and the Pfma gene cluster. By structural elucidation of the pathway metabolite scytalone in A. nidulans, we provided chemical evidence that the Pfma cluster synthesizes DHN melanin. Combining genetic deletion and combinatorial gene expression of Pfma cluster genes, we determined that the putative reductase PfmaG and the PKS are sufficient for the synthesis of scytalone. Feeding scytalone back to the P. fici ΔPfmaE mutant restored pigmentation and multicellular adherence of the conidia. These results cement a growing understanding that pigments are essential not simply for protection of spores from biotic and abiotic stresses but also for spore structural development. © 2017 John Wiley & Sons Ltd.
