Sample records for large-scale gene expression

  1. From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

    2000-01-01

    We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.
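
    As a rough illustration of step (2), the sketch below integrates a small continuous-time analog recurrent network forward in time. The rate equation used here (a sigmoid of the summed inputs minus linear decay), the function name `simulate_network`, and all parameter names and values are assumptions made for illustration; the abstract does not give the exact model, and the simulated-annealing fit of step (3) is not shown.

```python
import math


def sigmoid(x):
    """Logistic squashing function, assumed here as the activation."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_network(T, h, lam, tau, v0, dt=0.01, steps=1000):
    """Euler-integrate tau[i] * dv_i/dt = sigmoid(sum_j T[i][j]*v_j + h[i]) - lam[i]*v_i.

    T[i][j] is the (hypothetical) influence of cluster j on cluster i,
    h the bias terms, lam the decay rates, tau the time constants and
    v0 the initial cluster mean expression levels.
    Returns the list of states, one per time step, including the start.
    """
    n = len(v0)
    v = list(v0)
    trajectory = [list(v)]
    for _ in range(steps):
        inputs = [sum(T[i][j] * v[j] for j in range(n)) + h[i] for i in range(n)]
        v = [v[i] + dt * (sigmoid(inputs[i]) - lam[i] * v[i]) / tau[i] for i in range(n)]
        trajectory.append(list(v))
    return trajectory


# Two hypothetical clusters: cluster 0 activates cluster 1, cluster 1 represses cluster 0.
T = [[0.0, -2.0], [2.0, 0.0]]
trajectory = simulate_network(T, h=[0.0, 0.0], lam=[1.0, 1.0], tau=[1.0, 1.0], v0=[0.1, 0.1])
```

    In a fitting procedure of the kind the abstract describes, the entries of the connection matrix would be adjusted until simulated trajectories match the measured cluster mean time courses.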

  2. A Review of Feature Extraction Software for Microarray Gene Expression Data

    PubMed Central

    Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini

    2014-01-01

    When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315
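
    To make the idea of a reduced representation concrete, here is a minimal sketch of the first step of PCA, one of the methods the review covers. It finds the first principal component by power iteration on the sample covariance matrix and projects each sample onto it. The function name, the toy data, and the choice of power iteration are illustrative assumptions and are not taken from any of the reviewed software packages.

```python
import math
import random


def first_principal_component(samples, iterations=200, seed=0):
    """Return (direction, scores) for the leading principal component.

    samples: list of rows, one row per sample, one column per gene.
    """
    n = len(samples)
    p = len(samples[0])
    # Center each gene (column) at zero mean.
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Sample covariance matrix between genes.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration from a random start converges to the top eigenvector.
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(iterations):
        w_next = [sum(cov[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w_next))
        if norm == 0.0:
            break  # all-zero covariance: no direction of variation
        w = [x / norm for x in w_next]
    # One score per sample: the reduced, one-dimensional representation.
    scores = [sum(r[j] * w[j] for j in range(p)) for r in centered]
    return w, scores


# Toy data: four samples, two perfectly correlated genes.
direction, scores = first_principal_component(
    [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
)
```

    For real expression matrices with thousands of genes, a library routine based on the singular value decomposition would normally be used instead of an explicit covariance matrix.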

  3. Development of a gene synthesis platform for the efficient large scale production of small genes encoding animal toxins.

    PubMed

    Sequeira, Ana Filipa; Brás, Joana L A; Guerreiro, Catarina I P D; Vincentelli, Renaud; Fontes, Carlos M G A

    2016-12-01

    Gene synthesis is becoming an important tool in many fields of recombinant DNA technology, including recombinant protein production. De novo gene synthesis is quickly replacing the classical cloning and mutagenesis procedures and allows generating nucleic acids for which no template is available. In addition, when coupled with efficient gene design algorithms that optimize codon usage, it leads to high levels of recombinant protein expression. Here, we describe the development of an optimized gene synthesis platform that was applied to the large scale production of small genes encoding venom peptides. This improved gene synthesis method uses a PCR-based protocol to assemble synthetic DNA from pools of overlapping oligonucleotides and was developed to synthesise multiple genes simultaneously. This technology incorporates an accurate, automated and cost-effective ligation-independent cloning step to directly integrate the synthetic genes into an effective Escherichia coli expression vector. The robustness of this technology to generate large libraries of dozens to thousands of synthetic nucleic acids was demonstrated through the parallel and simultaneous synthesis of 96 genes encoding animal toxins. An automated platform was developed for the large-scale synthesis of small genes encoding eukaryotic toxins. Large scale recombinant expression of synthetic genes encoding eukaryotic toxins will allow exploring the extraordinary potency and pharmacological diversity of animal venoms, an increasingly valuable but unexplored source of lead molecules for drug discovery.

  4. Thymidylate synthase (TS) gene expression in primary lung cancer patients: a large-scale study in Japanese population.

    PubMed

    Tanaka, F; Wada, H; Fukui, Y; Fukushima, M

    2011-08-01

    Previous small studies showed lower thymidylate synthase (TS) expression in adenocarcinoma of the lung, which may explain the higher antitumor activity of TS-inhibiting agents such as pemetrexed. To quantitatively measure TS gene expression in a large-scale Japanese population (n = 2621) with primary lung cancer, laser-captured microdissected sections were cut from primary tumors, surrounding normal lung tissues and involved nodes. TS gene expression level in primary tumor was significantly higher than that in normal lung tissue (mean TS/β-actin, 3.4 and 1.0, respectively; P < 0.01), and TS gene expression level was higher still in involved nodes (mean TS/β-actin, 7.7; P < 0.01). Analyses of TS gene expression levels in primary tumor according to histologic cell type revealed that small-cell carcinoma showed the highest TS expression (mean TS/β-actin, 13.8) and that squamous cell carcinoma showed higher TS expression than adenocarcinoma (mean TS/β-actin, 4.3 and 2.3, respectively; P < 0.01); TS gene expression increased significantly as the grade of tumor cell differentiation decreased. There was no significant difference in TS gene expression according to any other patient characteristics including tumor progression. Lower TS expression in adenocarcinoma of the lung was confirmed in a large-scale study.

  5. DEXTER: Disease-Expression Relation Extraction from Text.

    PubMed

    Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K

    2018-01-01

    Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER), to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags significantly behind the expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51 and 81.81% for the two evaluations, respectively. Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers and 826 microRNA in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress. Database URL: http://biotm.cis.udel.edu/DEXTER.

  6. Integrative approaches for large-scale transcriptome-wide association studies

    PubMed Central

    Gusev, Alexander; Ko, Arthur; Shi, Huwenbo; Bhatia, Gaurav; Chung, Wonil; Penninx, Brenda W J H; Jansen, Rick; de Geus, Eco JC; Boomsma, Dorret I; Wright, Fred A; Sullivan, Patrick F; Nikkola, Elina; Alvarez, Marcus; Civelek, Mete; Lusis, Aldons J.; Lehtimäki, Terho; Raitoharju, Emma; Kähönen, Mika; Seppälä, Ilkka; Raitakari, Olli T.; Kuusisto, Johanna; Laakso, Markku; Price, Alkes L.; Pajukanta, Päivi; Pasaniuc, Bogdan

    2016-01-01

    Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance levels of one or multiple proteins. Here, we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits. We leverage expression imputation to perform a transcriptome-wide association scan (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in ~3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 novel genes significantly associated with obesity-related traits (BMI, lipids, and height). Many of the novel genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits. PMID:26854917

  7. paraGSEA: a scalable approach for large-scale gene expression profiling

    PubMed Central

    Peng, Shaoliang; Yang, Shunyun

    2017-01-01

    More studies have been conducted using gene expression similarity to identify functional connections among genes, diseases and drugs. Gene Set Enrichment Analysis (GSEA) is a powerful analytical method for interpreting gene expression data. However, due to its enormous computational overhead in the estimation of significance level step and multiple hypothesis testing step, the computation scalability and efficiency are poor on large-scale datasets. We proposed paraGSEA for efficient large-scale transcriptome data analysis. By optimization, the overall time complexity of paraGSEA is reduced from O(mn) to O(m+n), where m is the length of the gene sets and n is the length of the gene expression profiles, which contributes a more than 100-fold increase in performance compared with other popular GSEA implementations such as GSEA-P, SAM-GS and GSEA2. By further parallelization, a near-linear speed-up is gained on both workstations and clusters in an efficient manner with high scalability and performance on large-scale datasets. The analysis time of the whole LINCS phase I dataset (GSE92742) was reduced to about half an hour on a 1000-node cluster on Tianhe-2, or within 120 hours on a 96-core workstation. The source code of paraGSEA is licensed under the GPLv3 and available at http://github.com/ysycloud/paraGSEA. PMID:28973463
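
    For orientation, the running-sum enrichment score at the core of GSEA can be sketched as follows. This is the commonly described weighted Kolmogorov-Smirnov-like statistic, written as a single pass over the ranked list with a set lookup; it is an illustrative assumption and not paraGSEA's optimized or parallel implementation. The function and parameter names are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return the signed maximum deviation of the GSEA running sum.

    ranked_genes: gene identifiers sorted by their ranking metric.
    scores: ranking metric for each gene, in the same order.
    gene_set: the genes whose enrichment is being tested.
    p: exponent that weights hits by |score| (p=0 gives equal weights).
    """
    members = set(gene_set)
    hit_weights = [abs(s) ** p if g in members else 0.0
                   for g, s in zip(ranked_genes, scores)]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g in ranked_genes if g not in members)
    if total_hit == 0.0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses
    running = 0.0
    best = 0.0
    for gene, weight in zip(ranked_genes, hit_weights):
        if gene in members:
            running += weight / total_hit   # step up on a hit
        else:
            running -= 1.0 / n_miss         # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy example: the gene set sits at the top of the ranked list.
es_top = enrichment_score(["a", "b", "c", "d"], [4.0, 3.0, 2.0, 1.0], {"a", "b"})
```

    Significance is usually assessed by recomputing this score many times on permuted data, which is the step the abstract identifies as the main computational cost.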

  8. Gene Expression Browser: Large-Scale and Cross-Experiment Microarray Data Management, Search & Visualization

    USDA-ARS's Scientific Manuscript database

    The amount of microarray gene expression data in public repositories has been increasing exponentially for the last couple of decades. High-throughput microarray data integration and analysis has become a critical step in exploring the large amount of expression data for biological discovery. Howeve...

  9. Large-scale gene expression profiling data for the model moss Physcomitrella patens aid understanding of developmental progression, culture and stress conditions.

    PubMed

    Hiss, Manuel; Laule, Oliver; Meskauskiene, Rasa M; Arif, Muhammad A; Decker, Eva L; Erxleben, Anika; Frank, Wolfgang; Hanke, Sebastian T; Lang, Daniel; Martin, Anja; Neu, Christina; Reski, Ralf; Richardt, Sandra; Schallenberg-Rüdinger, Mareike; Szövényi, Peter; Tiko, Theodhor; Wiedemann, Gertrud; Wolf, Luise; Zimmermann, Philip; Rensing, Stefan A

    2014-08-01

    The moss Physcomitrella patens is an important model organism for studying plant evolution, development, physiology and biotechnology. Here we have generated microarray gene expression data covering the principal developmental stages, culture forms and some environmental/stress conditions. Example analyses of developmental stages and growth conditions as well as abiotic stress treatments demonstrate that (i) growth stage is dominant over culture conditions, (ii) liquid culture is not stressful for the plant, (iii) low pH might aid protoplastation by reduced expression of cell wall structure genes, (iv) largely the same gene pool mediates response to dehydration and rehydration, and (v) AP2/EREBP transcription factors play important roles in stress response reactions. With regard to the AP2 gene family, phylogenetic analysis and comparison with Arabidopsis thaliana shows commonalities as well as uniquely expressed family members under drought, light perturbations and protoplastation. Gene expression profiles for P. patens are available for the scientific community via the easy-to-use tool at https://www.genevestigator.com. By providing large-scale expression profiles, the usability of this model organism is further enhanced, for example by enabling selection of control genes for quantitative real-time PCR. Now, gene expression levels across a broad range of conditions can be accessed online for P. patens. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  10. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex

    PubMed Central

    Hulsman, Marc; Lelieveldt, Boudewijn P. F.; de Ridder, Jeroen; Reinders, Marcel

    2015-01-01

    The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). PMID:25965262

  11. Analysis of blood-based gene expression in idiopathic Parkinson disease.

    PubMed

    Shamir, Ron; Klein, Christine; Amar, David; Vollstedt, Eva-Juliane; Bonin, Michael; Usenovic, Marija; Wong, Yvette C; Maver, Ales; Poths, Sven; Safer, Hershel; Corvol, Jean-Christophe; Lesage, Suzanne; Lavi, Ofer; Deuschl, Günther; Kuhlenbaeumer, Gregor; Pawlack, Heike; Ulitsky, Igor; Kasten, Meike; Riess, Olaf; Brice, Alexis; Peterlin, Borut; Krainc, Dimitri

    2017-10-17

    To examine whether gene expression analysis of a large-scale Parkinson disease (PD) patient cohort produces a robust blood-based PD gene signature compared to previous studies that have used relatively small cohorts (≤220 samples). Whole-blood gene expression profiles were collected from a total of 523 individuals. After preprocessing, the data contained 486 gene profiles (n = 205 PD, n = 233 controls, n = 48 other neurodegenerative diseases) that were partitioned into training, validation, and independent test cohorts to identify and validate a gene signature. Batch-effect reduction and cross-validation were performed to ensure signature reliability. Finally, functional and pathway enrichment analyses were applied to the signature to identify PD-associated gene networks. A gene signature of 100 probes that mapped to 87 genes, corresponding to 64 upregulated and 23 downregulated genes differentiating between patients with idiopathic PD and controls, was identified with the training cohort and successfully replicated in both an independent validation cohort (area under the curve [AUC] = 0.79, p = 7.13E-6) and a subsequent independent test cohort (AUC = 0.74, p = 4.2E-4). Network analysis of the signature revealed gene enrichment in pathways, including metabolism, oxidation, and ubiquitination/proteasomal activity, and misregulation of mitochondria-localized genes, including downregulation of COX4I1, ATP5A1, and VDAC3. We present a large-scale study of PD gene expression profiling. This work identifies a reliable blood-based PD signature and highlights the importance of large-scale patient cohorts in developing potential PD biomarkers. © 2017 American Academy of Neurology.
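
    The AUC values reported above summarize how well a signature score separates patients from controls. A minimal sketch of that statistic, computed as the fraction of patient-control pairs ranked correctly, is given below. The function name and the toy scores are hypothetical, and this is not the authors' evaluation code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: signature score per individual (higher means more patient-like).
    labels: 1 for patient, 0 for control.
    Ties between a patient and a control count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient and one control")
    correct = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy cohort: two patients and two controls, one patient scored below a control.
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

    An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which gives a scale for reading the 0.79 and 0.74 reported for the validation and test cohorts.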

  12. Large-Scale Analysis of Network Bistability for Human Cancers

    PubMed Central

    Shiraishi, Tetsuya; Matsuyama, Shinako; Kitano, Hiroaki

    2010-01-01

    Protein–protein interaction and gene regulatory networks are likely to be locked in a state corresponding to a disease by the behavior of one or more bistable circuits exhibiting switch-like behavior. Sets of genes could be over-expressed or repressed when anomalies due to disease appear, and the circuits responsible for this over- or under-expression might persist for as long as the disease state continues. This paper shows how a large-scale analysis of network bistability for various human cancers can identify genes that can potentially serve as drug targets or diagnosis biomarkers. PMID:20628618

  13. Development of multitissue microfluidic dynamic array for assessing changes in gene expression associated with channel catfish appetite, growth, metabolism, and intestinal health

    USDA-ARS's Scientific Manuscript database

    Large-scale, gene expression methods allow for high throughput analysis of physiological pathways at a fraction of the cost of individual gene expression analysis. Systems, such as the Fluidigm quantitative PCR array described here, can provide powerful assessments of the effects of diet, environme...

  14. Highly efficient mesophyll protoplast isolation and PEG-mediated transient gene expression for rapid and large-scale gene characterization in cassava (Manihot esculenta Crantz).

    PubMed

    Wu, Jun-Zheng; Liu, Qin; Geng, Xiao-Shan; Li, Kai-Mian; Luo, Li-Juan; Liu, Jin-Ping

    2017-03-14

    Cassava (Manihot esculenta Crantz) is a major crop extensively cultivated in the tropics as both an important source of calories and a promising source for biofuel production. Although stable gene expression has been used for transgenic breeding and gene function study, a quick, easy and large-scale transformation platform has been urgently needed for gene functional characterization, especially after the cassava full genome was sequenced. Fully expanded leaves from in vitro plantlets of Manihot esculenta were used to optimize the concentrations of cellulase R-10 and macerozyme R-10 for obtaining protoplasts with the highest yield and viability. Then, the optimum conditions (PEG4000 concentration and transfection time) were determined for cassava protoplast transient gene expression. In addition, the reliability of the established protocol was confirmed for subcellular protein localization. In this work we optimized the main influencing factors and developed an efficient mesophyll protoplast isolation and PEG-mediated transient gene expression in cassava. The suitable enzyme digestion system was established with the combination of 1.6% cellulase R-10 and 0.8% macerozyme R-10 for 16 h of digestion in the dark at 25 °C, resulting in a high yield (4.4 × 10^7 protoplasts/g FW) and vitality (92.6%) of mesophyll protoplasts. The maximum transfection efficiency (70.8%) was obtained with the incubation of the protoplasts/vector DNA mixture with 25% PEG4000 for 10 min. We validated the applicability of the system for studying the subcellular localization of MeSTP7 (an H+/monosaccharide cotransporter) with our transient expression protocol and a heterologous Arabidopsis transient gene expression system. We optimized the main influencing factors and developed an efficient mesophyll protoplast isolation and transient gene expression in cassava, which will facilitate large-scale characterization of genes and pathways in cassava.

  15. In silico identification and comparative analysis of differentially expressed genes in human and mouse tissues

    PubMed Central

    Pao, Sheng-Ying; Lin, Win-Li; Hwang, Ming-Jing

    2006-01-01

    Background Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistic test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics. Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes. PMID:16626500

  16. TOXICOGENOMICS AND HUMAN DISEASE RISK ASSESSMENT

    EPA Science Inventory

    Complete sequencing of human and other genomes, availability of large-scale gene expression arrays with ever-increasing numbers of genes displayed, and steady improvements in protein expression technology can hav...

  17. Transcriptional analysis of product-concentration driven changes in cellular programs of recombinant Clostridium acetobutylicum strains.

    PubMed

    Tummala, Seshu B; Junne, Stefan G; Paredes, Carlos J; Papoutsakis, Eleftherios T

    2003-12-30

    Antisense RNA (asRNA) downregulation alters protein expression without changing the regulation of gene expression. Downregulation of primary metabolic enzymes possibly combined with overexpression of other metabolic enzymes may result in profound changes in product formation, and this may alter the large-scale transcriptional program of the cells. DNA-array based large-scale transcriptional analysis has the potential to elucidate factors that control cellular fluxes even in the absence of proteome data. These themes are explored in the study of large-scale transcriptional analysis programs and the in vivo primary-metabolism fluxes of several related recombinant C. acetobutylicum strains: C. acetobutylicum ATCC 824(pSOS95del) (plasmid control; produces high levels of butanol and acetone), 824(pCTFB1AS) (expresses antisense RNA against CoA transferase (ctfb1-asRNA); produces very low levels of butanol and acetone), and 824(pAADB1) (expresses ctfb1-asRNA and the alcohol-aldehyde dehydrogenase gene (aad); produces high alcohol and low acetone levels). DNA-array based transcriptional analysis revealed that the large changes in product concentrations (and notably butanol concentration) due to ctfb1-asRNA expression alone and in combination with aad overexpression resulted in dramatic changes of the cellular transcriptome. Cluster analysis and gene expression patterns of established and putative operons involved in stress response, motility, sporulation, and fatty-acid biosynthesis indicate that these simple genetic changes dramatically alter the cellular programs of C. acetobutylicum. Comparison of gene expression and flux analysis data may point to possible flux-controlling steps and suggest unknown regulatory mechanisms. Copyright 2003; Wiley Periodicals, Inc.

  18. Engineered human skin substitutes undergo large-scale genomic reprogramming and normal skin-like maturation after transplantation to athymic mice.

    PubMed

    Klingenberg, Jennifer M; McFarland, Kevin L; Friedman, Aaron J; Boyce, Steven T; Aronow, Bruce J; Supp, Dorothy M

    2010-02-01

    Bioengineered skin substitutes can facilitate wound closure in severely burned patients, but deficiencies limit their outcomes compared with native skin autografts. To identify gene programs associated with their in vivo capabilities and limitations, we extended previous gene expression profile analyses to now compare engineered skin after in vivo grafting with both in vitro maturation and normal human skin. Cultured skin substitutes were grafted on full-thickness wounds in athymic mice, and biopsy samples for microarray analyses were collected at multiple in vitro and in vivo time points. Over 10,000 transcripts exhibited large-scale expression pattern differences during in vitro and in vivo maturation. Using hierarchical clustering, 11 different expression profile clusters were partitioned on the basis of differential sample type and temporal stage-specific activation or repression. Analyses show that the wound environment exerts a massive influence on gene expression in skin substitutes. For example, in vivo-healed skin substitutes gained the expression of many native skin-expressed genes, including those associated with epidermal barrier and multiple categories of cell-cell and cell-basement membrane adhesion. In contrast, immunological, trichogenic, and endothelial gene programs were largely lacking. These analyses suggest important areas for guiding further improvement of engineered skin for both increased homology with native skin and enhanced wound healing.

  19. Large-scale transcriptome analysis reveals Arabidopsis metabolic pathways are frequently influenced by different pathogens.

    PubMed

    Jiang, Zhenhong; He, Fei; Zhang, Ziding

    2017-07-01

    Through large-scale transcriptional data analyses, we highlighted the importance of plant metabolism in plant immunity and identified 26 metabolic pathways that were frequently influenced by the infection of 14 different pathogens. Reprogramming of plant metabolism is a common phenomenon in plant defense responses. Currently, a large number of transcriptional profiles of infected tissues in Arabidopsis (Arabidopsis thaliana) have been deposited in public databases, which provides a great opportunity to understand the expression patterns of metabolic pathways during plant defense responses at the systems level. Here, we performed a large-scale transcriptome analysis based on 135 previously published expression samples, including 14 different pathogens, to explore the expression pattern of Arabidopsis metabolic pathways. Overall, metabolic genes are significantly changed in expression during plant defense responses. Upregulated metabolic genes are enriched in defense responses, and downregulated genes are enriched in photosynthesis, fatty acid and lipid metabolic processes. Gene set enrichment analysis (GSEA) identifies 26 frequently differentially expressed metabolic pathways (FreDE_Paths) that are differentially expressed in more than 60% of infected samples. These pathways are involved in the generation of energy, fatty acid and lipid metabolism as well as secondary metabolite biosynthesis. Clustering analysis based on the expression levels of these 26 metabolic pathways clearly distinguishes infected and control samples, further suggesting the importance of these metabolic pathways in plant defense responses. By comparing with FreDE_Paths from abiotic stresses, we find that the expression patterns of 26 FreDE_Paths from biotic stresses are more consistent across different infected samples. By investigating the expression correlation between transcriptional factors (TFs) and FreDE_Paths, we identify several notable relationships. Collectively, the current study will deepen our understanding of plant metabolism in plant immunity and provide new insights into disease-resistant crop improvement.

  20. Ingestion of bacterially expressed double-stranded RNA inhibits gene expression in planarians.

    PubMed

    Newmark, Phillip A; Reddien, Peter W; Cebrià, Francesc; Sánchez Alvarado, Alejandro

    2003-09-30

    Freshwater planarian flatworms are capable of regenerating complete organisms from tiny fragments of their bodies; the basis for this regenerative prowess is an experimentally accessible stem cell population that is present in the adult planarian. The study of these organisms, classic experimental models for investigating metazoan regeneration, has been revitalized by the application of modern molecular biological approaches. The identification of thousands of unique planarian ESTs, coupled with large-scale whole-mount in situ hybridization screens, and the ability to inhibit planarian gene expression through double-stranded RNA-mediated genetic interference, provide a wealth of tools for studying the molecular mechanisms that regulate tissue regeneration and stem cell biology in these organisms. Here we show that, as in Caenorhabditis elegans, ingestion of bacterially expressed double-stranded RNA can inhibit gene expression in planarians. This inhibition persists throughout the process of regeneration, allowing phenotypes with disrupted regenerative patterning to be identified. These results pave the way for large-scale screens for genes involved in regenerative processes.

  1. ExprAlign - the identification of ESTs in non-model species by alignment of cDNA microarray expression profiles

    PubMed Central

    2009-01-01

    Background Sequence identification of ESTs from non-model species offers distinct challenges particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286
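
    The pairwise correlation step underlying such a co-expression landscape can be sketched as follows. The code computes the Pearson correlation coefficient between expression profiles and lists the pairs above a threshold. The threshold of 0.9, the gene names, and the toy profiles are hypothetical, and the network and 'mountain' clustering steps described in the abstract are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a flat profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def coexpressed_pairs(profiles, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with r >= threshold.

    profiles: dict mapping gene name to its expression values across arrays.
    """
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if r >= threshold:
                pairs.append((a, b, r))
    return pairs


# Toy profiles across four arrays: one annotated gene, one unidentified EST.
profiles = {
    "known_gene": [1.0, 2.0, 3.0, 4.0],
    "unknown_est": [2.0, 4.0, 6.0, 8.5],
    "other_gene": [4.0, 3.0, 2.0, 1.0],
}
pairs = coexpressed_pairs(profiles)
```

    A strongly correlated pair that links an unidentified EST to an annotated gene is the kind of evidence the abstract describes as suggesting an identity for the EST.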

  2. Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements.

    PubMed

    Lan, Hui; Carson, Rachel; Provart, Nicholas J; Bonner, Anthony J

    2007-09-21

    Arabidopsis thaliana is the model species of current plant genomic research with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress. Using in house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl. Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. 
    Our method of combining classifiers can improve the accuracy of such predictions - in this case, predictions of genes involved in stress response in plants - and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
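
    A weighted-voting combination of the kind described above might look like the following sketch. The abstract does not say how the weights were chosen, so the sketch simply assumes each classifier is weighted by a cross-validated accuracy; the classifier names and all numbers are hypothetical.

```python
def weighted_vote(scores, weights):
    """Combine per-classifier scores for one gene into a single score.

    scores  -- classifier name -> predicted probability of stress response
    weights -- classifier name -> weight (assumed: cross-validated accuracy)
    """
    total = sum(weights[name] for name in scores)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(weights[name] * scores[name] for name in scores) / total

# Hypothetical weights and per-gene scores for three base classifiers.
weights = {"clf_a": 0.8, "clf_b": 0.6, "clf_c": 0.5}
scores = {"clf_a": 0.9, "clf_b": 0.4, "clf_c": 0.7}
combined = weighted_vote(scores, weights)
```

    With equal weights this reduces to a plain average of the base classifiers; unequal weights let the better-performing classifiers dominate the combined score.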

  3. A Gene Co-Expression Network in Whole Blood of Schizophrenia Patients Is Independent of Antipsychotic-Use and Enriched for Brain-Expressed Genes

    PubMed Central

    de Jong, Simone; Boks, Marco P. M.; Fuller, Tova F.; Strengman, Eric; Janson, Esther; de Kovel, Carolien G. F.; Ori, Anil P. S.; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D.; Cahn, Wiepke; Kahn, René S.; Horvath, Steve; Ophoff, Roel A.

    2012-01-01

    Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1) is located in, and regulated by, the major histocompatibility complex (MHC), which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network. PMID:22761806

  4. Identification of tissue-specific, abiotic stress-responsive gene expression patterns in wine grape (Vitis vinifera L.) based on curation and mining of large-scale EST data sets

    PubMed Central

    2011-01-01

    Background Abiotic stresses, such as water deficit and soil salinity, result in changes in physiology, nutrient use, and vegetative growth in vines, and ultimately, yield and flavor in berries of wine grape, Vitis vinifera L. Large-scale expressed sequence tags (ESTs) were generated, curated, and analyzed to identify major genetic determinants responsible for stress-adaptive responses. Although roots serve as the first site of perception and/or injury for many types of abiotic stress, EST sequencing in root tissues of wine grape exposed to abiotic stresses has been extremely limited to date. To overcome this limitation, large-scale EST sequencing was conducted from root tissues exposed to multiple abiotic stresses. Results A total of 62,236 expressed sequence tags (ESTs) were generated from leaf, berry, and root tissues from vines subjected to abiotic stresses and compared with 32,286 ESTs sequenced from 20 public cDNA libraries. Curation to correct annotation errors, clustering and assembly of the berry and leaf ESTs with currently available V. vinifera full-length transcripts and ESTs yielded a total of 13,278 unique sequences, with 2302 singletons and 10,976 mapped to V. vinifera gene models. Of these, 739 transcripts were found to have significant differential expression in stressed leaves and berries including 250 genes not described previously as being abiotic stress responsive. In a second analysis of 16,452 ESTs from a normalized root cDNA library derived from roots exposed to multiple, short-term, abiotic stresses, 135 genes with root-enriched expression patterns were identified on the basis of their relative EST abundance in roots relative to other tissues. 
    Conclusions The large-scale analysis of relative EST frequency counts among a diverse collection of 23 different cDNA libraries from leaf, berry, and root tissues of wine grape exposed to a variety of abiotic stress conditions revealed distinct, tissue-specific expression patterns, previously unrecognized stress-induced genes, and many novel genes with root-enriched mRNA expression for improving our understanding of root biology and manipulation of rootstock traits in wine grape. mRNA abundance estimates based on EST library-enriched expression patterns showed only modest correlations between microarray and quantitative, real-time reverse transcription-polymerase chain reaction (qRT-PCR) methods highlighting the need for deep-sequencing expression profiling methods. PMID:21592389

  5. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  6. Multiscale Embedded Gene Co-expression Network Analysis.

    PubMed

    Song, Won-Min; Zhang, Bin

    2015-11-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  7. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    DOE PAGES

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; ...

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating comparative genomics of closely related organisms with gene expression data to assemble large-scale TRN models with high-quality predictions.

  8. Dating and functional characterization of duplicated genes in the apple (Malus domestica Borkh.) by analyzing EST data.

    PubMed

    Sanzol, Javier

    2010-05-14

    Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplicates of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to gamma-triplication; an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs; those originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies, allowed evaluation of the role of both polyploidy and small-scale duplications during this process. 
    Finally, gene expression analysis indicated that 82% of duplicated genes, including 80% of young paralogs, showed uncorrelated expression profiles, suggesting extensive subfunctionalization and a role of gene duplication in the acquisition of novel patterns of gene expression. This study reports a genome-wide analysis of the mode of gene duplication in the apple, and provides evidence for its role in genome functional diversification by characterising three major processes: selective retention of paralogs, amplification of gene families, and changes in gene expression.

  9. Molecular Structure-Based Large-Scale Prediction of Chemical-Induced Gene Expression Changes.

    PubMed

    Liu, Ruifeng; AbdulHameed, Mohamed Diwan M; Wallqvist, Anders

    2017-09-25

    The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very large number of models. To address this issue, we examined the performance of a variable nearest neighbor (v-NN) method that uses information on near neighbors conforming to the principle that similar structures have similar activities. Using a data set of gene expression signatures of 13 150 compounds derived from cell-based measurements in the NIH Library of Integrated Network-based Cellular Signatures program, we were able to make predictions for 62% of the compounds in a 10-fold cross validation test, with a correlation coefficient of 0.61 between the predicted and experimentally derived signatures, a reproducibility rivaling that of high-throughput gene expression measurements. To evaluate the utility of the predicted gene expression signatures, we compared the predicted and experimentally derived signatures in their ability to identify drugs known to cause specific liver, kidney, and heart injuries. Overall, the predicted and experimentally derived signatures had similar receiver operating characteristics, whose areas under the curve ranged from 0.71 to 0.77 and 0.70 to 0.73, respectively, across the three organ injury models. However, detailed analyses of enrichment curves indicate that signatures predicted from multiple near neighbors outperformed those derived from experiments, suggesting that averaging information from near neighbors may help improve the signal from gene expression measurements. Our results demonstrate that the v-NN method can serve as a practical approach for modeling large-scale, genomewide, chemical-induced, gene expression changes.
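
    The near-neighbor averaging described above can be sketched as follows. The abstract gives no formula, so everything specific here is an assumption: the Gaussian-style weight, the distance cutoff d0, the smoothing parameter h, and the function name are illustrative only. The sketch declines to predict when no neighbor lies within the cutoff, which is one way a method could cover only part of a compound set, as reported in the abstract.

```python
from math import exp

def vnn_predict(distances, values, d0=0.3, h=0.2):
    """Distance-weighted average over neighbors closer than d0.

    distances -- structural distance from the query compound to each
                 training compound (assumed to lie in [0, 1])
    values    -- measured expression change of one gene for each
                 training compound
    Returns None when no training compound is within d0.
    """
    num = 0.0
    den = 0.0
    for d, v in zip(distances, values):
        if d <= d0:
            w = exp(-(d / h) ** 2)  # assumed weighting: closer neighbors count more
            num += w * v
            den += w
    if den == 0.0:
        return None  # no qualifying neighbor: make no prediction
    return num / den

# Hypothetical query: two near neighbors and one distant compound.
prediction = vnn_predict([0.1, 0.1, 0.8], [1.0, 3.0, -5.0])
no_prediction = vnn_predict([0.5, 0.6], [1.0, 2.0])
```

    Averaging over several near neighbors smooths out measurement noise in any single signature, which is consistent with the abstract's suggestion that neighbor averaging may improve the signal.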

  10. Large-Scale Cognitive GWAS Meta-Analysis Reveals Tissue-Specific Neural Expression and Potential Nootropic Drug Targets.

    PubMed

    Lam, Max; Trampush, Joey W; Yu, Jin; Knowles, Emma; Davies, Gail; Liewald, David C; Starr, John M; Djurovic, Srdjan; Melle, Ingrid; Sundet, Kjetil; Christoforou, Andrea; Reinvang, Ivar; DeRosse, Pamela; Lundervold, Astri J; Steen, Vidar M; Espeseth, Thomas; Räikkönen, Katri; Widen, Elisabeth; Palotie, Aarno; Eriksson, Johan G; Giegling, Ina; Konte, Bettina; Roussos, Panos; Giakoumaki, Stella; Burdick, Katherine E; Payton, Antony; Ollier, William; Chiba-Falek, Ornit; Attix, Deborah K; Need, Anna C; Cirulli, Elizabeth T; Voineskos, Aristotle N; Stefanis, Nikos C; Avramopoulos, Dimitrios; Hatzimanolis, Alex; Arking, Dan E; Smyrnis, Nikolaos; Bilder, Robert M; Freimer, Nelson A; Cannon, Tyrone D; London, Edythe; Poldrack, Russell A; Sabb, Fred W; Congdon, Eliza; Conley, Emily Drabant; Scult, Matthew A; Dickinson, Dwight; Straub, Richard E; Donohoe, Gary; Morris, Derek; Corvin, Aiden; Gill, Michael; Hariri, Ahmad R; Weinberger, Daniel R; Pendleton, Neil; Bitsios, Panos; Rujescu, Dan; Lahti, Jari; Le Hellard, Stephanie; Keller, Matthew C; Andreassen, Ole A; Deary, Ian J; Glahn, David C; Malhotra, Anil K; Lencz, Todd

    2017-11-28

    Here, we present a large (n = 107,207) genome-wide association study (GWAS) of general cognitive ability ("g"), further enhanced by combining results with a large-scale GWAS of educational attainment. We identified 70 independent genomic loci associated with general cognitive ability. Results showed significant enrichment for genes causing Mendelian disorders with an intellectual disability phenotype. Competitive pathway analysis implicated the biological processes of neurogenesis and synaptic regulation, as well as the gene targets of two pharmacologic agents: cinnarizine, a T-type calcium channel blocker, and LY97241, a potassium channel inhibitor. Transcriptome-wide and epigenome-wide analysis revealed that the implicated loci were enriched for genes expressed across all brain regions (most strongly in the cerebellum). Enrichment was exclusive to genes expressed in neurons but not oligodendrocytes or astrocytes. Finally, we report genetic correlations between cognitive ability and disparate phenotypes including psychiatric disorders, several autoimmune disorders, longevity, and maternal age at first birth. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  11. Large-Scale Gene Relocations following an Ancient Genome Triplication Associated with the Diversification of Core Eudicots.

    PubMed

    Wang, Yupeng; Ficklin, Stephen P; Wang, Xiyin; Feltus, F Alex; Paterson, Andrew H

    2016-01-01

    Different modes of gene duplication including whole-genome duplication (WGD), and tandem, proximal and dispersed duplications are widespread in angiosperm genomes. Small-scale, stochastic gene relocations and transposed gene duplications are widely accepted to be the primary mechanisms for the creation of dispersed duplicates. However, here we show that most surviving ancient dispersed duplicates in core eudicots originated from large-scale gene relocations within a narrow window of time following a genome triplication (γ) event that occurred in the stem lineage of core eudicots. We name these surviving ancient dispersed duplicates as relocated γ duplicates. In Arabidopsis thaliana, relocated γ, WGD and single-gene duplicates have distinct features with regard to gene functions, essentiality, and protein interactions. Relative to γ duplicates, relocated γ duplicates have higher non-synonymous substitution rates, but comparable levels of expression and regulation divergence. Thus, relocated γ duplicates should be distinguished from WGD and single-gene duplicates for evolutionary investigations. Our results suggest large-scale gene relocations following the γ event were associated with the diversification of core eudicots.

  12. Large-Scale Gene Relocations following an Ancient Genome Triplication Associated with the Diversification of Core Eudicots

    PubMed Central

    Wang, Yupeng; Ficklin, Stephen P.; Wang, Xiyin; Feltus, F. Alex; Paterson, Andrew H.

    2016-01-01

    Different modes of gene duplication including whole-genome duplication (WGD), and tandem, proximal and dispersed duplications are widespread in angiosperm genomes. Small-scale, stochastic gene relocations and transposed gene duplications are widely accepted to be the primary mechanisms for the creation of dispersed duplicates. However, here we show that most surviving ancient dispersed duplicates in core eudicots originated from large-scale gene relocations within a narrow window of time following a genome triplication (γ) event that occurred in the stem lineage of core eudicots. We name these surviving ancient dispersed duplicates as relocated γ duplicates. In Arabidopsis thaliana, relocated γ, WGD and single-gene duplicates have distinct features with regard to gene functions, essentiality, and protein interactions. Relative to γ duplicates, relocated γ duplicates have higher non-synonymous substitution rates, but comparable levels of expression and regulation divergence. Thus, relocated γ duplicates should be distinguished from WGD and single-gene duplicates for evolutionary investigations. Our results suggest large-scale gene relocations following the γ event were associated with the diversification of core eudicots. PMID:27195960

  13. High-efficiency Agrobacterium-mediated transformation of Norway spruce (Picea abies) and loblolly pine (Pinus taeda)

    NASA Technical Reports Server (NTRS)

    Wenck, A. R.; Quinn, M.; Whetten, R. W.; Pullman, G.; Sederoff, R.; Brown, C. S. (Principal Investigator)

    1999-01-01

    Agrobacterium-mediated gene transfer is the method of choice for many plant biotechnology laboratories; however, large-scale use of this organism in conifer transformation has been limited by difficult propagation of explant material, selection efficiencies and low transformation frequency. We have analyzed co-cultivation conditions and different disarmed strains of Agrobacterium to improve transformation. Additional copies of virulence genes were added to three common disarmed strains. These extra virulence genes included either a constitutively active virG or extra copies of virG and virB, both from pTiBo542. In experiments with Norway spruce, we increased transformation efficiencies 1000-fold from initial experiments where little or no transient expression was detected. Over 100 transformed lines expressing the marker gene beta-glucuronidase (GUS) were generated from rapidly dividing embryogenic suspension-cultured cells co-cultivated with Agrobacterium. GUS activity was used to monitor transient expression and to further test lines selected on kanamycin-containing medium. In loblolly pine, transient expression increased 10-fold utilizing modified Agrobacterium strains. Agrobacterium-mediated gene transfer is a useful technique for large-scale generation of transgenic Norway spruce and may prove useful for other conifer species.

  14. Dynamic DNA cytosine methylation in the Populus trichocarpa genome: tissue-level variation and relationship to gene expression

    PubMed Central

    2012-01-01

    Background DNA cytosine methylation is an epigenetic modification that has been implicated in many biological processes. However, large-scale epigenomic studies have been applied to very few plant species, and variability in methylation among specialized tissues and its relationship to gene expression is poorly understood. Results We surveyed DNA methylation from seven distinct tissue types (vegetative bud, male inflorescence [catkin], female catkin, leaf, root, xylem, phloem) in the reference tree species black cottonwood (Populus trichocarpa). Using 5-methyl-cytosine DNA immunoprecipitation followed by Illumina sequencing (MeDIP-seq), we mapped a total of 129,360,151 36- or 32-mer reads to the P. trichocarpa reference genome. We validated MeDIP-seq results by bisulfite sequencing, and compared methylation and gene expression using published microarray data. Qualitative DNA methylation differences among tissues were obvious on a chromosome scale. Methylated genes had lower expression than unmethylated genes, but genes with methylation in transcribed regions ("gene body methylation") had even lower expression than genes with promoter methylation. Promoter methylation was more frequent than gene body methylation in all tissues except male catkins. Male catkins differed in demethylation of particular transposable element categories, in level of gene body methylation, and in expression range of genes with methylated transcribed regions. Tissue-specific gene expression patterns were correlated with both gene body and promoter methylation. Conclusions We found striking differences among tissues in methylation, which were apparent at the chromosomal scale and when genes and transposable elements were examined. In contrast to other studies in plants, gene body methylation had a more repressive effect on transcription than promoter methylation. PMID:22251412

  15. Comparative modular analysis of gene expression in vertebrate organs.

    PubMed

    Piasecka, Barbara; Kutalik, Zoltán; Roux, Julien; Bergmann, Sven; Robinson-Rechavi, Marc

    2012-03-29

    The degree of conservation of gene expression between homologous organs largely remains an open question. Several recent studies reported some evidence in favor of such conservation. Most studies compute organs' similarity across all orthologous genes, whereas the expression levels of many genes are not informative about organ specificity. Here, we use a modularization algorithm to overcome this limitation through the identification of inter-species co-modules of organs and genes. We identify such co-modules using mouse and human microarray expression data. They are functionally coherent both in terms of genes and of organs from both organisms. We show that a large proportion of genes belonging to the same co-module are orthologous between mouse and human. Moreover, their zebrafish orthologs also tend to be expressed in the corresponding homologous organs. Notable exceptions to the general pattern of conservation are the testis and the olfactory bulb. Interestingly, some co-modules consist of single organs, while others combine several functionally related organs. For instance, amygdala, cerebral cortex, hypothalamus and spinal cord form a clearly discernible unit of expression, both in mouse and human. Our study provides a new framework for comparative analysis which will be applicable also to other sets of large-scale phenotypic data collected across different species.

  16. APPLICATION OF CDNA MICROARRAY TECHNOLOGY TO IN VITRO TOXICOLOGY AND THE SELECTION OF GENES FOR A REAL TIME RT-PCR-BASED SCREEN FOR OXIDATIVE STRESS IN HEP-G2 CELLS

    EPA Science Inventory

    Large-scale analysis of gene expression using cDNA microarrays promises the
    rapid detection of the mode of toxicity for drugs and other chemicals. cDNA
    microarrays were used to examine chemically-induced alterations of gene
    expression in HepG2 cells exposed to oxidative ...

  17. Gene expression studies of developing bovine longissimus muscle from two different beef cattle breeds

    PubMed Central

    Lehnert, Sigrid A; Reverter, Antonio; Byrne, Keren A; Wang, Yonghong; Nattrass, Greg S; Hudson, Nicholas J; Greenwood, Paul L

    2007-01-01

    Background The muscle fiber number and fiber composition of muscle is largely determined during prenatal development. In order to discover genes that are involved in determining adult muscle phenotypes, we studied the gene expression profile of developing fetal bovine longissimus muscle from animals with two different genetic backgrounds using a bovine cDNA microarray. Fetal longissimus muscle was sampled at 4 stages of myogenesis and muscle maturation: primary myogenesis (d 60), secondary myogenesis (d 135), as well as beginning (d 195) and final stages (birth) of functional differentiation of muscle fibers. All fetuses and newborns (total n = 24) were from Hereford dams and crossed with either Wagyu (high intramuscular fat) or Piedmontese (GDF8 mutant) sires, genotypes that vary markedly in muscle and compositional characteristics later in postnatal life. Results We obtained expression profiles of three individuals for each time point and genotype to allow comparisons across time and between sire breeds. Quantitative reverse transcription-PCR analysis of RNA from developing longissimus muscle was able to validate the differential expression patterns observed for a selection of differentially expressed genes, with one exception. We detected large-scale changes in temporal gene expression between the four developmental stages in genes coding for extracellular matrix and for muscle fiber structural and metabolic proteins. FSTL1 and IGFBP5 were two genes implicated in growth and differentiation that showed developmentally regulated expression levels in fetal muscle. An abundantly expressed gene with no functional annotation was found to be developmentally regulated in the same manner as muscle structural proteins. We also observed differences in gene expression profiles between the two different sire breeds. Wagyu-sired calves showed higher expression of fatty acid binding protein 5 (FABP5) RNA at birth. 
    The developing longissimus muscle of fetuses carrying the Piedmontese mutation shows an emphasis on glycolytic muscle biochemistry and a large-scale up-regulation of the translational machinery at birth. We also document evidence for timing differences in differentiation events between the two breeds. Conclusion Taken together, these findings provide a detailed description of molecular events accompanying skeletal muscle differentiation in the bovine, as well as gene expression differences that may underpin the phenotype differences between the two breeds. In addition, this study has highlighted a non-coding RNA, which is abundantly expressed and developmentally regulated in bovine fetal muscle. PMID:17697390

  18. Gene expression profiling in the hippocampus of learned helpless and nonhelpless rats.

    PubMed

    Kohen, R; Kirov, S; Navaja, G P; Happe, H Kevin; Hamblin, M W; Snoddy, J R; Neumaier, J F; Petty, F

    2005-01-01

    In the learned helplessness (LH) animal model of depression, failure to attempt escape from avoidable environmental stress, LH, indicates behavioral despair, whereas nonhelpless (NH) behavior reflects behavioral resilience to the effects of environmental stress. Comparing hippocampal gene expression with large-scale oligonucleotide microarrays, we found that stress-resilient (NH) rats, although behaviorally indistinguishable from controls, showed a distinct gene expression profile compared to LH, sham stressed, and naïve control animals. Genes that were confirmed as differentially expressed in the NH group by quantitative PCR strongly correlated in their levels of expression across all four animal groups. Differential expression could not be confirmed at the protein level. We identified several shared degenerate sequence motifs in the 3' untranslated region (3'UTR) of differentially expressed genes that could be a factor in this tight correlation of expression levels among differentially expressed genes.

  19. Transcriptome profiles link environmental variation and physiological response of Mytilus californianus between Pacific tides

    PubMed Central

    Place, Sean P.; Menge, Bruce A.; Hofmann, Gretchen E.

    2011-01-01

    Summary The marine intertidal zone is characterized by large variation in temperature, pH, dissolved oxygen and the supply of nutrients and food on seasonal and daily time scales. These oceanic fluctuations drive ecological processes such as recruitment, competition and consumer-prey interactions largely via physiological mechanisms. Thus, to understand coastal ecosystem dynamics and responses to climate change, it is crucial to understand these mechanisms. Here we utilize transcriptome analysis of the physiological response of the mussel Mytilus californianus at different spatial scales to gain insight into these mechanisms. We used mussels inhabiting different vertical locations within Strawberry Hill on Cape Perpetua, OR and Boiler Bay on Cape Foulweather, OR to study inter- and intra-site variation of gene expression. The results highlight two distinct gene expression signatures related to the cycling of metabolic activity and perturbations to cellular homeostasis. Intermediate spatial scales show a strong influence of oceanographic differences in food and stress environments between sites separated by ~65 km. Together, these new insights into environmental control of gene expression may allow understanding of important physiological drivers within and across populations. PMID:22563136

  20. A genome scale metabolic network for rice and accompanying analysis of tryptophan, auxin and serotonin biosynthesis regulation under biotic stress

    USDA-ARS's Scientific Manuscript database

    Functional annotations of large plant genome projects mostly provide information on gene function and gene families based on the presence of protein domains and gene homology, but not necessarily in association with gene expression or metabolic and regulatory networks. These additional annotations a...

  1. RNA sequencing demonstrates large-scale temporal dysregulation of gene expression in stimulated macrophages derived from MHC-defined chicken haplotypes.

    PubMed

    Irizarry, Kristopher J L; Downs, Eileen; Bryden, Randall; Clark, Jory; Griggs, Lisa; Kopulos, Renee; Boettger, Cynthia M; Carr, Thomas J; Keeler, Calvin L; Collisson, Ellen; Drechsler, Yvonne

    2017-01-01

    Discovering genetic biomarkers associated with disease resistance and enhanced immunity is critical to developing advanced strategies for controlling viral and bacterial infections in different species. Macrophages, important cells of innate immunity, are directly involved in cellular interactions with pathogens, the release of cytokines activating other immune cells and antigen presentation to cells of the adaptive immune response. IFNγ is a potent activator of macrophages and increased production has been associated with disease resistance in several species. This study characterizes the molecular basis for dramatically different nitric oxide production and immune function between the B2 and the B19 haplotype chicken macrophages. A large-scale RNA sequencing approach was employed to sequence the RNA of purified macrophages from each haplotype group (B2 vs. B19) during differentiation and after stimulation. Our results demonstrate that a large number of genes exhibit divergent expression between B2 and B19 haplotype cells both prior to and after stimulation. These differences in gene expression appear to be regulated by complex epigenetic mechanisms that need further investigation.

  2. bigSCale: an analytical framework for big-scale single-cell data.

    PubMed

    Iacono, Giovanni; Mereu, Elisabetta; Guillaumet-Adkins, Amy; Corominas, Roser; Cuscó, Ivon; Rodríguez-Esteban, Gustavo; Gut, Marta; Pérez-Jurado, Luis Alberto; Gut, Ivo; Heyn, Holger

    2018-06-01

    Single-cell RNA sequencing (scRNA-seq) has significantly deepened our insights into complex tissues, with the latest techniques capable of processing tens of thousands of cells simultaneously. Analyzing increasing numbers of cells, however, generates extremely large data sets, extending processing time and challenging computing resources. Current scRNA-seq analysis tools are not designed to interrogate large data sets and often lack sensitivity to identify marker genes. With bigSCale, we provide a scalable analytical framework to analyze millions of cells, which addresses the challenges associated with large data sets. To handle the noise and sparsity of scRNA-seq data, bigSCale uses large sample sizes to estimate an accurate numerical model of noise. The framework further includes modules for differential expression analysis, cell clustering, and marker identification. A directed convolution strategy allows processing of extremely large data sets, while preserving transcript information from individual cells. We evaluated the performance of bigSCale using both a biological model of aberrant gene expression in patient-derived neuronal progenitor cells and simulated data sets, which underlines the speed and accuracy in differential expression analysis. To test its applicability for large data sets, we applied bigSCale to assess 1.3 million cells from the mouse developing forebrain. Its directed down-sampling strategy accumulates information from single cells into index cell transcriptomes, thereby defining cellular clusters with improved resolution. Accordingly, index cell clusters identified rare populations, such as reelin (Reln)-positive Cajal-Retzius neurons, for which we report previously unrecognized heterogeneity associated with distinct differentiation stages, spatial organization, and cellular function. Together, bigSCale presents a solution to address future challenges of large single-cell data sets.
© 2018 Iacono et al.; Published by Cold Spring Harbor Laboratory Press.

  3. Predictive model for inflammation grades of chronic hepatitis B: Large-scale analysis of clinical parameters and gene expressions.

    PubMed

    Zhou, Weichen; Ma, Yanyun; Zhang, Jun; Hu, Jingyi; Zhang, Menghan; Wang, Yi; Li, Yi; Wu, Lijun; Pan, Yida; Zhang, Yitong; Zhang, Xiaonan; Zhang, Xinxin; Zhang, Zhanqing; Zhang, Jiming; Li, Hai; Lu, Lungen; Jin, Li; Wang, Jiucun; Yuan, Zhenghong; Liu, Jie

    2017-11-01

    Liver biopsy is the gold standard to assess pathological features (e.g., inflammation grades) for hepatitis B virus-infected patients although it is invasive and traumatic; meanwhile, several gene profiles of chronic hepatitis B (CHB) have been separately described in relatively small hepatitis B virus (HBV)-infected samples. We aimed to analyse correlations among inflammation grades, gene expressions and clinical parameters (serum alanine amino transaminase, aspartate amino transaminase and HBV-DNA) in large-scale CHB samples and to predict inflammation grades by using clinical parameters and/or gene expressions. We analysed gene expressions with three clinical parameters in 122 CHB samples by an improved regression model. Principal component analysis and machine-learning methods including Random Forest, K-nearest neighbour and support vector machine were used for analysis and further diagnosis models. Six normal samples were used to validate the predictive model. Significant genes related to clinical parameters were found to be enriched in the immune system, interferon-stimulated, regulation of cytokine production, anti-apoptosis, etc. A panel of these genes with clinical parameters can effectively predict binary classifications of inflammation grade (area under the ROC curve [AUC]: 0.88, 95% confidence interval [CI]: 0.77-0.93), validated by normal samples. A panel with only clinical parameters was also valuable (AUC: 0.78, 95% CI: 0.65-0.86), indicating that a liquid biopsy method for detecting the pathology of CHB is possible. This is the first study to systematically elucidate the relationships among gene expressions, clinical parameters and pathological inflammation grades in CHB, and to build models predicting inflammation grades by gene expressions and/or clinical parameters as well. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  4. Cellular Factors Shape 3D Genome Landscape

    Cancer.gov

    Researchers, using novel large-scale imaging technology, have mapped the spatial location of individual genes in the nucleus of human cells and identified 50 cellular factors required for the proper 3D positioning of genes. These spatial locations play important roles in gene expression, DNA repair, genome stability, and other cellular activities.

  5. Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.

    PubMed Central

    Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W

    1996-01-01

    Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. PMID:8855227

  6. The opportunities and challenges of large-scale molecular approaches to songbird neurobiology

    PubMed Central

    Mello, C.V.; Clayton, D.F.

    2014-01-01

    High-throughput methods for analyzing genome structure and function are having a large impact in songbird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curations, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907

  7. The Plant Genome Integrative Explorer Resource: PlantGenIE.org.

    PubMed

    Sundell, David; Mannapperuma, Chanaka; Netotea, Sergiu; Delhomme, Nicolas; Lin, Yao-Cheng; Sjödin, Andreas; Van de Peer, Yves; Jansson, Stefan; Hvidsten, Torgeir R; Street, Nathaniel R

    2015-12-01

    Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  8. Principles of gene microarray data analysis.

    PubMed

    Mocellin, Simone; Rossi, Carlo Riccardo

    2007-01-01

    The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.

  9. Reduction of adenovirus E1A mRNA by RNAi results in enhanced recombinant protein expression in transiently transfected HEK293 cells.

    PubMed

    Hacker, David L; Bertschinger, Martin; Baldi, Lucia; Wurm, Florian M

    2004-10-27

    Human embryonic kidney 293 (HEK293) cells, a widely used host for large-scale transient expression of recombinant proteins, are transformed with the adenovirus E1A and E1B genes. Because the E1A proteins function as transcriptional activators or repressors, they may have a positive or negative effect on transient transgene expression in this cell line. Suspension cultures of HEK293 EBNA (HEK293E) cells were co-transfected with a reporter plasmid expressing the GFP gene and a plasmid expressing a short hairpin RNA (shRNA) targeting the E1A mRNAs for degradation by RNA interference (RNAi). The presence of the shRNA in HEK293E cells reduced the steady state level of E1A mRNA up to 75% and increased transient GFP expression from either the elongation factor-1alpha (EF-1alpha) promoter or the human cytomegalovirus (HCMV) immediate early promoter up to twofold. E1A mRNA depletion also resulted in a twofold increase in transient expression of a recombinant IgG in both small- and large-scale suspension cultures when the IgG light and heavy chain genes were controlled by the EF-1alpha promoter. Finally, transient IgG expression was enhanced 2.5-fold when the anti-E1A shRNA was expressed from the same vector as the IgG light chain gene. These results demonstrated that E1A has a negative effect on transient gene expression in HEK293E cells, and they established that RNAi can be used to enhance recombinant protein expression in mammalian cells.

  10. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    PubMed Central

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S.

    2016-01-01

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: Hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. 
Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain. PMID:27933038

  11. Genome-scale gene expression characteristics define the follicular initiation and developmental rules during folliculogenesis.

    PubMed

    Shi, Kerong; He, Feng; Yuan, Xuefeng; Zhao, Yaofeng; Deng, Xuemei; Hu, Xiaoxiang; Li, Ning

    2013-08-01

    The ovarian follicle supplies a unique dynamic system for gametes that ensures the propagation of the species. During folliculogenesis, the vast majority of the germ cells are lost or inactivated because of ovarian follicle atresia, resulting in diminished reproductive potency and potential infertility. Understanding the underlying molecular mechanism of folliculogenesis rules is essential. Primordial (P), preantral (M), and large antral (L) porcine follicles were used to reveal their genome-wide gene expression profiles. Results indicate that primordial follicles (P) possess a diverse gene expression pattern compared to growing follicles (M and L). The 5,548 differentially expressed genes display a similar expression mode in M and L, with a correlation coefficient of 0.892. The number of regulated (both up and down) genes in M is more than that in L. Also, their regulation folds in M (2-364-fold) are much more acute than in L (2-75-fold). Differentially expressed gene groups with different regulation patterns in certain follicular stages are identified and presumed to be closely related following follicular developmental rules. Interestingly, functional annotation analysis revealed that these gene groups feature distinct biological processes or molecular functions. Moreover, representative candidate genes from these gene groups have had their RNA or protein expressions within follicles confirmed. Our study emphasized genome-scale gene expression characteristics, which provide novel entry points for understanding the folliculogenesis rules on the molecular level, such as follicular initiation, atresia, and dominance. Transcriptional regulatory circuitries in certain follicular stages are expected to be found among the identified differentially expressed gene groups.

  12. Comparisons between Arabidopsis thaliana and Drosophila melanogaster in relation to Coding and Noncoding Sequence Length and Gene Expression

    PubMed Central

    Caldwell, Rachel; Lin, Yan-Xia; Zhang, Ren

    2015-01-01

    There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length. PMID:26114098
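
    The quantile regression named in this abstract can be illustrated with a small sketch. The abstract does not say which software or model specification the authors used, so everything below is an assumption made for illustration: a single predictor (sequence length), a straight-line model, and a brute-force fit that relies on the known property that an optimal two-parameter quantile-regression line passes through at least two data points. The function names are hypothetical.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (residual < 0))


def fit_quantile_line(x, y, tau):
    """Brute-force fit of y ~ intercept + slope * x at quantile tau.

    Tries every line through two data points and keeps the one with the
    smallest total pinball loss. Cubic in the number of points, so this
    is only suitable for small illustrative data sets.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a + b * x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(
            pinball_loss(yk - (intercept + slope * xk), tau)
            for xk, yk in zip(x, y)
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


# Hypothetical data: sequence length (kb) against log expression level.
lengths = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
expression = [8.1, 7.4, 7.6, 6.2, 5.9, 4.8]
for tau in (0.25, 0.5, 0.75):
    print(tau, fit_quantile_line(lengths, expression, tau))
```

    Fitting several quantiles rather than only the mean is what lets such an analysis ask whether the length-expression relationship differs between weakly and strongly expressed genes.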

  13. Transcriptome sequencing and annotation of the halophytic microalga Dunaliella salina

    PubMed Central

    Hong, Ling; Liu, Jun-li; Midoun, Samira Z.; Miller, Philip C.

    2017-01-01

    The unicellular green alga Dunaliella salina is well adapted to salt stress and contains compounds (including β-carotene and vitamins) with potential commercial value. A large transcriptome database of D. salina during the adjustment, exponential and stationary growth phases was generated using a high throughput sequencing platform. We characterized the metabolic processes in D. salina with a focus on valuable metabolites, with the aim of manipulating D. salina to achieve greater economic value in large-scale production through a bioengineering strategy. Gene expression profiles under salt stress verified using quantitative polymerase chain reaction (qPCR) implied that salt can regulate the expression of key genes. This study generated a substantial fraction of D. salina transcriptional sequences for the entire growth cycle, providing a basis for the discovery of novel genes. This first full-scale transcriptome study of D. salina establishes a foundation for further comparative genomic studies. PMID:28990374

  14. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution

    PubMed Central

    Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.

    2015-01-01

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392

  15. The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection.

    PubMed

    Tang, Zaixiang; Shen, Yueping; Zhang, Xinyan; Yi, Nengjun

    2017-01-01

    Large-scale "omics" data have been increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. However, there are considerable challenges in analyzing high-dimensional molecular data, including the large number of potential molecular predictors, limited number of samples, and small effect of each predictor. We propose new Bayesian hierarchical generalized linear models, called spike-and-slab lasso GLMs, for prognostic prediction and detection of associated genes using large-scale molecular data. The proposed model employs a spike-and-slab mixture double-exponential prior for coefficients that can induce weak shrinkage on large coefficients, and strong shrinkage on irrelevant coefficients. We have developed a fast and stable algorithm to fit large-scale hierarchical GLMs by incorporating expectation-maximization (EM) steps into the fast cyclic coordinate descent algorithm. The proposed approach integrates nice features of two popular methods, i.e., penalized lasso and Bayesian spike-and-slab variable selection. The performance of the proposed method is assessed via extensive simulation studies. The results show that the proposed approach can provide not only more accurate estimates of the parameters, but also better prediction. We demonstrate the proposed procedure on two cancer data sets: a well-known breast cancer data set consisting of 295 tumors, and expression data of 4919 genes; and the ovarian cancer data set from TCGA with 362 tumors, and expression data of 5336 genes. Our analyses show that the proposed procedure can generate powerful models for predicting outcomes and detecting associated genes. The methods have been implemented in a freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/). Copyright © 2017 by the Genetics Society of America.
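
    The adaptive shrinkage described in this abstract can be sketched numerically. The code below is not the BhGLM implementation and does not use its API; it assumes the common spike-and-slab lasso formulation in which a coefficient follows a double-exponential prior with a small "spike" scale s0 or a large "slab" scale s1, mixed with prior inclusion probability theta. The scale values, theta, and all function names are hypothetical choices for illustration.

```python
import math


def double_exponential_pdf(beta, scale):
    """Density of a zero-centred double-exponential (Laplace) distribution."""
    return math.exp(-abs(beta) / scale) / (2.0 * scale)


def inclusion_probability(beta, s0, s1, theta):
    """Probability that beta came from the slab (scale s1) rather than
    the spike (scale s0), given prior inclusion probability theta."""
    slab = theta * double_exponential_pdf(beta, s1)
    spike = (1.0 - theta) * double_exponential_pdf(beta, s0)
    return slab / (slab + spike)


def effective_penalty(beta, s0=0.05, s1=1.0, theta=0.5):
    """Expected inverse scale, i.e. the lasso-type penalty weight that an
    EM-style update would apply to this coefficient (assumed form)."""
    p = inclusion_probability(beta, s0, s1, theta)
    return (1.0 - p) / s0 + p / s1


# Coefficients near zero receive a heavy penalty, large ones a light penalty.
for beta in (0.0, 0.1, 0.5, 2.0):
    print(beta, round(effective_penalty(beta), 3))
```

    Under these assumptions the penalty falls from close to 1/s0 for coefficients near zero toward 1/s1 for large coefficients, which is the "strong shrinkage on irrelevant coefficients, weak shrinkage on large coefficients" behaviour the abstract describes.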

  16. Virus-Induced Gene Silencing Using Tobacco Rattle Virus as a Tool to Study the Interaction between Nicotiana attenuata and Rhizophagus irregularis.

    PubMed

    Groten, Karin; Pahari, Nabin T; Xu, Shuqing; Miloradovic van Doorn, Maja; Baldwin, Ian T

    2015-01-01

    Most land plants live in a symbiotic association with arbuscular mycorrhizal fungi (AMF) that belong to the phylum Glomeromycota. Although a number of plant genes involved in the plant-AMF interactions have been identified by analyzing mutants, the ability to rapidly manipulate gene expression to study the potential functions of new candidate genes remains unrealized. We analyzed changes in gene expression of wild tobacco roots (Nicotiana attenuata) after infection with mycorrhizal fungi (Rhizophagus irregularis) by serial analysis of gene expression (SuperSAGE) combined with next generation sequencing, and established a virus-induced gene-silencing protocol to study the function of candidate genes in the interaction. From 92,434 SuperSAGE Tag sequences, 32,808 (35%) matched with our in-house Nicotiana attenuata transcriptome database and 3,698 (4%) matched to Rhizophagus genes. In total, 11,194 Tags showed a significant change in expression (p<0.05, >2-fold change) after infection. When comparing the functions of highly up-regulated annotated Tags in this study with those of two previous large-scale gene expression studies, 18 gene functions were found to be up-regulated in all three studies mainly playing roles related to phytohormone metabolism, catabolism and defense. To validate the function of identified candidate genes, we used the technique of virus-induced gene silencing (VIGS) to silence the expression of three putative N. attenuata genes: germin-like protein, indole-3-acetic acid-amido synthetase GH3.9 and, as a proof-of-principle, calcium and calmodulin-dependent protein kinase (CCaMK). The silencing of the three plant genes in roots was successful, but only CCaMK silencing had a significant effect on the interaction with R. irregularis. Interestingly, when a highly activated inoculum was used for plant inoculation, the effect of CCaMK silencing on fungal colonization was masked, probably due to trans-complementation. 
This study demonstrates that large-scale gene expression studies across different species induce a core set of genes of similar functions. However, additional factors seem to influence the overall pattern of gene expression, resulting in high variability among independent studies with different hosts. We conclude that VIGS is a powerful tool with which to investigate the function of genes involved in plant-AMF interactions but that inoculum strength can strongly influence the outcome of the interaction.
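
    The tag-selection criterion quoted in this abstract (p<0.05 and >2-fold change) can be written as a short filter. The sketch below is an assumption-laden illustration, not the authors' pipeline: it treats the fold change as symmetric (up or down), expects nonzero counts in both conditions, and uses invented tag names, counts, and p-values.

```python
import math


def significant_tags(tags, p_cutoff=0.05, fold_cutoff=2.0):
    """Keep tags with p < p_cutoff and more than fold_cutoff-fold change
    in either direction.

    `tags` is an iterable of (name, control_count, treated_count, p_value).
    Counts are assumed to be nonzero (for example after a pseudocount).
    """
    selected = []
    for name, control, treated, p_value in tags:
        log2_fold = math.log2(treated / control)
        if p_value < p_cutoff and abs(log2_fold) > math.log2(fold_cutoff):
            selected.append((name, log2_fold))
    return selected


# Hypothetical tags: (name, control count, mycorrhizal count, p-value).
example = [
    ("tagA", 10, 50, 0.01),   # up, significant
    ("tagB", 10, 15, 0.01),   # change too small
    ("tagC", 40, 10, 0.001),  # down, significant
    ("tagD", 10, 100, 0.2),   # large change, not significant
]
print(significant_tags(example))
```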

  17. Reverse engineering and analysis of large genome-scale gene networks

    PubMed Central

    Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas

    2013-01-01

    Reverse engineering the whole-genome networks of complex multicellular organisms continues to remain a challenge. While simpler models easily scale to large numbers of genes and gene expression datasets, more accurate models are compute intensive, limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software Gene Network Analyzer (GeNA) for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
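
    The idea of scoring a gene pair by mutual information and judging it with a permutation test can be sketched in a few lines. This is not TINGe's algorithm: the abstract describes a B-spline-based MI estimator and a direct permutation-testing scheme, whereas the sketch below substitutes plain equal-width histogram binning and naive reshuffling. The bin count, permutation count, and function names are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(values, bins=4):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of MI (in nats) between two expression profiles."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def permutation_p_value(x, y, permutations=200, bins=4, seed=0):
    """Observed MI and the fraction of shuffled profiles scoring as high."""
    rng = random.Random(seed)
    observed = mutual_information(x, y, bins)
    shuffled = list(y)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if mutual_information(x, shuffled, bins) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical profiles for two genes measured across 40 arrays.
gene_a = [float(i) for i in range(40)]
gene_b = [2.0 * v + 1.0 for v in gene_a]
print(permutation_p_value(gene_a, gene_b))
```

    An edge would be kept in the inferred network only when the p-value falls below a chosen threshold; applying this to every gene pair is the quadratic cost that motivates the parallel implementation the abstract reports.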

  18. Evolution of Synonymous Codon Usage in Neurospora tetrasperma and Neurospora discreta

    PubMed Central

    Whittle, C. A.; Sun, Y.; Johannesson, H.

    2011-01-01

    Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems. PMID:21402862
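
    Relative synonymous codon usage (RSCU), the statistic referred to in this abstract, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for a set of coding sequences using the standard genetic code. It is a generic illustration, not the authors' pipeline; the function name `rscu` and the input format (a list of in-frame CDS strings) are assumptions.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds_sequences):
    """Return {codon: RSCU} for every sense codon whose amino acid occurs
    at least once in the input. Sequences are assumed to be in frame."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:          # ignores codons with N or other symbols
                counts[codon] += 1

    families = defaultdict(list)              # amino acid -> its synonymous codons
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    result = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if aa == "*" or total == 0:           # skip stop codons and unseen amino acids
            continue
        for c in codons:
            # observed count / (family total / family size)
            result[c] = counts[c] * len(codons) / total
    return result
```

    Comparing RSCU values between highly and lowly expressed gene sets is one common way to nominate candidate optimal codons; the statistical test used for that comparison is not shown here.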

  19. Upper airway gene expression in smokers: the mouth as a "window to the soul" of lung carcinogenesis?

    PubMed

    Spira, Avrum

    2010-03-01

    This perspective on Boyle et al. (beginning on page 266 in this issue of the journal) explores transcriptomic profiling of upper airway epithelium as a biomarker of host response to tobacco smoke exposure. Boyle et al. have shown a striking relationship between smoking-related gene expression changes in the mouth and bronchus. This relationship suggests that buccal gene expression may serve as a relatively noninvasive surrogate marker of the physiologic response of the lung to tobacco smoke that could be used in large-scale screening and chemoprevention studies for lung cancer.

  20. Developmental transcriptional profiling reveals key insights into Triticeae reproductive development.

    PubMed

    Tran, Frances; Penniket, Carolyn; Patel, Rohan V; Provart, Nicholas J; Laroche, André; Rowland, Owen; Robert, Laurian S

    2013-06-01

    Despite their importance, there remains a paucity of large-scale gene expression-based studies of reproductive development in species belonging to the Triticeae. As a first step to address this deficiency, a gene expression atlas of triticale reproductive development was generated using the 55K Affymetrix GeneChip® wheat genome array. The global transcriptional profiles of the anther/pollen, ovary and stigma were analyzed at concurrent developmental stages, and co-expressed as well as preferentially expressed genes were identified. Data analysis revealed both novel and conserved regulatory factors underlying Triticeae floral development and function. This comprehensive resource rests upon detailed gene annotations, and the expression profiles are readily accessible via a web browser. © 2013 Her Majesty the Queen in Right of Canada as represented by the Minister of Agriculture and Agri-Food Canada.

  1. A powerful nonparametric method for detecting differentially co-expressed genes: distance correlation screening and edge-count test.

    PubMed

    Zhang, Qingyang

    2018-05-16

    Differential co-expression analysis, as a complement of differential expression analysis, offers significant insights into the changes in molecular mechanism of different phenotypes. A prevailing approach to detecting differentially co-expressed genes is to compare Pearson's correlation coefficients in two phenotypes. However, due to the limitations of Pearson's correlation measure, this approach lacks the power to detect nonlinear changes in gene co-expression, which are common in gene regulatory networks. In this work, a new nonparametric procedure is proposed to search for differentially co-expressed gene pairs in different phenotypes from large-scale data. Our computational pipeline consists of two main steps, a screening step and a testing step. The screening step reduces the search space by filtering out all the independent gene pairs using the distance correlation measure. In the testing step, we compare the gene co-expression patterns in different phenotypes by a recently developed edge-count test. Both steps are distribution-free and target nonlinear relations. We illustrate the promise of the new approach by analyzing the Cancer Genome Atlas data and the METABRIC data for breast cancer subtypes. Compared with some existing methods, the new method is more powerful in detecting nonlinear types of differential co-expression. The distance correlation screening can greatly improve computational efficiency, facilitating its application to large data sets.
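
    The screening step described above relies on distance correlation, which is zero (in the population) only under independence and so can pick up nonlinear dependence that Pearson's correlation misses. The following is a minimal sketch of the sample distance correlation computed from double-centered distance matrices. It covers only the dependence measure, not the edge-count test or the author's screening thresholds, and the function names are illustrative assumptions.

```python
import numpy as np

def _centered_distances(v):
    """Pairwise Euclidean distance matrix of the samples, double-centered
    (row means and column means removed, grand mean added back)."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    d = np.sqrt(((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation in [0, 1] between two paired samples."""
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = (a * b).mean()                      # squared distance covariance
    dvar2_x, dvar2_y = (a * a).mean(), (b * b).mean()   # squared distance variances
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0:                              # a constant sample carries no signal
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))
```

    For a gene pair related by, say, a symmetric quadratic curve, Pearson's correlation is close to zero while the distance correlation stays clearly positive. Note that this direct implementation stores an n-by-n matrix per variable, so it suits moderate sample sizes.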

  2. Effects of seawater acidification on gene expression: resolving broader-scale trends in sea urchins.

    PubMed

    Evans, Tyler G; Watson-Wynn, Priscilla

    2014-06-01

    Sea urchins are ecologically and economically important calcifying organisms threatened by acidification of the global ocean caused by anthropogenic CO2 emissions. Propelled by the sequencing of the purple sea urchin (Strongylocentrotus purpuratus) genome, profiling changes in gene expression during exposure to high pCO2 seawater has emerged as a powerful and increasingly common method to infer the response of urchins to ocean change. However, analyses of gene expression are sensitive to experimental methodology, and comparisons between studies of genes regulated by ocean acidification are most often made in the context of major caveats. Here we perform meta-analyses as a means of minimizing experimental discrepancies and resolving broader-scale trends regarding the effects of ocean acidification on gene expression in urchins. Analyses across eight studies and four urchin species largely support prevailing hypotheses about the impact of ocean acidification on marine calcifiers. The predominant expression pattern involved the down-regulation of genes within energy-producing pathways, a clear indication of metabolic depression. Genes with functions in ion transport were significantly over-represented and are most plausibly contributing to intracellular pH regulation. Expression profiles provided extensive evidence for an impact on biomineralization, epitomized by the down-regulation of seven spicule matrix proteins. In contrast, expression profiles provided limited evidence for CO2-mediated developmental delay or induction of a cellular stress response. Congruence between studies of gene expression and the ocean acidification literature in general validates the accuracy of gene expression in predicting the consequences of ocean change and justifies its continued use in future studies. © 2014 Marine Biological Laboratory.

  3. Systems biology of embryonic development: Prospects for a complete understanding of the Caenorhabditis elegans embryo.

    PubMed

    Murray, John Isaac

    2018-05-01

    The convergence of developmental biology and modern genomics tools brings the potential for a comprehensive understanding of developmental systems. This is especially true for the Caenorhabditis elegans embryo because its small size, invariant developmental lineage, and powerful genetic and genomic tools provide the prospect of a cellular resolution understanding of messenger RNA (mRNA) expression and regulation across the organism. We describe here how a systems biology framework might allow large-scale determination of the embryonic regulatory relationships encoded in the C. elegans genome. This framework consists of two broad steps: (a) defining the "parts list" (all genes expressed in all cells at each time during development) and (b) iterative steps of computational modeling and refinement of these models by experimental perturbation. Substantial progress has been made towards defining the parts list through imaging methods such as large-scale green fluorescent protein (GFP) reporter analysis. Imaging results are now being augmented by high-resolution transcriptome methods such as single-cell RNA sequencing, and it is likely the complete expression patterns of all genes across the embryo will be known within the next few years. In contrast, the modeling and perturbation experiments performed so far have focused largely on individual cell types or genes, and improved methods will be needed to expand them to the full genome and organism. This emerging comprehensive map of embryonic expression and regulatory function will provide a powerful resource for developmental biologists, and would also allow scientists to ask questions not accessible without a comprehensive picture. This article is categorized under: Invertebrate Organogenesis > Worms Technologies > Analysis of the Transcriptome Gene Expression and Transcriptional Hierarchies > Gene Networks and Genomics. © 2018 Wiley Periodicals, Inc.

  4. A gene expression resource generated by genome-wide lacZ profiling in the mouse

    PubMed Central

    Tuck, Elizabeth; Estabel, Jeanne; Oellrich, Anika; Maguire, Anna Karin; Adissu, Hibret A.; Souter, Luke; Siragher, Emma; Lillistone, Charlotte; Green, Angela L.; Wardle-Jones, Hannah; Carragher, Damian M.; Karp, Natasha A.; Smedley, Damian; Adams, Niels C.; Bussell, James N.; Adams, David J.; Ramírez-Solis, Ramiro; Steel, Karen P.; Galli, Antonella; White, Jacqueline K.

    2015-01-01

    Knowledge of the expression profile of a gene is a critical piece of information required to build an understanding of the normal and essential functions of that gene and any role it may play in the development or progression of disease. High-throughput, large-scale efforts are on-going internationally to characterise reporter-tagged knockout mouse lines. As part of that effort, we report an open access adult mouse expression resource, in which the expression profile of 424 genes has been assessed in up to 47 different organs, tissues and sub-structures using a lacZ reporter gene. Many specific and informative expression patterns were noted. Expression was most commonly observed in the testis and brain and was most restricted in white adipose tissue and mammary gland. Over half of the assessed genes presented with an absent or localised expression pattern (categorised as 0-10 positive structures). A link between complexity of expression profile and viability of homozygous null animals was observed; inactivation of genes expressed in ≥21 structures was more likely to result in reduced viability by postnatal day 14 compared with more restricted expression profiles. For validation purposes, this mouse expression resource was compared with Bgee, a federated composite of RNA-based expression data sets. Strong agreement was observed, indicating a high degree of specificity in our data. Furthermore, there were 1207 observations of expression of a particular gene in an anatomical structure where Bgee had no data, indicating a large amount of novelty in our data set. Examples of expression data corroborating and extending genotype-phenotype associations and supporting disease gene candidacy are presented to demonstrate the potential of this powerful resource. PMID:26398943

  5. Analysis of host response to bacterial infection using error model based gene expression microarray experiments

    PubMed Central

    Stekel, Dov J.; Sarti, Donatella; Trevino, Victor; Zhang, Lihong; Salmon, Mike; Buckley, Chris D.; Stevens, Mark; Pallen, Mark J.; Penn, Charles; Falciani, Francesco

    2005-01-01

    A key step in the analysis of microarray data is the selection of genes that are differentially expressed. Ideally, such experiments should be properly replicated in order to infer both technical and biological variability, and the data should be subjected to rigorous hypothesis tests to identify the differentially expressed genes. However, in microarray experiments involving the analysis of very large numbers of biological samples, replication is not always practical. Therefore, there is a need for a method to select differentially expressed genes in a rational way from insufficiently replicated data. In this paper, we describe a simple method that uses bootstrapping to generate an error model from a replicated pilot study that can be used to identify differentially expressed genes in subsequent large-scale studies on the same platform, but in which there may be no replicated arrays. The method builds a stratified error model that includes array-to-array variability, feature-to-feature variability and the dependence of error on signal intensity. We apply this model to the characterization of the host response in a model of bacterial infection of human intestinal epithelial cells. We demonstrate the effectiveness of error model based microarray experiments and propose this as a general strategy for a microarray-based screening of large collections of biological samples. PMID:15800204
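
    The idea of learning an intensity-dependent error model from a replicated pilot study and reusing it on unreplicated arrays can be sketched as follows. This is a deliberately simplified illustration, not the authors' stratified model: it stratifies only by mean log-intensity, bootstraps the upper quantile of absolute replicate differences within each stratum, and uses that as a per-stratum threshold. All names, the number of strata and the significance level are assumptions, and each stratum is assumed to contain at least one pilot feature.

```python
import numpy as np

def _stratum_index(edges, intensity, n_strata):
    """Map each intensity to a stratum 0..n_strata-1 (out-of-range values are clipped)."""
    idx = np.searchsorted(edges, intensity, side="right") - 1
    return np.clip(idx, 0, n_strata - 1)

def fit_error_model(rep_a, rep_b, n_strata=5, n_boot=500, alpha=0.01, seed=0):
    """Fit thresholds from two replicate arrays (log-scale intensities).
    Returns (stratum edges, per-stratum threshold on |log-ratio|)."""
    rng = np.random.default_rng(seed)
    rep_a, rep_b = np.asarray(rep_a, float), np.asarray(rep_b, float)
    intensity = (rep_a + rep_b) / 2
    abs_diff = np.abs(rep_a - rep_b)            # replicate-to-replicate error
    edges = np.quantile(intensity, np.linspace(0, 1, n_strata + 1))
    strata = _stratum_index(edges, intensity, n_strata)
    thresholds = np.empty(n_strata)
    for s in range(n_strata):
        d = abs_diff[strata == s]
        # Bootstrap the (1 - alpha) quantile to stabilise the estimate.
        boots = [np.quantile(rng.choice(d, size=d.size, replace=True), 1 - alpha)
                 for _ in range(n_boot)]
        thresholds[s] = np.mean(boots)
    return edges, thresholds

def call_differential(edges, thresholds, control, treated):
    """Flag features whose |log-ratio| exceeds the threshold of their intensity stratum."""
    control, treated = np.asarray(control, float), np.asarray(treated, float)
    strata = _stratum_index(edges, (control + treated) / 2, len(thresholds))
    return np.abs(treated - control) > thresholds[strata]
```

    Because the thresholds are tied to signal intensity, noisy low-intensity features need a larger change to be called than high-intensity ones, which is the practical benefit of modelling the dependence of error on intensity.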

  6. Construction of regulatory networks using expression time-series data of a genotyped population.

    PubMed

    Yeung, Ka Yee; Dombek, Kenneth M; Lo, Kenneth; Mittler, John E; Zhu, Jun; Schadt, Eric E; Bumgarner, Roger E; Raftery, Adrian E

    2011-11-29

    The inference of regulatory and biochemical networks from large-scale genomics data is a basic problem in molecular biology. The goal is to generate testable hypotheses of gene-to-gene influences and subsequently to design bench experiments to confirm these network predictions. Coexpression of genes in large-scale gene-expression data implies coregulation and potential gene-gene interactions, but provides little information about the direction of influences. Here, we use both time-series data and genetics data to infer directionality of edges in regulatory networks: time-series data contain information about the chronological order of regulatory events and genetics data allow us to map DNA variations to variations at the RNA level. We generate microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. We develop a Bayesian model averaging regression algorithm that incorporates external information from diverse data types to infer regulatory networks from the time-series and genetics data. Our algorithm is capable of generating feedback loops. We show that our inferred network recovers existing and novel regulatory relationships. Following network construction, we generate independent microarray data on selected deletion mutants to prospectively test network predictions. We demonstrate the potential of our network to discover de novo transcription-factor binding sites. Applying our construction method to previously published data demonstrates that our method is competitive with leading network construction algorithms in the literature.

  7. Automated Protocol for Large-Scale Modeling of Gene Expression Data.

    PubMed

    Hall, Michelle Lynn; Calkins, David; Sherman, Woody

    2016-11-28

    With the continued rise of phenotypic- and genotypic-based screening projects, computational methods to analyze, process, and ultimately make predictions in this field take on growing importance. Here we show how automated machine learning workflows can produce models that are predictive of differential gene expression as a function of a compound structure using data from A673 cells as a proof of principle. In particular, we present predictive models with an average accuracy of greater than 70% across a highly diverse ∼1000 gene expression profile. In contrast to the usual in silico design paradigm, where one interrogates a particular target-based response, this work opens the opportunity for virtual screening and lead optimization for desired multitarget gene expression profiles.

  8. Query-based biclustering of gene expression data using Probabilistic Relational Models.

    PubMed

    Zhao, Hui; Cloots, Lore; Van den Bulcke, Tim; Wu, Yan; De Smet, Riet; Storms, Valerie; Meysman, Pieter; Engelen, Kristof; Marchal, Kathleen

    2011-02-15

    With the availability of large scale expression compendia it is now possible to view one's own findings in the light of what is already available and retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set, as seed genes are often assumed, but not guaranteed, to be coexpressed in the queried compendium. Therefore, we developed ProBic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that exploits the use of prior distributions to extract the information contained within the seed set. We applied ProBic on a large scale Escherichia coli compendium to extend partially described regulons with potentially novel members. We compared ProBic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets and biological relevance. This comparison shows that ProBic is able to retrieve biologically relevant, high quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds. ProBic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low quality seed sets.

  9. Evidence for a large expansion and subfunctionalisation of globin genes in sea anemones.

    PubMed

    Smith, Hayden L; Pavasovic, Ana; Surm, Joachim M; Phillips, Matthew J; Prentis, Peter J

    2018-06-27

    The globin gene superfamily has been well-characterised in vertebrates; however, there has been limited research in early-diverging lineages, such as phylum Cnidaria. This study aimed to identify globin genes in multiple cnidarian lineages, and use bioinformatic approaches to characterise the evolution, structure and expression of these genes. Phylogenetic analyses and in silico protein predictions showed that all cnidarians have undergone an expansion of globin genes, which likely have a hexacoordinate protein structure. Our protein modelling has also revealed the possibility of a single pentacoordinate globin lineage in anthozoan species. Some cnidarian globin genes displayed tissue and development specific expression with very few orthologous genes similarly expressed across species. Our phylogenetic analyses also revealed that eumetazoan globin genes form a polyphyletic relationship with vertebrate globin genes. Overall, our analyses suggest that a Ngb-like and GbX-like gene were most likely present in the globin gene repertoire for the last common ancestor of eumetazoans. The identification of a large-scale expansion and subfunctionalisation of globin genes in actiniarians provides an excellent starting point to further our understanding of the evolution and function of the globin gene superfamily in early-diverging lineages.

  10. Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering

    PubMed Central

    Sun, Peng; Speicher, Nora K.; Röttger, Richard; Guo, Jiong; Baumbach, Jan

    2014-01-01

    The explosion of the biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as ‘simultaneous clustering’ or ‘co-clustering’, has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: ‘Bi-Force’. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expression data. Brief. Bioinform., 14:279–292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de. PMID:24682815

  11. Module discovery by exhaustive search for densely connected, co-expressed regions in biomolecular interaction networks.

    PubMed

    Colak, Recep; Moser, Flavia; Chu, Jeffrey Shih-Chieh; Schönhuth, Alexander; Chen, Nansheng; Ester, Martin

    2010-10-25

    Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established that, when analyzing these two data sources jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented. We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automatized filtering procedure reduces our output which results in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search, by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples. We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets. Software and data sets are available at http://www.sfu.ca/~ester/software/DECOB.zip.

  12. Evolution and expression analysis of the grape (Vitis vinifera L.) WRKY gene family.

    PubMed

    Guo, Chunlei; Guo, Rongrong; Xu, Xiaozhao; Gao, Min; Li, Xiaoqin; Song, Junyang; Zheng, Yi; Wang, Xiping

    2014-04-01

    WRKY proteins comprise a large family of transcription factors that play important roles in plant defence regulatory networks, including responses to various biotic and abiotic stresses. To date, no large-scale study of WRKY genes has been undertaken in grape (Vitis vinifera L.). In this study, a total of 59 putative grape WRKY genes (VvWRKY) were identified and renamed on the basis of their respective chromosome distribution. A multiple sequence alignment analysis using all predicted grape WRKY genes coding sequences, together with those from Arabidopsis thaliana and tomato (Solanum lycopersicum), indicated that the 59 VvWRKY genes can be classified into three main groups (I-III). An evaluation of the duplication events suggested that several WRKY genes arose before the divergence of the grape and Arabidopsis lineages. Moreover, expression profiles derived from semiquantitative PCR and real-time quantitative PCR analyses showed distinct expression patterns in various tissues and in response to different treatments. Four VvWRKY genes showed a significantly higher expression in roots or leaves, 55 responded to varying degrees to at least one abiotic stress treatment, and the expression of 38 was altered following powdery mildew (Erysiphe necator) infection. Most VvWRKY genes were downregulated in response to abscisic acid or salicylic acid treatments, while the expression of a subset was upregulated by methyl jasmonate or ethylene treatments.

  13. Evolution and expression analysis of the grape (Vitis vinifera L.) WRKY gene family

    PubMed Central

    Guo, Chunlei; Guo, Rongrong; Wang, Xiping

    2014-01-01

    WRKY proteins comprise a large family of transcription factors that play important roles in plant defence regulatory networks, including responses to various biotic and abiotic stresses. To date, no large-scale study of WRKY genes has been undertaken in grape (Vitis vinifera L.). In this study, a total of 59 putative grape WRKY genes (VvWRKY) were identified and renamed on the basis of their respective chromosome distribution. A multiple sequence alignment analysis using all predicted grape WRKY genes coding sequences, together with those from Arabidopsis thaliana and tomato (Solanum lycopersicum), indicated that the 59 VvWRKY genes can be classified into three main groups (I–III). An evaluation of the duplication events suggested that several WRKY genes arose before the divergence of the grape and Arabidopsis lineages. Moreover, expression profiles derived from semiquantitative PCR and real-time quantitative PCR analyses showed distinct expression patterns in various tissues and in response to different treatments. Four VvWRKY genes showed a significantly higher expression in roots or leaves, 55 responded to varying degrees to at least one abiotic stress treatment, and the expression of 38 was altered following powdery mildew (Erysiphe necator) infection. Most VvWRKY genes were downregulated in response to abscisic acid or salicylic acid treatments, while the expression of a subset was upregulated by methyl jasmonate or ethylene treatments. PMID:24510937

  14. SiGN: large-scale gene network estimation environment for high performance computing.

    PubMed

    Tamada, Yoshinori; Shimamura, Teppei; Yamaguchi, Rui; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru

    2011-01-01

    Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer" which is planned to achieve 10 petaflops in 2012, and other high performance computing environments including Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1. In these three programs, five different models are available: static and dynamic nonparametric Bayesian networks, state space models, graphical Gaussian models, and vector autoregressive models. All these models require a huge amount of computational resources for estimating large-scale gene networks and therefore are designed to be able to exploit the speed of 10 petaflops. The software will be available freely for "K computer" and HGC supercomputer system users. The estimated networks can be viewed and analyzed by Cell Illustrator Online and SBiP (Systems Biology integrative Pipeline). The software project web site is available at http://sign.hgc.jp/.

  15. Gram-scale production of a basidiomycetous laccase in Aspergillus niger.

    PubMed

    Mekmouche, Yasmina; Zhou, Simeng; Cusano, Angela M; Record, Eric; Lomascolo, Anne; Robert, Viviane; Simaan, A Jalila; Rousselot-Pailley, Pierre; Ullah, Sana; Chaspoul, Florence; Tron, Thierry

    2014-01-01

    We report on the expression in Aspergillus niger of a laccase gene we used to produce variants in Saccharomyces cerevisiae. Grams of recombinant enzyme can be easily obtained. This highlights the potential of combining this generic laccase sequence with the yeast and fungal expression systems for large-scale production of variants. Copyright © 2013 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  16. General statistics of stochastic process of gene expression in eukaryotic cells.

    PubMed Central

    Kuznetsov, V A; Knott, G D; Bonner, R F

    2002-01-01

    Thousands of genes are expressed at such very low levels (≤1 copy per cell) that global gene expression analysis of rarer transcripts remains problematic. Ambiguity in identification of rarer transcripts creates considerable uncertainty in fundamental questions such as the total number of genes expressed in an organism and the biological significance of rarer transcripts. Knowing the distribution of the true number of genes expressed at each level and the corresponding gene expression level probability function (GELPF) could help resolve these uncertainties. We found that all observed large-scale gene expression data sets in yeast, mouse, and human cells follow a Pareto-like distribution model skewed by many low-abundance transcripts. A novel stochastic model of the gene expression process predicts the universality of the GELPF both across different cell types within a multicellular organism and across different organisms. This model allows us to predict the frequency distribution of all gene expression levels within a single cell and to estimate the number of expressed genes in a single cell and in a population of cells. A random "basal" transcription mechanism for protein-coding genes in all or almost all eukaryotic cell types is predicted. This fundamental mechanism might enhance the expression of rarely expressed genes and, thus, provide a basic level of phenotypic diversity, adaptability, and random monoallelic expression in cell populations. PMID:12136033

  17. Techniques for Large-Scale Bacterial Genome Manipulation and Characterization of the Mutants with Respect to In Silico Metabolic Reconstructions.

    PubMed

    diCenzo, George C; Finan, Turlough M

    2018-01-01

    The rate at which all genes within a bacterial genome can be identified far exceeds the ability to characterize these genes. To assist in associating genes with cellular functions, a large-scale bacterial genome deletion approach can be employed to rapidly screen tens to thousands of genes for desired phenotypes. Here, we provide a detailed protocol for the generation of deletions of large segments of bacterial genomes that relies on the activity of a site-specific recombinase. In this procedure, two recombinase recognition target sequences are introduced into known positions of a bacterial genome through single cross-over plasmid integration. Subsequent expression of the site-specific recombinase mediates recombination between the two target sequences, resulting in the excision of the intervening region and its loss from the genome. We further illustrate how this deletion system can be readily adapted to function as a large-scale in vivo cloning procedure, in which the region excised from the genome is captured as a replicative plasmid. We next provide a procedure for the metabolic analysis of bacterial large-scale genome deletion mutants using the Biolog Phenotype MicroArray™ system. Finally, a pipeline is described, and a sample Matlab script is provided, for the integration of the obtained data with a draft metabolic reconstruction for the refinement of the reactions and gene-protein-reaction relationships in a metabolic reconstruction.

  18. Comparative studies of gene expression and the evolution of gene regulation

    PubMed Central

    Romero, Irene Gallego; Ruvinsky, Ilya; Gilad, Yoav

    2014-01-01

    The hypothesis that differences in gene regulation play an important role in speciation and adaptation is more than 40 years old. With the advent of new sequencing technologies, we are able to characterize and study gene expression levels and associated regulatory mechanisms in a large number of individuals and species at unprecedented resolution and scale. We have thus gained new insights into the evolutionary pressures that shape gene expression levels, as well as developed an appreciation for the relative importance of evolutionary changes in different regulatory genetic and epigenetic mechanisms. The current challenge is to link gene regulatory changes to adaptive evolution of complex phenotypes. Here we mainly focus on comparative studies in primates, and how they are complemented by studies in model organisms. PMID:22705669

  19. Discrete domains of gene expression in germinal layers distinguish the development of gyrencephaly

    PubMed Central

    de Juan Romero, Camino; Bruder, Carl; Tomasello, Ugo; Sanz-Anquela, José Miguel; Borrell, Víctor

    2015-01-01

    Gyrencephalic species develop folds in the cerebral cortex in a stereotypic manner, but the genetic mechanisms underlying this patterning process are unknown. We present a large-scale transcriptomic analysis of individual germinal layers in the developing cortex of the gyrencephalic ferret, comparing between regions prospective of fold and fissure. We find unique transcriptional signatures in each germinal compartment, where thousands of genes are differentially expressed between regions, including ∼80% of genes mutated in human cortical malformations. These regional differences emerge from the existence of discrete domains of gene expression, which occur at multiple locations across the developing cortex of ferret and human, but not the lissencephalic mouse. Complex expression patterns emerge late during development and map the eventual location of folds or fissures. Protomaps of gene expression within germinal layers may contribute to define cortical folds or functional areas, but our findings demonstrate that they distinguish the development of gyrencephalic cortices. PMID:25916825

  20. Global map of physical interactions among differentially expressed genes in multiple sclerosis relapses and remissions.

    PubMed

    Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat

    2011-09-15

    Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. However, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network hub' in the case of MS patients with relapse versus remission. The statistical approaches employed in this study enabled us to report new sets of genes that according to their gene expression and physical interactions are predicted to be differentially expressed in MS versus healthy subjects, and in MS patients in relapse versus remission. Some of these genes may be useful biomarkers for diagnosing MS and predicting relapses in MS patients.

  1. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    PubMed

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  2. Gene coexpression measures in large heterogeneous samples using count statistics.

    PubMed

    Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

    2014-11-18

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
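
    The idea of counting local rank patterns can be sketched as follows. This is a simplified illustration, not the statistics defined by the authors: it slides a window of length k along two expression profiles and counts the windows in which the two genes show the same rank pattern or the exactly reversed one. The function names and the choice of contiguous windows are assumptions.

```python
def rank_pattern(values):
    """Return the rank (0 = smallest) of each value; ties broken by position."""
    order = sorted(range(len(values)), key=lambda i: (values[i], i))
    ranks = [0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return tuple(ranks)


def local_pattern_count(x, y, k=3):
    """Count length-k windows where x and y share a rank pattern.

    A window also counts when the pattern of y is the exact reverse of
    the pattern of x, so negative local association is captured too.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    matches = 0
    for start in range(len(x) - k + 1):
        px = rank_pattern(x[start:start + k])
        py = rank_pattern(y[start:start + k])
        reversed_py = tuple(k - 1 - r for r in py)
        if px == py or px == reversed_py:
            matches += 1
    return matches


# Two hypothetical profiles that agree only over the first few samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 2.5, 0.5, 3.5]
gene_b = [0.2, 0.4, 0.6, 0.8, 0.9, 1.0, 0.1]
print(local_pattern_count(gene_a, gene_b, k=3))
```

    Because only ranks within each window are compared, a single extreme value can change at most the k windows that contain it, which is one way to read the robustness to outliers mentioned in the abstract.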

  3. Activation of the alpha-globin gene expression correlates with dramatic upregulation of nearby non-globin genes and changes in local and large-scale chromatin spatial structure.

    PubMed

    Ulianov, Sergey V; Galitsyna, Aleksandra A; Flyamer, Ilya M; Golov, Arkadiy K; Khrameeva, Ekaterina E; Imakaev, Maxim V; Abdennur, Nezar A; Gelfand, Mikhail S; Gavrilov, Alexey A; Razin, Sergey V

    2017-07-11

    In homeotherms, the alpha-globin gene clusters are located within permanently open genome regions enriched in housekeeping genes. Terminal erythroid differentiation results in dramatic upregulation of alpha-globin genes making their expression comparable to the rRNA transcriptional output. Little is known about the influence of the erythroid-specific alpha-globin gene transcription outburst on adjacent, widely expressed genes and large-scale chromatin organization. Here, we have analyzed the total transcription output, the overall chromatin contact profile, and CTCF binding within the 2.7 Mb segment of chicken chromosome 14 harboring the alpha-globin gene cluster in cultured lymphoid cells and cultured erythroid cells before and after induction of terminal erythroid differentiation. We found that, similarly to mammalian genomes, the chicken genome is organized in TADs and compartments. Full activation of the alpha-globin gene transcription in differentiated erythroid cells is correlated with upregulation of several adjacent housekeeping genes and the emergence of abundant intergenic transcription. An extended chromosome region encompassing the alpha-globin cluster becomes significantly decompacted in differentiated erythroid cells, and depleted in CTCF binding and CTCF-anchored chromatin loops, while the sub-TAD harboring the alpha-globin gene cluster and the upstream major regulatory element (MRE) becomes highly enriched with chromatin interactions as compared to lymphoid and proliferating erythroid cells. The alpha-globin gene domain and the neighboring loci reside within the A-like chromatin compartment in both lymphoid and erythroid cells and become further segregated from the upstream gene desert upon terminal erythroid differentiation. Our findings demonstrate that the effects of tissue-specific transcription activation are not restricted to the host genomic locus but affect the overall chromatin structure and transcriptional output of the encompassing topologically associating domain.

  4. Large-scale analysis of antisense transcription in wheat using the Affymetrix GeneChip Wheat Genome Array

    USDA-ARS's Scientific Manuscript database

    Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded) or a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation....

  5. Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots

    USDA-ARS's Scientific Manuscript database

    Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the compu...

  6. Common and specific signatures of gene expression and protein-protein interactions in autoimmune diseases.

    PubMed

    Tuller, T; Atar, S; Ruppin, E; Gurevich, M; Achiron, A

    2013-03-01

    The aim of this study is to understand intracellular regulatory mechanisms in peripheral blood mononuclear cells (PBMCs), which are either common to many autoimmune diseases or specific to some of them. We incorporated large-scale data such as protein-protein interactions, gene expression and demographical information of hundreds of patients and healthy subjects, related to six autoimmune diseases with available large-scale gene expression measurements: multiple sclerosis (MS), systemic lupus erythematosus (SLE), juvenile rheumatoid arthritis (JRA), Crohn's disease (CD), ulcerative colitis (UC) and type 1 diabetes (T1D). These data were analyzed concurrently by statistical and systems biology approaches tailored for this purpose. We found that chemokines such as CXCL1-3, 5, 6 and the interleukin (IL) IL8 tend to be differentially expressed in PBMCs of patients with the analyzed autoimmune diseases. In addition, the anti-apoptotic gene BCL3, interferon-γ (IFNG), and the vitamin D receptor (VDR) gene physically interact with significantly many genes that tend to be differentially expressed in PBMCs of patients with the analyzed autoimmune diseases. In general, similar cellular processes tend to be differentially expressed in PBMC in the analyzed autoimmune diseases. Specifically, the cellular processes related to cell proliferation (for example, epidermal growth factor, platelet-derived growth factor, nuclear factor-κB, Wnt/β-catenin signaling, stress-activated protein kinase c-Jun NH2-terminal kinase), inflammatory response (for example, interleukins IL2 and IL6, the cytokine granulocyte-macrophage colony-stimulating factor and the B-cell receptor), general signaling cascades (for example, mitogen-activated protein kinase, extracellular signal-regulated kinase, p38 and TRK) and apoptosis are activated in most of the analyzed autoimmune diseases. However, our results suggest that in each of the analyzed diseases, apoptosis and chemotaxis are activated via different subsignaling pathways. Analyses of the expression levels of dozens of genes and the protein-protein interactions among them demonstrated that CD and UC have relatively similar gene expression signatures, whereas the gene expression signatures of T1D and JRA relatively differ from the signatures of the other autoimmune diseases. These diseases are the only ones activated via the Fcɛ pathway. The relevant genes and pathways reported in this study are discussed at length, and may be helpful in the diagnoses and understanding of autoimmunity and/or specific autoimmune diseases.

  7. Relationship between gene expression and GC-content in mammals: statistical significance and biological relevance.

    PubMed

    Sémon, Marie; Mouchiroud, Dominique; Duret, Laurent

    2005-02-01

    Mammalian chromosomes are characterized by large-scale variations of DNA base composition (the so-called isochores). In contradiction with previous studies, Lercher et al. (Hum. Mol. Genet., 12, 2411, 2003) recently reported a strong correlation between gene expression breadth and GC-content, suggesting that there might be a selective pressure favoring the concentration of housekeeping genes in GC-rich isochores. We reassessed this issue by examining in human and mouse the correlation between gene expression and GC-content, using different measures of gene expression (EST, SAGE and microarray) and different measures of GC-content. We show that correlations between GC-content and expression are very weak, and may vary according to the method used to measure expression. Such weak correlations have a very low predictive value. The strong correlations reported by Lercher et al. (2003) are because of the fact that they measured variables over neighboring genes windows. We show here that using gene windows artificially enhances the correlation. The assertion that the expression of a given gene depends on the GC-content of the region where it is located is therefore not supported by the data.
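
    The window effect described above can be reproduced in a toy simulation. The sketch assumes that neighboring genes share a regional component (a stand-in for isochore-scale GC variation) that contributes only weakly to expression; all parameter values and function names are hypothetical and are not drawn from the study's data.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_regions=100, genes_per_region=50, window=10, seed=0):
    """Return (gene-level r, window-level r) for simulated GC and expression."""
    rng = random.Random(seed)
    gc, expr = [], []
    for _ in range(n_regions):
        regional = rng.gauss(0, 1)  # component shared by neighboring genes
        for _ in range(genes_per_region):
            gc.append(regional + rng.gauss(0, 1))
            # Expression depends only weakly on the regional component.
            expr.append(0.3 * regional + rng.gauss(0, 1))
    # Average both variables over non-overlapping windows of neighbors.
    starts = range(0, len(gc), window)
    gc_w = [sum(gc[i:i + window]) / window for i in starts]
    expr_w = [sum(expr[i:i + window]) / window for i in starts]
    return pearson(gc, expr), pearson(gc_w, expr_w)


r_gene, r_window = simulate()
print(f"gene-level r = {r_gene:.2f}, window-level r = {r_window:.2f}")
```

    Averaging over windows removes most of the gene-specific noise while keeping the shared regional signal, so the window-level coefficient comes out larger even though the per-gene predictive value is unchanged.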

  8. Gene expression of Caenorhabditis elegans neurons carries information on their synaptic connectivity.

    PubMed

    Kaufman, Alon; Dror, Gideon; Meilijson, Isaac; Ruppin, Eytan

    2006-12-08

    The claim that genetic properties of neurons significantly influence their synaptic network structure is a common notion in neuroscience. The nematode Caenorhabditis elegans provides an exciting opportunity to approach this question in a large-scale quantitative manner. Its synaptic connectivity network has been identified, and, combined with cellular studies, we currently have characteristic connectivity and gene expression signatures for most of its neurons. By using two complementary analysis assays we show that the expression signature of a neuron carries significant information about its synaptic connectivity signature, and identify a list of putative genes predicting neural connectivity. The current study rigorously quantifies the relation between gene expression and synaptic connectivity signatures in the C. elegans nervous system and identifies subsets of neurons where this relation is highly marked. The results presented and the genes identified provide a promising starting point for further, more detailed computational and experimental investigations.

  9. Large-scale gene function analysis with the PANTHER classification system.

    PubMed

    Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D

    2013-08-01

    The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.

  10. Rare Cell Detection by Single-Cell RNA Sequencing as Guided by Single-Molecule RNA FISH.

    PubMed

    Torre, Eduardo; Dueck, Hannah; Shaffer, Sydney; Gospocic, Janko; Gupte, Rohit; Bonasio, Roberto; Kim, Junhyong; Murray, John; Raj, Arjun

    2018-02-28

    Although single-cell RNA sequencing can reliably detect large-scale transcriptional programs, it is unclear whether it accurately captures the behavior of individual genes, especially those that express only in rare cells. Here, we use single-molecule RNA fluorescence in situ hybridization as a gold standard to assess trade-offs in single-cell RNA-sequencing data for detecting rare cell expression variability. We quantified the gene expression distribution for 26 genes that range from ubiquitous to rarely expressed and found that the correspondence between estimates across platforms improved with both transcriptome coverage and increased number of cells analyzed. Further, by characterizing the trade-off between transcriptome coverage and number of cells analyzed, we show that when the number of genes required to answer a given biological question is small, then greater transcriptome coverage is more important than analyzing large numbers of cells. More generally, our report provides guidelines for selecting quality thresholds for single-cell RNA-sequencing experiments aimed at rare cell analyses. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Lateralized Feeding Behavior is Associated with Asymmetrical Neuroanatomy and Lateralized Gene Expressions in the Brain in Scale-Eating Cichlid Fish

    PubMed Central

    Lee, Hyuk Je; Schneider, Ralf F; Manousaki, Tereza; Kang, Ji Hyoun; Lein, Etienne; Franchini, Paolo

    2017-01-01

    Lateralized behavior (“handedness”) is unusual, but consistently found across diverse animal lineages, including humans. It is thought to reflect brain anatomical and/or functional asymmetries, but its neuro-molecular mechanisms remain largely unknown. Lake Tanganyika scale-eating cichlid fish, Perissodus microlepis, show pronounced asymmetry in their jaw morphology as well as handedness in feeding behavior, biting scales preferentially only from one or the other side of their victims. This makes them an ideal model in which to investigate potential laterality in neuroanatomy and transcription in the brain in relation to behavioral handedness. After determining behavioral handedness in P. microlepis (preferred attack side), we estimated the volume of the hemispheres of brain regions and captured their gene expression profiles. Our analyses revealed that the degree of behavioral handedness is mirrored at the level of neuroanatomical asymmetry, particularly in the tectum opticum. Transcriptome analyses showed that different brain regions (tectum opticum, telencephalon, hypothalamus, and cerebellum) display distinct expression patterns, potentially reflecting their developmental interrelationships. For numerous genes in each brain region, their extent of expression differences between hemispheres was found to be correlated with the degree of behavioral lateralization. Interestingly, the tectum opticum and telencephalon showed divergent biases on the direction of up- or down-regulation of the laterality candidate genes (e.g., grm2) in the hemispheres, highlighting the connection of handedness with gene expression profiles and the different roles of these brain regions. Hence, handedness in predation behavior may be caused by asymmetric size of brain hemispheres and also by lateralized gene expressions in the brain. PMID:29069363

  12. Lateralized Feeding Behavior is Associated with Asymmetrical Neuroanatomy and Lateralized Gene Expressions in the Brain in Scale-Eating Cichlid Fish.

    PubMed

    Lee, Hyuk Je; Schneider, Ralf F; Manousaki, Tereza; Kang, Ji Hyoun; Lein, Etienne; Franchini, Paolo; Meyer, Axel

    2017-11-01

    Lateralized behavior ("handedness") is unusual, but consistently found across diverse animal lineages, including humans. It is thought to reflect brain anatomical and/or functional asymmetries, but its neuro-molecular mechanisms remain largely unknown. Lake Tanganyika scale-eating cichlid fish, Perissodus microlepis, show pronounced asymmetry in their jaw morphology as well as handedness in feeding behavior, biting scales preferentially only from one or the other side of their victims. This makes them an ideal model in which to investigate potential laterality in neuroanatomy and transcription in the brain in relation to behavioral handedness. After determining behavioral handedness in P. microlepis (preferred attack side), we estimated the volume of the hemispheres of brain regions and captured their gene expression profiles. Our analyses revealed that the degree of behavioral handedness is mirrored at the level of neuroanatomical asymmetry, particularly in the tectum opticum. Transcriptome analyses showed that different brain regions (tectum opticum, telencephalon, hypothalamus, and cerebellum) display distinct expression patterns, potentially reflecting their developmental interrelationships. For numerous genes in each brain region, their extent of expression differences between hemispheres was found to be correlated with the degree of behavioral lateralization. Interestingly, the tectum opticum and telencephalon showed divergent biases on the direction of up- or down-regulation of the laterality candidate genes (e.g., grm2) in the hemispheres, highlighting the connection of handedness with gene expression profiles and the different roles of these brain regions. Hence, handedness in predation behavior may be caused by asymmetric size of brain hemispheres and also by lateralized gene expressions in the brain. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. Efficient production of human acidic fibroblast growth factor in pea (Pisum sativum L.) plants by agroinfection of germinated seeds

    PubMed Central

    2011-01-01

    Background: For efficient and large-scale production of recombinant proteins in plants, transient expression by agroinfection has a number of advantages over stable transformation. Simple manipulation, rapid analysis and high expression efficiency are possible. In pea, Pisum sativum, a Virus Induced Gene Silencing System using the pea early browning virus has been converted into an efficient agroinfection system by converting the two RNA genomes of the virus into binary expression vectors for Agrobacterium transformation. Results: By vacuum infiltration (0.08 MPa, 1 min) of germinating pea seeds with 2-3 cm roots with Agrobacteria carrying the binary vectors, expression of the gene for Green Fluorescent Protein as marker and the gene for the human acidic fibroblast growth factor (aFGF) was obtained in 80% of the infiltrated developing seedlings. Maximal production of the recombinant proteins was achieved 12-15 days after infiltration. Conclusions: Compared to the leaf injection method, vacuum infiltration of germinated seeds is highly efficient, allowing large-scale production of plants transiently expressing recombinant proteins. The production cycle of plants for harvesting the recombinant protein was shortened from 30 days for leaf injection to 15 days by applying vacuum infiltration. The synthesized aFGF was purified by heparin-affinity chromatography and its mitogenic activity on NIH 3T3 cells confirmed to be similar to a commercial product. PMID:21548923

  14. Gene expression profiling of single cells on large-scale oligonucleotide arrays

    PubMed Central

    Hartmann, Claudia H.; Klein, Christoph A.

    2006-01-01

    Over the last decade, important insights into the regulation of cellular responses to various stimuli were gained by global gene expression analyses of cell populations. More recently, specific cell functions and underlying regulatory networks of rare cells isolated from their natural environment moved to the center of attention. However, low cell numbers still hinder gene expression profiling of rare ex vivo material in biomedical research. Therefore, we developed a robust method for gene expression profiling of single cells on high-density oligonucleotide arrays with excellent coverage of low abundance transcripts. The protocol was extensively tested with freshly isolated single cells of very low mRNA content including single epithelial, mature and immature dendritic cells and hematopoietic stem cells. Quantitative PCR confirmed that the PCR-based global amplification method did not change the relative ratios of transcript abundance and unsupervised hierarchical cluster analysis revealed that the histogenetic origin of an individual cell is correctly reflected by the gene expression profile. Moreover, the gene expression data from dendritic cells demonstrate that cellular differentiation and pathway activation can be monitored in individual cells. PMID:17071717

  15. Emory University: High-Throughput Protein-Protein Interaction Dataset for Lung Cancer-Associated Genes | Office of Cancer Genomics

    Cancer.gov

    To discover novel PPI signaling hubs for lung cancer, CTD2 Center at Emory utilized large-scale genomics datasets and literature to compile a set of lung cancer-associated genes. A library of expression vectors were generated for these genes and utilized for detecting pairwise PPIs with cell lysate-based TR-FRET assays in high-throughput screening format. Read the abstract.

  16. Expression Atlas: gene and protein expression across multiple studies and organisms

    PubMed Central

    Tang, Y Amy; Bazant, Wojciech; Burke, Melissa; Fuentes, Alfonso Muñoz-Pomer; George, Nancy; Koskinen, Satu; Mohammed, Suhaib; Geniza, Matthew; Preece, Justin; Jarnuczak, Andrew F; Huber, Wolfgang; Stegle, Oliver; Brazma, Alvis; Petryszak, Robert

    2018-01-01

    Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions. PMID:29165655

  17. Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates

    PubMed Central

    Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

    2009-01-01

    Background: Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. Results: We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. Conclusion: These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of the Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes. PMID:19138430

  18. Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates.

    PubMed

    Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

    2009-01-12

    Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes.

  19. TOXICOGENOMICS DRUG DISCOVERY AND THE PATHOLOGIST

    EPA Science Inventory

    Toxicogenomics, drug discovery, and pathologist.

    The field of toxicogenomics, which currently focuses on the application of large-scale differential gene expression (DGE) data to toxicology, is starting to influence drug discovery and development in the pharmaceutical indu...

  20. A Normalization-Free and Nonparametric Method Sharpens Large-Scale Transcriptome Analysis and Reveals Common Gene Alteration Patterns in Cancers.

    PubMed

    Li, Qi-Gang; He, Yong-Han; Wu, Huan; Yang, Cui-Ping; Pu, Shao-Yan; Fan, Song-Qing; Jiang, Li-Ping; Shen, Qiu-Shuo; Wang, Xiao-Xiong; Chen, Xiao-Qiong; Yu, Qin; Li, Ying; Sun, Chang; Wang, Xiangting; Zhou, Jumin; Li, Hai-Peng; Chen, Yong-Bin; Kong, Qing-Peng

    2017-01-01

    Heterogeneity in transcriptional data hampers the identification of differentially expressed genes (DEGs) and understanding of cancer, essentially because current methods rely on cross-sample normalization and/or distribution assumptions, both of which are sensitive to heterogeneous values. Here, we developed a new method, Cross-Value Association Analysis (CVAA), which overcomes this limitation and is more robust to heterogeneous data than the other methods. Applying CVAA to a more complex pan-cancer dataset containing 5,540 transcriptomes discovered numerous new DEGs and many previously rarely explored pathways/processes; some of them were validated, both in vitro and in vivo, to be crucial in tumorigenesis, e.g., alcohol metabolism (ADH1B), chromosome remodeling (NCAPH) and complement system (Adipsin). Together, we present a sharper tool to navigate large-scale expression data and gain new mechanistic insights into tumorigenesis.

  1. Comparison of gene expression changes induced by biguanides in db/db mice liver.

    PubMed

    Heishi, Masayuki; Hayashi, Koji; Ichihara, Junji; Ishikawa, Hironori; Kawamura, Takao; Kanaoka, Masaharu; Taiji, Mutsuo; Kimura, Toru

    2008-08-01

    Large-scale clinical studies have shown the biguanide drug metformin, widely used for type 2 diabetes, to be very safe. By contrast, another biguanide, phenformin, has been withdrawn from major markets because of a high incidence of serious adverse effects. The difference in mode of action between the two biguanides remains unclear. To gain insight into the different modes of action of the two drugs, we performed global gene expression profiling using the livers of obese diabetic db/db mice after a single administration of phenformin or metformin at levels sufficient to cause a significant reduction in blood glucose level. Metformin induced modest expression changes, including G6pc in the liver as previously reported. By contrast, phenformin caused changes in expression level of many additional genes. We used a knowledge-based bioinformatic analysis to study the effects of phenformin. Differentially expressed genes identified in this study constitute a large gene network, which may be related to cell death, inflammation or wound response. Our results suggest that the two biguanides show a similar hypoglycemic effect in db/db mice, but phenformin induces a greater stress on the liver even a short time after a single administration. These findings provide a novel insight into the cause of the relatively high occurrence of serious adverse effects after phenformin treatment.

  2. Concordant integrative gene set enrichment analysis of multiple large-scale two-sample expression data sets.

    PubMed

    Lai, Yinglei; Zhang, Fanni; Nayak, Tapan K; Modarres, Reza; Lee, Norman H; McCaffrey, Timothy A

    2014-01-01

    Gene set enrichment analysis (GSEA) is an important approach to the analysis of coordinate expression changes at a pathway level. Although many statistical and computational methods have been proposed for GSEA, the issue of a concordant integrative GSEA of multiple expression data sets has not been well addressed. Among different related data sets collected for the same or similar study purposes, it is important to identify pathways or gene sets with concordant enrichment. We categorize the underlying true states of differential expression into three representative categories: no change, positive change and negative change. Due to data noise, what we observe from experiments may not indicate the underlying truth. Although these categories are not observed in practice, they can be considered in a mixture model framework. Then, we define the mathematical concept of concordant gene set enrichment and calculate its related probability based on a three-component multivariate normal mixture model. The related false discovery rate can be calculated and used to rank different gene sets. We used three published lung cancer microarray gene expression data sets to illustrate our proposed method. One analysis based on the first two data sets was conducted to compare our result with a previous published result based on a GSEA conducted separately for each individual data set. This comparison illustrates the advantage of our proposed concordant integrative gene set enrichment analysis. Then, with a relatively new and larger pathway collection, we used our method to conduct an integrative analysis of the first two data sets and also all three data sets. Both results showed that many gene sets could be identified with low false discovery rates. A consistency between both results was also observed. A further exploration based on the KEGG cancer pathway collection showed that a majority of these pathways could be identified by our proposed method. 
    This study illustrates that we can improve detection power and discovery consistency through a concordant integrative analysis of multiple large-scale two-sample gene expression data sets.
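
    To make the three-state idea in the abstract above concrete, the following Python sketch computes posterior probabilities for the states negative change, no change and positive change from a single test score. It is a univariate simplification with fixed, assumed parameters; the abstract describes a three-component multivariate normal mixture whose parameters would be estimated from data, so this is not the authors' procedure. The weights, means and standard deviations are illustrative assumptions.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Assumed (weight, mean, sd) for each latent state. These values are
# illustrative only; in practice they would be fitted to the data.
COMPONENTS = {
    "negative": (0.1, -2.0, 1.0),
    "no_change": (0.8, 0.0, 1.0),
    "positive": (0.1, 2.0, 1.0),
}


def posterior(score, components=COMPONENTS):
    """Posterior probability of each state given one observed score (Bayes' rule)."""
    weighted = {
        name: weight * normal_pdf(score, mean, sd)
        for name, (weight, mean, sd) in components.items()
    }
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


probabilities = posterior(4.0)
print(max(probabilities, key=probabilities.get))
```

    A score near zero is assigned mostly to the no-change state, while a strongly positive score is assigned mostly to the positive-change state, which reflects the point in the abstract that the observed data are noisy indicators of an unobserved true state.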

  3. Genetic Approaches to Study Meiosis and Meiosis-Specific Gene Expression in Saccharomyces cerevisiae.

    PubMed

    Kassir, Yona; Stuart, David T

    2017-01-01

    The budding yeast Saccharomyces cerevisiae has a long history as a model organism for studies of meiosis and the cell cycle. The popularity of this yeast as a model is in large part due to the variety of genetic and cytological approaches that can be effectively performed with the cells. Cultures of the cells can be induced to synchronously progress through meiosis and sporulation, allowing large-scale gene expression and biochemical studies to be performed. Additionally, the spore tetrads resulting from meiosis make it possible to characterize the haploid products of meiosis, allowing investigation of meiotic recombination and chromosome segregation. Here we describe genetic methods for analyzing the progression of S. cerevisiae through meiosis and sporulation, with an emphasis on strategies for the genetic analysis of regulators of meiosis-specific genes.

  4. Activity-based protein profiling for biochemical pathway discovery in cancer

    PubMed Central

    Nomura, Daniel K.; Dix, Melissa M.; Cravatt, Benjamin F.

    2011-01-01

    Large-scale profiling methods have uncovered numerous gene and protein expression changes that correlate with tumorigenesis. However, determining the relevance of these expression changes and which biochemical pathways they affect has been hindered by our incomplete understanding of the proteome and its myriad functions and modes of regulation. Activity-based profiling platforms enable both the discovery of cancer-relevant enzymes and selective pharmacological probes to perturb and characterize these proteins in tumour cells. When integrated with other large-scale profiling methods, activity-based proteomics can provide insight into the metabolic and signalling pathways that support cancer pathogenesis and illuminate new strategies for disease diagnosis and treatment. PMID:20703252

  5. Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering.

    PubMed

    Sun, Peng; Speicher, Nora K; Röttger, Richard; Guo, Jiong; Baumbach, Jan

    2014-05-01

    The explosion of biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as 'simultaneous clustering' or 'co-clustering', has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: 'Bi-Force'. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expression data. Brief. Bioinform., 14:279-292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
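
    As an illustration of the weighted bicluster editing model named in the abstract above, the Python sketch below scores one candidate biclustering. It assumes the common formulation in which a row-column pair counts as an edge when its similarity exceeds a threshold, deleting an edge costs the amount by which similarity exceeds the threshold, and inserting a missing edge costs the shortfall. It only evaluates a given assignment and does not reproduce the Bi-Force search heuristic; the function name, matrix and threshold are hypothetical.

```python
def bicluster_editing_cost(similarity, threshold, row_labels, col_labels):
    """Cost of turning the thresholded bipartite graph into disjoint bicliques.

    similarity[i][j] is the similarity between row entity i and column entity j.
    row_labels[i] and col_labels[j] give the bicluster each entity is assigned to.
    """
    cost = 0.0
    for i, row in enumerate(similarity):
        for j, value in enumerate(row):
            same_bicluster = row_labels[i] == col_labels[j]
            if same_bicluster and value < threshold:
                # Edge is missing inside a bicluster: pay to insert it.
                cost += threshold - value
            elif not same_bicluster and value > threshold:
                # Edge crosses two biclusters: pay to delete it.
                cost += value - threshold
    return cost


# Illustrative 3 x 3 similarity matrix (rows could be genes, columns conditions).
similarity = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.0, 0.1, 0.9],
]

two_biclusters = bicluster_editing_cost(similarity, 0.5, [0, 0, 1], [0, 0, 1])
one_bicluster = bicluster_editing_cost(similarity, 0.5, [0, 0, 0], [0, 0, 0])
print(two_biclusters < one_bicluster)
```

    Under these assumptions a heuristic would search over label assignments for one with low total cost; here the two-bicluster assignment needs only one inserted edge, whereas merging everything into one bicluster requires several.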

  6. Hidden among the crowd: differential DNA methylation-expression correlations in cancer occur at important oncogenic pathways.

    PubMed

    Mosquera Orgueira, Adrián

    2015-01-01

    DNA methylation is a frequent epigenetic mechanism that participates in transcriptional repression. Variations in DNA methylation with respect to gene expression are constant, and, for unknown reasons, some genes with highly methylated promoters are sometimes overexpressed. In this study we have analyzed the expression and methylation patterns of thousands of genes in five groups of cancer and normal tissue samples in order to determine local and genome-wide differences. We observed significant changes in global methylation-expression correlation in all the neoplasms, which suggests that differential correlation events are frequent in cancer. A focused analysis in the breast cancer cohort identified 1662 genes whose correlation varies significantly between normal and cancerous breast, but whose DNA methylation and gene expression patterns do not change substantially. These genes were enriched in cancer-related pathways and repressive chromatin features across various model cell lines, such as PRC2 binding and H3K27me3 marks. Substantial changes in methylation-expression correlation indicate that these genes are subject to epigenetic remodeling, where the differential activity of other factors break the expected relationship between both variables. Our findings suggest a complex regulatory landscape where a redistribution of local and large-scale chromatin repressive domains at differentially correlated genes (DCGs) creates epigenetic hotspots that modulate cancer-specific gene expression.

  7. Highly specific gene silencing in a monocot species by artificial microRNAs derived from chimeric miRNA precursors

    DOE PAGES

    Carbonell, Alberto; Fahlgren, Noah; Mitchell, Skyler; ...

    2015-05-20

    Artificial microRNAs (amiRNAs) are used for selective gene silencing in plants. However, current methods to produce amiRNA constructs for silencing transcripts in monocot species are not suitable for simple, cost-effective and large-scale synthesis. Here, a series of expression vectors based on Oryza sativa MIR390 (OsMIR390) precursor was developed for high-throughput cloning and high expression of amiRNAs in monocots. Four different amiRNA sequences designed to specifically target endogenous genes and expressed from OsMIR390-based vectors were validated in transgenic Brachypodium distachyon plants. Surprisingly, amiRNAs accumulated to higher levels and were processed more accurately when expressed from chimeric OsMIR390-based precursors that include distal stem-loop sequences from Arabidopsis thaliana MIR390a (AtMIR390a). In all cases, transgenic plants displayed the predicted phenotypes induced by target gene repression, and accumulated high levels of amiRNAs and low levels of the corresponding target transcripts. Genome-wide transcriptome profiling combined with 5'-RLM-RACE analysis in transgenic plants confirmed that amiRNAs were highly specific. Significance Statement: A series of amiRNA vectors based on Oryza sativa MIR390 (OsMIR390) precursor were developed for simple, cost-effective and large-scale synthesis of amiRNA constructs to silence genes in monocots. Unexpectedly, amiRNAs produced from chimeric OsMIR390-based precursors including Arabidopsis thaliana MIR390a distal stem-loop sequences accumulated elevated levels of highly effective and specific amiRNAs in transgenic Brachypodium distachyon plants.

  8. Microarray analysis identifies candidate genes for key roles in coral development

    PubMed Central

    Grasso, Lauretta C; Maindonald, John; Rudd, Stephen; Hayward, David C; Saint, Robert; Miller, David J; Ball, Eldon E

    2008-01-01

    Background Anthozoan cnidarians are amongst the simplest animals at the tissue level of organization, but are surprisingly complex and vertebrate-like in terms of gene repertoire. As major components of tropical reef ecosystems, the stony corals are anthozoans of particular ecological significance. To better understand the molecular bases of both cnidarian development in general and coral-specific processes such as skeletogenesis and symbiont acquisition, microarray analysis was carried out through the period of early development – when skeletogenesis is initiated, and symbionts are first acquired. Results Of 5081 unique peptide coding genes, 1084 were differentially expressed (P ≤ 0.05) in comparisons between four different stages of coral development, spanning key developmental transitions. Genes of likely relevance to the processes of settlement, metamorphosis, calcification and interaction with symbionts were characterised further and their spatial expression patterns investigated using whole-mount in situ hybridization. Conclusion This study is the first large-scale investigation of developmental gene expression for any cnidarian, and has provided candidate genes for key roles in many aspects of coral biology, including calcification, metamorphosis and symbiont uptake. One surprising finding is that some of these genes have clear counterparts in higher animals but are not present in the closely-related sea anemone Nematostella. Secondly, coral-specific processes (i.e. traits which distinguish corals from their close relatives) may be analogous to similar processes in distantly related organisms. This first large-scale application of microarray analysis demonstrates the potential of this approach for investigating many aspects of coral biology, including the effects of stress and disease. PMID:19014561

  9. MacroBac: New Technologies for Robust and Efficient Large-Scale Production of Recombinant Multiprotein Complexes.

    PubMed

    Gradia, Scott D; Ishida, Justin P; Tsai, Miaw-Sheue; Jeans, Chris; Tainer, John A; Fuss, Jill O

    2017-01-01

    Recombinant expression of large, multiprotein complexes is essential and often rate limiting for determining structural, biophysical, and biochemical properties of DNA repair, replication, transcription, and other key cellular processes. Baculovirus-infected insect cell expression systems are especially well suited for producing large, human proteins recombinantly, and multigene baculovirus systems have facilitated studies of multiprotein complexes. In this chapter, we describe a multigene baculovirus system called MacroBac that uses a Biobricks-type assembly method based on restriction and ligation (Series 11) or ligation-independent cloning (Series 438). MacroBac cloning and assembly is efficient and equally well suited for either single subcloning reactions or high-throughput cloning using 96-well plates and liquid handling robotics. MacroBac vectors are polypromoter with each gene flanked by a strong polyhedrin promoter and an SV40 poly(A) termination signal that minimize gene order expression level effects seen in many polycistronic assemblies. Large assemblies are robustly achievable, and we have successfully assembled as many as 10 genes into a single MacroBac vector. Importantly, we have observed significant increases in expression levels and quality of large, multiprotein complexes using a single, multigene, polypromoter virus rather than coinfection with multiple, single-gene viruses. Given the importance of characterizing functional complexes, we believe that MacroBac provides a critical enabling technology that may change the way that structural, biophysical, and biochemical research is done. © 2017 Elsevier Inc. All rights reserved.

  10. Genotype by watering regime interaction in cultivated tomato: lessons from linkage mapping and gene expression.

    PubMed

    Albert, Elise; Gricourt, Justine; Bertin, Nadia; Bonnefoi, Julien; Pateyron, Stéphanie; Tamby, Jean-Philippe; Bitton, Frédérique; Causse, Mathilde

    2016-02-01

    In tomato, genotype by watering interaction resulted from genotype re-ranking more than scale changes. Interactive QTLs according to watering regime were detected. Differentially expressed genes were identified in some intervals. As a result of climate change, drought will increasingly limit crop production in the future. Studying genotype by watering regime interactions is necessary to improve plant adaptation to low water availability. In cultivated tomato (Solanum lycopersicum L.), extensively grown in dry areas, well-mastered water deficits can stimulate metabolite production, increasing both plant defenses and the concentration of compounds involved in fruit quality. However, few tomato Quantitative Trait Loci (QTLs) and genes involved in the response to drought have been identified, or they have been identified only in wild species. In this study, we phenotyped a population of 119 recombinant inbred lines derived from a cross between a cherry tomato and a large fruit tomato, grown in a greenhouse under two watering regimes, in two locations. A large genetic variability was measured for 19 plant and fruit traits, under the two watering treatments. Highly significant genotype by watering regime interactions were detected and resulted from re-ranking more than scale changes. The population was genotyped for 679 SNP markers to develop a genetic map. In total, 56 QTLs were identified, among which 11 were interactive between watering regimes. The latter mainly exhibited antagonistic effects according to watering treatment. Variation in gene expression in leaves of parental accessions revealed 2259 differentially expressed genes, among which candidate genes presenting sequence polymorphisms were identified under two main interactive QTLs. Our results provide knowledge about the genetic control of genotype by watering regime interactions in cultivated tomato and the possible use of deficit irrigation to improve tomato quality.

  11. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed Central

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-01-01

    Background Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Results Expression of sex-chromosome genes was used as a true internal biological control to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. Conclusion In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes. Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects. PMID:12962547
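
    The abstract above contrasts a false discovery rate procedure with the Bonferroni method for multiple testing. The Python sketch below compares the two on made-up p-values. It assumes the Benjamini-Hochberg step-up procedure as the FDR method; the abstract does not name the specific technique used, so this choice and the p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k with p_(k) <= alpha * k / m and
    reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            largest_k = rank
    rejected = [False] * m
    for index in order[:largest_k]:
        rejected[index] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Reject only p-values at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Made-up p-values for ten hypothetical probesets.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

bh_rejected = benjamini_hochberg(p_values)
bonferroni_rejected = bonferroni(p_values)
print(sum(bh_rejected), sum(bonferroni_rejected))
```

    On this example the FDR procedure rejects two hypotheses and Bonferroni rejects one, which illustrates why an FDR-controlling method is often preferred when many genes are tested at once and false negatives are a concern.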

  12. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-09-08

    Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Expression of sex-chromosome genes was used as a true internal biological control to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes. Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects.

  13. Hepatic gene expression patterns following trauma-hemorrhage: effect of posttreatment with estrogen.

    PubMed

    Yu, Huang-Ping; Pang, See-Tong; Chaudry, Irshad H

    2013-01-01

    The aim of this study was to examine the role of estrogen on hepatic gene expression profiles at an early time point following trauma-hemorrhage in rats. Groups of injured and sham controls receiving estrogen or vehicle were killed 2 h after injury and resuscitation, and liver tissue was harvested. Complementary RNA was synthesized from each RNA sample and hybridized to microarrays. A large number of genes were differentially expressed at the 2-h time point in injured animals with or without estrogen treatment. The upregulation or downregulation of a cohort of 14 of these genes was validated by reverse transcription-polymerase chain reaction. This large-scale microarray analysis shows that at the 2-h time point, there is marked alteration in hepatic gene expression following trauma-hemorrhage. However, estrogen treatment attenuated these changes in injured animals. Pathway analysis demonstrated predominant changes in the expression of genes involved in metabolism, immunity, and apoptosis. Upregulation of low-density lipoprotein receptor, protein phosphatase 1, regulatory subunit 3C, ring-finger protein 11, pyroglutamyl-peptidase I, bactericidal/permeability-increasing protein, integrin, αD, BCL2-like 11, leukemia inhibitory factor receptor, ATPase, Cu transporting, α polypeptide, and Mk1 protein was found in estrogen-treated trauma-hemorrhaged animals. Thus, estrogen produces hepatoprotection following trauma-hemorrhage likely via antiapoptosis and improving/restoring metabolism and immunity pathways.

  14. Applications of Proteomic Technologies to Toxicology

    EPA Science Inventory

    Proteomics is the large-scale study of gene expression at the protein level. This cutting edge technology has been extensively applied to toxicology research recently. The up-to-date development of proteomics has presented the toxicology community with an unprecedented opportunit...

  15. Macro optical projection tomography for large scale 3D imaging of plant structures and gene activity

    PubMed Central

    Lee, Karen J. I.; Calder, Grant M.; Hindle, Christopher R.; Newman, Jacob L.; Robinson, Simon N.; Avondo, Jerome J. H. Y.

    2017-01-01

    Abstract Optical projection tomography (OPT) is a well-established method for visualising gene activity in plants and animals. However, a limitation of conventional OPT is that the specimen upper size limit precludes its application to larger structures. To address this problem we constructed a macro version called Macro OPT (M-OPT). We apply M-OPT to 3D live imaging of gene activity in growing whole plants and to visualise structural morphology in large optically cleared plant and insect specimens up to 60 mm tall and 45 mm deep. We also show how M-OPT can be used to image gene expression domains in 3D within fixed tissue and to visualise gene activity in 3D in clones of growing young whole Arabidopsis plants. A further application of M-OPT is to visualise plant-insect interactions. Thus M-OPT provides an effective 3D imaging platform that allows the study of gene activity, internal plant structures and plant-insect interactions at a macroscopic scale. PMID:28025317

  16. The large-scale investigation of gene expression in Leymus chinensis stigmas provides a valuable resource for understanding the mechanisms of poaceae self-incompatibility.

    PubMed

    Zhou, Qingyuan; Jia, Junting; Huang, Xing; Yan, Xueqing; Cheng, Liqin; Chen, Shuangyan; Li, Xiaoxia; Peng, Xianjun; Liu, Gongshe

    2014-05-26

    Many Poaceae species show a gametophytic self-incompatibility (GSI) system, which is controlled by at least two independent and multiallelic loci, S and Z. Until now, the gene products for S and Z were unknown. Stigmas of SI grasses discriminate between pollen grains that land on their surface, supporting compatible pollen tube growth and penetration into the stigma while recognizing incompatible pollen and thus inhibiting pollination. Leymus chinensis (Trin.) Tzvel. (sheepgrass) is a Poaceae SI species. A comprehensive analysis of the sheepgrass stigma transcriptome may provide valuable information for understanding the mechanism of pollen-stigma interactions and grass SI. The transcript abundance profiles of mature stigmas, mature ovaries and leaves were examined using high-throughput next generation sequencing technology. A comparative transcriptomic analysis of these tissues identified 1,025 specifically or preferentially expressed genes in sheepgrass stigmas. These genes contained a significant proportion of genes predicted to function in cell-cell communication and signal transduction. We identified 111 putative transcription factor (TF) genes, and the most abundant groups were MYB, C2H2, C3H, FAR1 and MADS. Comparative analysis of the sheepgrass, rice and Arabidopsis stigma-specific or preferential datasets showed broad similarities and some differences in the proportion of genes in the Gene Ontology (GO) functional categories. Potential SI candidate genes identified in other grasses were also detected in the sheepgrass stigma-specific or preferential dataset. Quantitative real-time PCR experiments validated the expression pattern of stigma preferential genes including homologous grass SI candidate genes. This study represents the first large-scale investigation of gene expression in the stigmas of an SI grass species. We uncovered many notable genes that are potentially involved in pollen-stigma interactions and SI mechanisms, including genes encoding receptor-like protein kinases (RLK), CBL (calcineurin B-like proteins) interacting protein kinases, calcium-dependent protein kinase, expansins, pectinesterase, peroxidases and various transcription factors. The availability of a pool of stigma-specific or preferential genes for L. chinensis offers an opportunity to elucidate the mechanisms of SI in Poaceae.

  17. Gene expression signature of cerebellar hypoplasia in a mouse model of Down syndrome during postnatal development

    PubMed Central

    Laffaire, Julien; Rivals, Isabelle; Dauphinot, Luce; Pasteau, Fabien; Wehrle, Rosine; Larrat, Benoit; Vitalis, Tania; Moldrich, Randal X; Rossier, Jean; Sinkus, Ralph; Herault, Yann; Dusart, Isabelle; Potier, Marie-Claude

    2009-01-01

    Background: Down syndrome is a chromosomal disorder caused by the presence of three copies of chromosome 21. The mechanisms by which this aneuploidy produces the complex and variable phenotype observed in people with Down syndrome are still under discussion. Recent studies have demonstrated an increased transcript level of the three-copy genes with some dosage compensation or amplification for a subset of them. The impact of this gene dosage effect on the whole transcriptome is still debated and longitudinal studies assessing the variability among samples, tissues and developmental stages are needed. Results: We thus designed a large-scale gene expression study in mice (the Ts1Cje Down syndrome mouse model) in which we could measure the effects of trisomy 21 on a large number of samples (74 in total) in a tissue that is affected in Down syndrome (the cerebellum) and where we could quantify the defect during postnatal development in order to correlate gene expression changes to the phenotype observed. Statistical analysis of microarray data revealed a major gene dosage effect for the three-copy genes as well as for a 2 Mb segment from mouse chromosome 12 that we show for the first time to be deleted in the Ts1Cje mice. This gene dosage effect impacts moderately on the expression of euploid genes (2.4 to 7.5% differentially expressed). Only 13 genes were significantly dysregulated in Ts1Cje mice at all four postnatal development stages studied from birth to 10 days after birth, and among them are 6 three-copy genes. The decrease in granule cell proliferation demonstrated in newborn Ts1Cje cerebellum was correlated with a major gene dosage effect on the transcriptome in dissected cerebellar external granule cell layer. Conclusion: High-throughput gene expression analysis in the cerebellum of a large number of samples of Ts1Cje and euploid mice has revealed a prevailing gene dosage effect on triplicated genes.
Moreover, in an enriched cell population that is thought to be responsible for the cerebellar hypoplasia in Down syndrome, a global destabilization of gene expression was not detected. Altogether these results strongly suggest that the three-copy genes are directly responsible for the phenotype present in cerebellum. We provide here a short list of candidate genes. PMID:19331679

  18. Distal-less regulates eyespot patterns and melanization in Bicyclus butterflies.

    PubMed

    Monteiro, Antónia; Chen, Bin; Ramos, Diane M; Oliver, Jeffrey C; Tong, Xiaoling; Guo, Min; Wang, Wen-Kai; Fazzino, Lisa; Kamal, Firdous

    2013-07-01

    Butterfly eyespots represent novel complex traits that display substantial diversity in number and size within and across species. Correlative gene expression studies have implicated a large suite of transcription factors, including Distal-less (Dll), Engrailed (En), and Spalt (Sal), in eyespot development in butterflies, but direct evidence testing the function of any of these proteins is still missing. Here we show that the characteristic two-eyespot pattern of wildtype Bicyclus anynana forewings is correlated with dynamic progression of Dll, En, and Sal expression in larval wings from four spots to two spots, whereas no such decline in gene expression ensues in a four-eyespot mutant. We then conduct transgenic experiments testing whether over-expression of any of these genes in a wild-type genetic background is sufficient to induce eyespot differentiation in these pre-patterned wing compartments. We also produce a Dll-RNAi transgenic line to test how Dll down-regulation affects eyespot development. Finally, we test how ectopic expression of these genes during the pupal stages of development alters adult color patterns. We show that over-expressing Dll in larvae is sufficient to induce the differentiation of additional eyespots and increase the size of eyespots, whereas down-regulating Dll leads to a decrease in eyespot size. Furthermore, ectopic expression of Dll in the early pupal wing led to the appearance of ectopic patches of black scales. We conclude that Dll is a positive regulator of focal differentiation and eyespot signaling and that this gene is also a possible selector gene for scale melanization in butterflies. Copyright © 2013 Wiley Periodicals, Inc.

  19. Removing Batch Effects from Longitudinal Gene Expression - Quantile Normalization Plus ComBat as Best Approach for Microarray Transcriptome Data

    PubMed Central

    Müller, Christian; Schillert, Arne; Röthemeier, Caroline; Trégouët, David-Alexandre; Proust, Carole; Binder, Harald; Pfeiffer, Norbert; Beutel, Manfred; Lackner, Karl J.; Schnabel, Renate B.; Tiret, Laurence; Wild, Philipp S.; Blankenberg, Stefan

    2016-01-01

    Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore mandatory to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, although they have not been evaluated in large-scale longitudinal gene expression data. In this study, we aimed at identifying a suitable method for batch effect removal in a large study of microarray-based longitudinal gene expression. Monocytic gene expression was measured in 1092 participants of the Gutenberg Health Study at baseline and 5-year follow up. Replicates of selected samples were measured at both time points to identify technical variability. Deming regression, Passing-Bablok regression, linear mixed models, non-linear models as well as ReplicateRUV and ComBat were applied to eliminate batch effects between replicates. In a second step, quantile normalization prior to batch effect correction was performed for each method. Technical variation between batches was evaluated by principal component analysis. Associations between body mass index and transcriptomes were calculated before and after batch removal. Results from association analyses were compared to evaluate maintenance of biological variability. Quantile normalization, separately performed in each batch, combined with ComBat successfully reduced batch effects and maintained biological variability. ReplicateRUV performed perfectly in the replicate data subset of the study, but failed when applied to all samples. All other methods did not substantially reduce batch effects in the replicate data subset. Quantile normalization plus ComBat appears to be a valuable approach for batch correction in longitudinal gene expression data. PMID:27272489
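
    The first half of the approach named in this record, quantile normalization performed separately in each batch, can be sketched as below. This is a minimal illustration and not the authors' pipeline: the genes-by-samples layout, the function names and the `batch` label vector are assumptions, ties are broken by sort order rather than averaged, and the ComBat step that would follow is not shown.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize a genes x samples matrix.

    Every sample (column) is mapped onto the same reference distribution,
    taken here as the mean of the sorted columns. Ties are broken by
    sort order, which is a simplification.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                 # sort order within each sample
    ranks = np.argsort(order, axis=0)             # rank of each gene within its sample
    reference = np.sort(x, axis=0).mean(axis=1)   # mean of the k-th smallest values
    return reference[ranks]                       # look up reference value by rank

def quantile_normalize_by_batch(x, batch):
    """Apply quantile normalization independently inside each batch.

    `batch` is a hypothetical vector with one batch label per sample.
    """
    x = np.asarray(x, dtype=float)
    batch = np.asarray(batch)
    out = np.empty_like(x)
    for b in np.unique(batch):
        cols = np.flatnonzero(batch == b)         # samples belonging to batch b
        out[:, cols] = quantile_normalize(x[:, cols])
    return out
```

    After this step all samples within a batch share one empirical distribution, while differences between batches remain for a batch-correction method such as ComBat to address.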

  20. Improved luciferase gene expression using ultrasound targeted microbubble destruction therapy in swine

    NASA Astrophysics Data System (ADS)

    Noble, Misty L.; Song, Shuxian; Sun, Ryan R.; Fan, Luping; DiBlasi, Robert M.; O'Kelly-Priddy, Colleen; Loeb, Keith R.; Miao, Carol H.

    2012-11-01

    Ultrasound (US) targeted microbubble (MB) destruction (UTMD) has been shown to be an effective method for delivering drugs and plasmid DNA (pDNA) into cells. We previously reported successful gene transfection of a reporter luciferase gene, pGL4, into livers of mice and rats using UTMD. The challenge is to translate and achieve similar gene expression in large animals, like swine, where the treated tissue volume is substantially larger. The scale-up study requires a proportionally increased amount of pDNA/MBs delivered to tissues and an equivalent increase in US energy. We use different MBs and surgical strategies to retain most of the pDNA/MBs locally during US application in order to maximize the effect of UTMD in gene transfection. Our results show a significant increase in luciferase expression in swine injected with MBs and exposed to 2.7 MPa US. We obtained up to 1800-fold enhancement in the pig experiment using Definity® MBs, and 2000-fold and 6300-fold enhancement in two pig studies using RN18 MBs compared to sham. These results represent an important developmental step towards US mediated gene delivery in large animals and clinical trials.

  1. Engineering of Baeyer-Villiger monooxygenase-based Escherichia coli biocatalyst for large scale biotransformation of ricinoleic acid into (Z)-11-(heptanoyloxy)undec-9-enoic acid

    PubMed Central

    Seo, Joo-Hyun; Kim, Hwan-Hee; Jeon, Eun-Yeong; Song, Young-Ha; Shin, Chul-Soo; Park, Jin-Byung

    2016-01-01

    Baeyer-Villiger monooxygenases (BVMOs) are able to catalyze regiospecific Baeyer-Villiger oxygenation of a variety of cyclic and linear ketones to generate the corresponding lactones and esters, respectively. However, the enzymes are usually difficult to express in a functional form in microbial cells and are rather unstable under process conditions, hindering their large-scale applications. Therefore, we investigated engineering of the BVMO from Pseudomonas putida KT2440 and the gene expression system to improve its activity and stability for large-scale biotransformation of ricinoleic acid (1) into the ester (i.e., (Z)-11-(heptanoyloxy)undec-9-enoic acid) (3), which can be hydrolyzed into 11-hydroxyundec-9-enoic acid (5) (i.e., a precursor of polyamide-11) and n-heptanoic acid (4). The polyionic tag-based fusion engineering of the BVMO and the use of a synthetic promoter for constitutive enzyme expression allowed the recombinant Escherichia coli expressing the BVMO and the secondary alcohol dehydrogenase of Micrococcus luteus to produce the ester (3) to 85 mM (26.6 g/L) within 5 h. The 5 L scale biotransformation process was then successfully scaled up to a 70 L bioreactor; 3 was produced to over 70 mM (21.9 g/L) in the culture medium 6 h after biotransformation. This study demonstrated that the BVMO-based whole-cell reactions can be applied for large-scale biotransformations. PMID:27311560

  2. Design and construction of functional AAV vectors.

    PubMed

    Gray, John T; Zolotukhin, Serge

    2011-01-01

    Using the basic principles of molecular biology and laboratory techniques presented in this chapter, researchers should be able to create a wide variety of AAV vectors for both clinical and basic research applications. Basic vector design concepts are covered for both protein coding gene expression and small non-coding RNA gene expression cassettes. AAV plasmid vector backbones (available via AddGene) are described, along with critical sequence details for a variety of modular expression components that can be inserted as needed for specific applications. Protocols are provided for assembling the various DNA components into AAV vector plasmids in Escherichia coli, as well as for transferring these vector sequences into baculovirus genomes for large-scale production of AAV in the insect cell production system.

  3. Genomic analysis of expressed sequence tags in American black bear Ursus americanus

    PubMed Central

    2010-01-01

    Background: Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results: Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion: We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065

  4. Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

    PubMed

    Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

    2010-03-26

    Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.

  5. Construction of a Food Grade Recombinant Bacillus subtilis Based on Replicative Plasmids with an Auxotrophic Marker for Biotransformation of d-Fructose to d-Allulose.

    PubMed

    He, Weiwei; Mu, Wanmeng; Jiang, Bo; Yan, Xin; Zhang, Tao

    2016-04-27

    A food grade recombinant Bacillus subtilis that produces d-psicose 3-epimerase (DPEase; EC 5.1.3.30) was constructed by transforming a replicative multicopy plasmid with a d-alanine racemase gene marker into B. subtilis 1A751 with the d-alanine racemase gene knocked out. The DPEase was expressed in B. subtilis without antibiotic resistance genes and without adding antibiotics during fermentation. Whole cells of the food grade recombinant B. subtilis were used to biotransform d-fructose to d-allulose. The two tandem promoters, including the HpaII and P43 promoters, increased expression levels compared to the use of one promoter, HpaII. For large-scale d-allulose production, the optimal enzyme dose was 40 enzyme activity units of dry cells per gram of d-fructose, which produced a 28.5% turnover yield in 60 min. The recombinant plasmid exhibited stability over 100 generations. This food grade recombinant B. subtilis may be used for large-scale d-allulose production in the food industry.

  6. Large-scale identification of differentially expressed genes during pupa development reveals solute carrier gene is essential for pupal pigmentation in Chilo suppressalis.

    PubMed

    Sun, Yang; Huang, Shuijin; Wang, Shuping; Guo, Dianhao; Ge, Chang; Xiao, Huamei; Jie, Wencai; Yang, Qiupu; Teng, Xiaolu; Li, Fei

    2017-04-01

    Insects undergo metamorphosis, involving an abrupt change in body structure through cell growth and differentiation. The rice striped stem borer (SSB), Chilo suppressalis, is one of the most destructive rice pests. However, little is known about the regulatory mechanism of metamorphic development in this notorious insect pest. Here, we studied the expression of 22,197 SSB genes at seven time points during pupa development with a customized microarray, identifying 622 differentially expressed genes (DEGs) during pupa development. Gene ontology (GO) analysis of these DEGs indicated that the genes related to substance metabolism were highly expressed in the early pupa, which participate in the physiological processes of larval tissue disintegration at these stages. In comparison, highly expressed genes in the late pupal stages were mainly associated with substance biosynthesis, consistent with adult organ formation at these stages. There were 27 solute carrier (SLC) genes that were highly expressed during pupa development. We knocked down SLC22A3 at the prepupal stage, demonstrating that silencing SLC22A3 induced a deficiency in pupa stiffness and pigmentation. The RNAi-treated individuals had white and soft pupae, suggesting that this gene has an essential role in pupal development. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Imprinted gene expression in fetal growth and development.

    PubMed

    Lambertini, L; Marsit, C J; Sharma, P; Maccani, M; Ma, Y; Hu, J; Chen, J

    2012-06-01

    Experimental studies showed that genomic imprinting is fundamental in fetoplacental development by timely regulating the expression of the imprinted genes to oversee a set of events determining placenta implantation, growth and embryogenesis. We examined the expression profile of 22 imprinted genes which have been linked to pregnancy abnormalities that may ultimately influence childhood development. The study was conducted in a subset of 106 placenta samples, overrepresented with small and large for gestational age cases, from the Rhode Island Child Health Study. We investigated associations between imprinted gene expression and three fetal development parameters: newborn head circumference, birth weight, and size for gestational age. Results from our investigation show that the maternally imprinted/paternally expressed gene ZNF331 inversely associates with each parameter to drive smaller fetal size, while the paternally imprinted/maternally expressed gene SLC22A18 directly associates with the newborn head circumference, promoting growth. Multidimensional Scaling analysis revealed two clusters within the 22 imprinted genes which are independently associated with fetoplacental development. Our data suggest that cluster 1 genes work by assuring cell growth and tissue development, while cluster 2 genes act by coordinating these processes. Results from this epidemiologic study offer solid support for the key role of imprinting in fetoplacental development. Copyright © 2012 Elsevier Ltd. All rights reserved.

  8. Expression atlas and comparative coexpression network analyses reveal important genes involved in the formation of lignified cell wall in Brachypodium distachyon.

    PubMed

    Sibout, Richard; Proost, Sebastian; Hansen, Bjoern Oest; Vaid, Neha; Giorgi, Federico M; Ho-Yue-Kuang, Severine; Legée, Frédéric; Cézart, Laurent; Bouchabké-Coussa, Oumaya; Soulhat, Camille; Provart, Nicholas; Pasha, Asher; Le Bris, Philippe; Roujol, David; Hofte, Herman; Jamet, Elisabeth; Lapierre, Catherine; Persson, Staffan; Mutwil, Marek

    2017-08-01

    While Brachypodium distachyon (Brachypodium) is an emerging model for grasses, no expression atlas or gene coexpression network is available. Such tools are of high importance to provide insights into the function of Brachypodium genes. We present a detailed Brachypodium expression atlas, capturing gene expression in its major organs at different developmental stages. The data were integrated into a large-scale coexpression database (www.gene2function.de), enabling identification of duplicated pathways and conserved processes across 10 plant species, thus allowing genome-wide inference of gene function. We highlight the importance of the atlas and the platform through the identification of duplicated cell wall modules, and show that a lignin biosynthesis module is conserved across angiosperms. We identified and functionally characterised a putative ferulate 5-hydroxylase gene through overexpression of it in Brachypodium, which resulted in an increase in lignin syringyl units and reduced lignin content of mature stems, and led to improved saccharification of the stem biomass. Our Brachypodium expression atlas thus provides a powerful resource to reveal functionally related genes, which may advance our understanding of important biological processes in grasses. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  9. Gene Expression Analysis: Teaching Students to Do 30,000 Experiments at Once with Microarray

    ERIC Educational Resources Information Center

    Carvalho, Felicia I.; Johns, Christopher; Gillespie, Marc E.

    2012-01-01

    Genome scale experiments routinely produce large data sets that require computational analysis, yet there are few student-based labs that illustrate the design and execution of these experiments. In order for students to understand and participate in the genomic world, teaching labs must be available where students generate and analyze large data…

  10. Optimal consistency in microRNA expression analysis using reference-gene-based normalization.

    PubMed

    Wang, Xi; Gardiner, Erin J; Cairns, Murray J

    2015-05-01

    Normalization of high-throughput molecular expression profiles secures differential expression analysis between samples of different phenotypes or biological conditions, and facilitates comparison between experimental batches. While the same general principles apply to microRNA (miRNA) normalization, there is mounting evidence that global shifts in their expression patterns occur in specific circumstances, which pose a challenge for normalizing miRNA expression data. As an alternative to global normalization, which has the propensity to flatten large trends, normalization against constitutively expressed reference genes presents an advantage through their relative independence. Here we investigated the performance of reference-gene-based (RGB) normalization for differential miRNA expression analysis of microarray expression data, and compared the results with other normalization methods, including: quantile, variance stabilization, robust spline, simple scaling, rank invariant, and Loess regression. The comparative analyses were executed using miRNA expression in tissue samples derived from subjects with schizophrenia and non-psychiatric controls. We proposed a consistency criterion for evaluating methods by examining the overlapping of differentially expressed miRNAs detected using different partitions of the whole data. Based on this criterion, we found that RGB normalization generally outperformed global normalization methods. Thus we recommend the application of RGB normalization for miRNA expression data sets, and believe that this will yield a more consistent and useful readout of differentially expressed miRNAs, particularly in biological conditions characterized by large shifts in miRNA expression.
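
    The idea behind reference-gene-based normalization, and the overlap-based consistency check described in this record, can be illustrated with a short sketch. It is not the authors' implementation: the log-scale features-by-samples layout, the choice of the mean reference level as the per-sample offset, and the use of a Jaccard index to score overlap are all assumptions made for illustration.

```python
import numpy as np

def rgb_normalize(log_expr, reference_rows):
    """Center each sample on its own reference-gene level.

    log_expr       : features x samples matrix of log-scale intensities
    reference_rows : row indices of the constitutively expressed reference genes
    The per-sample offset is the mean reference level (an assumed choice).
    """
    log_expr = np.asarray(log_expr, dtype=float)
    ref_level = log_expr[reference_rows, :].mean(axis=0)   # one offset per sample
    return log_expr - ref_level                            # broadcast over features

def consistency(hits_a, hits_b):
    """Overlap between differentially expressed miRNAs found in two data partitions.

    Scored here as a Jaccard index, which is an assumption; the abstract
    only states that overlap between partitions is examined.
    """
    a, b = set(hits_a), set(hits_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

    Because only the reference rows determine the offset, a global shift in the remaining miRNAs is preserved rather than flattened, which is the advantage the abstract attributes to this approach over global normalization.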

  11. MALDI-TOF mass spectrometry for quantitative gene expression analysis of acid responses in Staphylococcus aureus.

    PubMed

    Rode, Tone Mari; Berget, Ingunn; Langsrud, Solveig; Møretrø, Trond; Holck, Askild

    2009-07-01

    Microorganisms are constantly exposed to new and altered growth conditions, and respond by changing gene expression patterns. Several methods for studying gene expression exist. During the last decade, the analysis of microarrays has been one of the most common approaches applied for large scale gene expression studies. A relatively new method for gene expression analysis is MassARRAY, which combines real competitive-PCR and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry. In contrast to microarray methods, MassARRAY technology is suitable for analysing a larger number of samples, though for a smaller set of genes. In this study we compare the results from MassARRAY with microarrays on gene expression responses of Staphylococcus aureus exposed to acid stress at pH 4.5. RNA isolated from the same stress experiments was analysed using both the MassARRAY and the microarray methods. The MassARRAY and microarray methods showed good correlation. Both MassARRAY and microarray estimated somewhat lower fold changes compared with quantitative real-time PCR (qRT-PCR). The results confirmed the up-regulation of the urease genes in acidic environments, and also indicated the importance of metal ion regulation. This study shows that the MassARRAY technology is suitable for gene expression analysis in prokaryotes, and has advantages when a set of genes is being analysed for an organism exposed to many different environmental conditions.

  12. Ol-Prx 3, a member of an additional class of homeobox genes, is unimodally expressed in several domains of the developing and adult central nervous system of the medaka (Oryzias latipes)

    PubMed Central

    Joly, Jean-Stephane; Bourrat, Franck; Nguyen, Van; Chourrout, Daniel

    1997-01-01

    Large-scale genetic screens for mutations affecting early neurogenesis of vertebrates have recently been performed with an aquarium fish, the zebrafish. Later stages of neural morphogenesis have attracted less attention in small fish species, partly because of the lack of molecular markers of developing structures that may facilitate the detection of discrete structural alterations. In this context, we report the characterization of Ol-Prx 3 (Oryzias latipes-Prx 3). This gene was isolated in the course of a large-scale screen for brain cDNAs containing a highly conserved DNA binding region, the homeobox helix-three. Sequence analysis revealed that this gene belongs to another class of homeobox genes, together with a previously isolated mouse ortholog, called OG-12 [Rovescalli, A. C., Asoh, S. & Nirenberg, M. (1996) Proc. Natl. Acad. Sci. USA 93, 10691–10696] and with the human SHOX gene [Rao, E., Weiss, B., Fukami, M., Rump, A., Niesler, B., et al. (1997) Nat. Genet. 16, 54–62], thought to be involved in the short-stature phenotype of Turner syndrome patients. These three genes exhibit a moderate level of identity in the homeobox with the other genes of the paired-related (PRX) gene family. Ol-Prx 3, as well as the PRX genes, are expressed in various cartilaginous structures of head and limbs. These genes might thus be involved in common regulatory pathways during the morphogenesis of these structures. Moreover, this paper reports a complex and monophasic pattern of Ol-Prx 3 expression in the central nervous system, which differs markedly from the patterns reported for the PRX genes, Prx 3 excluded: this gene begins to be expressed in a variety of central nervous system territories at late neurula stage. Strikingly, it remains turned on in some of the derivatives of each territory during the entire life of the fish. We hope this work will thus help identify common features for the PRX 3 family of homeobox genes. PMID:9371787

  13. TLM-Quant: an open-source pipeline for visualization and quantification of gene expression heterogeneity in growing microbial cells.

    PubMed

    Piersma, Sjouke; Denham, Emma L; Drulhe, Samuel; Tonk, Rudi H J; Schwikowski, Benno; van Dijl, Jan Maarten

    2013-01-01

    Gene expression heterogeneity is a key driver for microbial adaptation to fluctuating environmental conditions, cell differentiation and the evolution of species. This phenomenon has therefore enormous implications, not only for life in general, but also for biotechnological applications where unwanted subpopulations of non-producing cells can emerge in large-scale fermentations. Only time-lapse fluorescence microscopy allows real-time measurements of gene expression heterogeneity. A major limitation in the analysis of time-lapse microscopy data is the lack of fast, cost-effective, open, simple and adaptable protocols. Here we describe TLM-Quant, a semi-automatic pipeline for the analysis of time-lapse fluorescence microscopy data that enables the user to visualize and quantify gene expression heterogeneity. Importantly, our pipeline builds on the open-source packages ImageJ and R. To validate TLM-Quant, we selected three possible scenarios, namely homogeneous expression, highly 'noisy' heterogeneous expression, and bistable heterogeneous expression in the Gram-positive bacterium Bacillus subtilis. This bacterium is both a paradigm for systems-level studies on gene expression and a highly appreciated biotechnological 'cell factory'. We conclude that the temporal resolution of such analyses with TLM-Quant is only limited by the numbers of recorded images.

  14. A regulation probability model-based meta-analysis of multiple transcriptomics data sets for cancer biomarker identification.

    PubMed

    Xie, Xin-Ping; Xie, Yu-Feng; Wang, Hong-Qiang

    2017-08-23

    Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal. This paper proposes a regulation probability model-based meta-analysis, jGRP, for identifying differentially expressed genes (DEGs). The method integrates multiple transcriptomics data sets in a gene regulatory space instead of in a gene expression space, which makes it easy to capture and manage data heterogeneity across studies from different laboratories or platforms. Specifically, we transform gene expression profiles into a united gene regulation profile across studies by mathematically defining two gene regulation events between two conditions and estimating their occurring probabilities in a sample. Finally, a novel differential expression statistic is established based on the gene regulation profiles, realizing accurate and flexible identification of DEGs in gene regulation space. We evaluated the proposed method on simulation data and real-world cancer datasets and showed the effectiveness and efficiency of jGRP in identifying DEGs in the context of meta-analysis. Data heterogeneity largely influences the performance of meta-analysis for DEG identification. Existing meta-analysis methods were found to exhibit very different degrees of sensitivity to study heterogeneity. The proposed method, jGRP, can be a standalone tool due to its united framework and controllable way to deal with study heterogeneity.

  15. Gene expression inference with deep learning.

    PubMed

    Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui

    2016-06-15

    Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. D-GEX is available at https://github.com/uci-cbcl/D-GEX. Contact: xhx@ics.uci.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. Gene expression inference with deep learning

    PubMed Central

    Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui

    2016-01-01

    Motivation: Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. Results: We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. Availability and implementation: D-GEX is available at https://github.com/uci-cbcl/D-GEX. Contact: xhx@ics.uci.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26873929
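
    The two headline metrics in this abstract, the relative improvement in mean absolute error and the share of target genes with lower error, can be sketched as below. This is a minimal illustration on hypothetical toy numbers; the function names and the data are assumptions and are not taken from the D-GEX code base.

```python
from statistics import mean

def mae_per_gene(truth, pred):
    """Mean absolute error per target gene (columns) across samples (rows)."""
    n_genes = len(truth[0])
    return [
        mean(abs(t_row[g] - p_row[g]) for t_row, p_row in zip(truth, pred))
        for g in range(n_genes)
    ]

def relative_improvement(baseline_err, model_err):
    """Percentage reduction in error relative to the baseline."""
    return 100.0 * (baseline_err - model_err) / baseline_err

def fraction_genes_improved(baseline_mae, model_mae):
    """Share of genes for which the model has strictly lower error."""
    wins = sum(m < b for b, m in zip(baseline_mae, model_mae))
    return wins / len(baseline_mae)

# Hypothetical toy data: 3 samples x 2 target genes.
truth   = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
lr_pred = [[1.5, 2.0], [2.5, 3.0], [2.5, 7.0]]  # stand-in for linear regression
dl_pred = [[1.2, 2.5], [2.2, 3.5], [2.8, 6.5]]  # stand-in for the deep model

lr_mae = mae_per_gene(truth, lr_pred)
dl_mae = mae_per_gene(truth, dl_pred)
overall_gain = relative_improvement(mean(lr_mae), mean(dl_mae))
win_rate = fraction_genes_improved(lr_mae, dl_mae)
```

    Averaging the per-gene errors before comparing them mirrors the abstract's "mean absolute error averaged across all genes"; the gene-wise win rate is a separate count and can differ sharply from the averaged figure.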

  17. Cloud-scale genomic signals processing classification analysis for gene expression microarray data.

    PubMed

    Harvey, Benjamin; Soo-Yeon Ji

    2014-01-01

    As the microarray data available to scientists continue to increase in size and complexity, it has become increasingly important to find ways to draw useful inferences through analysis of DNA/mRNA sequence data. Though there have been many attempts to draw biological inferences by means of wavelet preprocessing and classification, no research effort has focused on a cloud-scale classification analysis of microarray data that uses wavelet thresholding in a cloud environment to identify significantly expressed features. This paper proposes a novel methodology that uses wavelet-based denoising to initialize a threshold for determining significantly expressed genes for classification. Additionally, this research was implemented within a cloud-based distributed processing environment. Cloud computing and wavelet thresholding were used for the classification of 14 tumor classes from the Global Cancer Map (GCM). The results proved to be more accurate than using a predefined p-value for differential expression classification. This novel methodology analyzed wavelet-based threshold features of gene expression in a cloud environment and classified the expression of samples by analyzing gene patterns, which inform us of biological processes. It also enables researchers to face the present and forthcoming challenges that may arise in the functional genomics analysis of large microarray datasets.
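
    The idea of deriving a threshold from wavelet denoising, rather than fixing a p-value in advance, can be sketched as below. The abstract does not say which wavelet or threshold rule was used, so this sketch assumes a single-level Haar transform and the Donoho-Johnstone universal threshold with a median-absolute-deviation noise estimate; all names and the toy signal are hypothetical.

```python
import math
from statistics import median

def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def inverse_haar_step(approx, detail):
    """Invert haar_step, interleaving the reconstructed pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def universal_threshold(detail, n):
    """sigma * sqrt(2 ln n), with sigma estimated as MAD / 0.6745 (assumed rule)."""
    sigma = median(abs(d) for d in detail) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))

def soft_threshold(values, thr):
    """Shrink coefficients toward zero by thr; small ones become zero."""
    return [math.copysign(max(abs(v) - thr, 0.0), v) for v in values]

# Hypothetical expression profile with one strong feature at position 4.
expression = [2.0, 2.1, 1.9, 2.0, 8.0, 2.1, 2.0, 1.9]
approx, detail = haar_step(expression)
thr = universal_threshold(detail, len(expression))
surviving = [i for i, d in enumerate(detail) if abs(d) > thr]
denoised = inverse_haar_step(approx, soft_threshold(detail, thr))
```

    Detail coefficients that survive the threshold mark candidate features; here only the pair containing the large value does. How the study maps surviving coefficients to genes for classification is not described in the abstract.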

  18. Construction of two vectors for gene expression in Trichoderma reesei.

    PubMed

    Lv, Dandan; Wang, Wei; Wei, Dongzhi

    2012-01-01

    We report the construction of two expression vectors, pWEF31 and pWEF32, for the filamentous fungus Trichoderma reesei. Both vectors possess the hygromycin phosphotransferase B gene expression cassette and the strong promoter and terminator of the cellobiohydrolase 1 gene (cbh1) from T. reesei. The two newly constructed vectors can be efficiently transformed into T. reesei with Agrobacterium-mediated transformation. The difference between pWEF31 and pWEF32 is that pWEF32 has two longer homologous arms. As a result, pWEF32 easily undergoes homologous recombination. On the other hand, pWEF31 undergoes random recombination. The applicability of both vectors was tested by first generating the expression vectors pWEF31-red and pWEF32-red and then detecting the expression of the DsRed2 gene in T. reesei Rut C30. Additionally, we measured the exo-1,4-β-glucanase activity of the recombinant cells. Our work provides an effective transformation system for homologous and heterologous gene expression and gene knockout in T. reesei. It also provides a method for recombination at a specific chromosomal location. Finally, both vectors will be useful for the large-scale gene expression industry.

  19. Integrating genome-wide association studies and gene expression data highlights dysregulated multiple sclerosis risk pathways.

    PubMed

    Liu, Guiyou; Zhang, Fang; Jiang, Yongshuai; Hu, Yang; Gong, Zhongying; Liu, Shoufeng; Chen, Xiuju; Jiang, Qinghua; Hao, Junwei

    2017-02-01

    Much effort has been expended on identifying the genetic determinants of multiple sclerosis (MS). Existing large-scale genome-wide association study (GWAS) datasets provide strong support for using pathway and network-based analysis methods to investigate the mechanisms underlying MS. However, no shared genetic pathways have been identified to date. We hypothesize that shared genetic pathways may indeed exist in different MS-GWAS datasets. Here, we report results from a three-stage analysis of GWAS and expression datasets. In stage 1, we conducted multiple pathway analyses of two MS-GWAS datasets. In stage 2, we performed a candidate pathway analysis of the large-scale MS-GWAS dataset. In stage 3, we performed a pathway analysis using the dysregulated MS gene list from seven human MS case-control expression datasets. In stage 1, we identified 15 shared pathways. In stage 2, we successfully replicated 14 of these 15 significant pathways. In stage 3, we found that dysregulated MS genes were significantly enriched in 10 of 15 MS risk pathways identified in stages 1 and 2. We report shared genetic pathways in different MS-GWAS datasets and highlight some new MS risk pathways. Our findings provide new insights on the genetic determinants of MS.
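
    Stage 3 asks whether dysregulated genes are over-represented in a pathway. The abstract does not name the enrichment statistic; a one-sided hypergeometric test is a common choice and is assumed in the sketch below. The function name and the counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(universe, pathway_size, hits_total, hits_in_pathway):
    """P(X >= hits_in_pathway) when hits_total genes are drawn without
    replacement from a universe that contains pathway_size pathway genes."""
    upper = min(pathway_size, hits_total)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, hits_total - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / comb(universe, hits_total)

# Hypothetical counts: 20 genes measured, 5 in the pathway,
# 4 dysregulated overall, 3 of those inside the pathway.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
```

    With several pathways tested, as in the 15 candidates here, the resulting p-values would normally be corrected for multiple testing before calling a pathway enriched.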

  20. A curated transcriptomic dataset collection relevant to embryonic development associated with in vitro fertilization in healthy individuals and patients with polycystic ovary syndrome.

    PubMed

    Mackeh, Rafah; Boughorbel, Sabri; Chaussabel, Damien; Kino, Tomoshige

    2017-01-01

    The collection of large-scale datasets available in public repositories is rapidly growing and providing opportunities to identify and fill gaps in different fields of biomedical research. However, users of these datasets should be able to selectively browse datasets related to their field of interest. Here we made available a collection of transcriptome datasets related to human follicular cells from normal individuals or patients with polycystic ovary syndrome, in the process of their development, during in vitro fertilization. After RNA-seq dataset exclusion and careful selection based on study description and sample information, 12 datasets, encompassing a total of 85 unique transcriptome profiles, were identified in NCBI Gene Expression Omnibus and uploaded to the Gene Expression Browser (GXB), a web application specifically designed for interactive query and visualization of integrated large-scale data. Once the datasets were annotated in GXB, multiple sample groupings were made to create rank lists that allow easy data interpretation and comparison. The GXB tool also allows users to browse a single gene across multiple projects to evaluate its expression profiles in multiple biological systems/conditions in web-based, customized graphical views. The curated dataset is accessible at the following link: http://ivf.gxbsidra.org/dm3/landing.gsp.

  1. A curated transcriptomic dataset collection relevant to embryonic development associated with in vitro fertilization in healthy individuals and patients with polycystic ovary syndrome

    PubMed Central

    Mackeh, Rafah; Boughorbel, Sabri; Chaussabel, Damien; Kino, Tomoshige

    2017-01-01

    The collection of large-scale datasets available in public repositories is rapidly growing and providing opportunities to identify and fill gaps in different fields of biomedical research. However, users of these datasets should be able to selectively browse datasets related to their field of interest. Here we made available a collection of transcriptome datasets related to human follicular cells from normal individuals or patients with polycystic ovary syndrome, in the process of their development, during in vitro fertilization. After RNA-seq dataset exclusion and careful selection based on study description and sample information, 12 datasets, encompassing a total of 85 unique transcriptome profiles, were identified in NCBI Gene Expression Omnibus and uploaded to the Gene Expression Browser (GXB), a web application specifically designed for interactive query and visualization of integrated large-scale data. Once the datasets were annotated in GXB, multiple sample groupings were made to create rank lists that allow easy data interpretation and comparison. The GXB tool also allows users to browse a single gene across multiple projects to evaluate its expression profiles in multiple biological systems/conditions in web-based, customized graphical views. The curated dataset is accessible at the following link: http://ivf.gxbsidra.org/dm3/landing.gsp. PMID:28413616

  2. Systematic Analysis of Zn2Cys6 Transcription Factors Required for Development and Pathogenicity by High-Throughput Gene Knockout in the Rice Blast Fungus

    PubMed Central

    Huang, Pengyun; Lin, Fucheng

    2014-01-01

    Because of great challenges and workload in deleting genes on a large scale, the functions of most genes in pathogenic fungi are still unclear. In this study, we developed a high-throughput gene knockout system using a novel yeast-Escherichia-Agrobacterium shuttle vector, pKO1B, in the rice blast fungus Magnaporthe oryzae. Using this method, we deleted 104 fungal-specific Zn2Cys6 transcription factor (TF) genes in M. oryzae. We then analyzed the phenotypes of these mutants with regard to growth, asexual and infection-related development, pathogenesis, and 9 abiotic stresses. The resulting data provide new insights into how this rice pathogen of global significance regulates important traits in the infection cycle through Zn2Cys6TF genes. A large variation in biological functions of Zn2Cys6TF genes was observed under the conditions tested. Sixty-one of 104 Zn2Cys6 TF genes were found to be required for fungal development. In-depth analysis of TF genes revealed that TF genes involved in pathogenicity frequently tend to function in multiple development stages, and disclosed many highly conserved but unidentified functional TF genes of importance in the fungal kingdom. We further found that the virulence-required TF genes GPF1 and CNF2 have similar regulation mechanisms in the gene expression involved in pathogenicity. These experimental validations clearly demonstrated the value of a high-throughput gene knockout system in understanding the biological functions of genes on a genome scale in fungi, and provided a solid foundation for elucidating the gene expression network that regulates the development and pathogenicity of M. oryzae. PMID:25299517

  3. Modular and coordinated expression of immune system regulatory and signaling components in the developing and adult nervous system.

    PubMed

    Monzón-Sandoval, Jimena; Castillo-Morales, Atahualpa; Crampton, Sean; McKelvey, Laura; Nolan, Aoife; O'Keeffe, Gerard; Gutierrez, Humberto

    2015-01-01

    During development, the nervous system (NS) is assembled and sculpted through a concerted series of neurodevelopmental events orchestrated by a complex genetic programme. While neural-specific gene expression plays a critical part in this process, in recent years, a number of immune-related signaling and regulatory components have also been shown to play key physiological roles in the developing and adult NS. While the involvement of individual immune-related signaling components in neural functions may reflect their ubiquitous character, it may also reflect a much wider, as yet undescribed, genetic network of immune-related molecules acting as an intrinsic component of the neural-specific regulatory machinery that ultimately shapes the NS. In order to gain insights into the scale and wider functional organization of immune-related genetic networks in the NS, we examined the large scale pattern of expression of these genes in the brain. Our results show a highly significant correlated expression and transcriptional clustering among immune-related genes in the developing and adult brain, and this correlation was the highest in the brain when compared to muscle, liver, kidney and endothelial cells. We experimentally tested the regulatory clustering of immune system (IS) genes by using microarray expression profiling in cultures of dissociated neurons stimulated with the pro-inflammatory cytokine TNF-alpha, and found a highly significant enrichment of immune system-related genes among the resulting differentially expressed genes. Our findings strongly suggest a coherent recruitment of entire immune-related genetic regulatory modules by the neural-specific genetic programme that shapes the NS.

  4. DGEM--a microarray gene expression database for primary human disease tissues.

    PubMed

    Xia, Yuni; Campen, Andrew; Rigsby, Dan; Guo, Ying; Feng, Xingdong; Su, Eric W; Palakal, Mathew; Li, Shuyu

    2007-01-01

    Gene expression patterns can reflect gene regulations in human tissues under normal or pathologic conditions. Gene expression profiling data from studies of primary human disease samples are particularly valuable since these studies often span many years in order to collect patient clinical information and achieve a large sample size. Disease-to-Gene Expression Mapper (DGEM) provides a beneficial community resource to access and analyze these data; it currently includes Affymetrix oligonucleotide array datasets for more than 40 human diseases and 1400 samples. The data are normalized to the same scale and stored in a relational database. A statistical-analysis pipeline was implemented to identify genes abnormally expressed in disease tissues or genes whose expressions are associated with clinical parameters such as cancer patient survival. Data-mining results can be queried through a web-based interface at http://dgem.dhcp.iupui.edu/. The query tool enables dynamic generation of graphs and tables that are further linked to major gene and pathway resources that connect the data to relevant biology, including Entrez Gene and Kyoto Encyclopedia of Genes and Genomes (KEGG). In summary, DGEM provides scientists and physicians a valuable tool to study disease mechanisms, to discover potential disease biomarkers for diagnosis and prognosis, and to identify novel gene targets for drug discovery. The source code is freely available for non-profit use, on request to the authors.

  5. Genome-Level Longitudinal Expression of Signaling Pathways and Gene Networks in Pediatric Septic Shock

    PubMed Central

    Shanley, Thomas P; Cvijanovich, Natalie; Lin, Richard; Allen, Geoffrey L; Thomas, Neal J; Doctor, Allan; Kalyanaraman, Meena; Tofil, Nancy M; Penfil, Scott; Monaco, Marie; Odoms, Kelli; Barnes, Michael; Sakthivel, Bhuvaneswari; Aronow, Bruce J; Wong, Hector R

    2007-01-01

    We have conducted longitudinal studies focused on the expression profiles of signaling pathways and gene networks in children with septic shock. Genome-level expression profiles were generated from whole blood-derived RNA of children with septic shock (n = 30) corresponding to day one and day three of septic shock, respectively. Based on sequential statistical and expression filters, day one and day three of septic shock were characterized by differential regulation of 2,142 and 2,504 gene probes, respectively, relative to controls (n = 15). Venn analysis demonstrated 239 unique genes in the day one dataset, 598 unique genes in the day three dataset, and 1,906 genes common to both datasets. Functional analyses demonstrated time-dependent, differential regulation of genes involved in multiple signaling pathways and gene networks primarily related to immunity and inflammation. Notably, multiple and distinct gene networks involving T cell- and MHC antigen-related biology were persistently downregulated on both day one and day three. Further analyses demonstrated large scale, persistent downregulation of genes corresponding to functional annotations related to zinc homeostasis. These data represent the largest reported cohort of patients with septic shock subjected to longitudinal genome-level expression profiling. The data further advance our genome-level understanding of pediatric septic shock and support novel hypotheses. PMID:17932561

  6. Prediction of gene expression in embryonic structures of Drosophila melanogaster.

    PubMed

    Samsonova, Anastasia A; Niranjan, Mahesan; Russell, Steven; Brazma, Alvis

    2007-07-01

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms.

  7. Prediction of Gene Expression in Embryonic Structures of Drosophila melanogaster

    PubMed Central

    Samsonova, Anastasia A; Niranjan, Mahesan; Russell, Steven; Brazma, Alvis

    2007-01-01

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms. PMID:17658945

  8. Hidden among the crowd: differential DNA methylation-expression correlations in cancer occur at important oncogenic pathways

    PubMed Central

    Mosquera Orgueira, Adrián

    2015-01-01

    DNA methylation is a frequent epigenetic mechanism that participates in transcriptional repression. Variations in DNA methylation with respect to gene expression are constant, and, for unknown reasons, some genes with highly methylated promoters are sometimes overexpressed. In this study we have analyzed the expression and methylation patterns of thousands of genes in five groups of cancer and normal tissue samples in order to determine local and genome-wide differences. We observed significant changes in global methylation-expression correlation in all the neoplasms, which suggests that differential correlation events are frequent in cancer. A focused analysis in the breast cancer cohort identified 1662 genes whose correlation varies significantly between normal and cancerous breast, but whose DNA methylation and gene expression patterns do not change substantially. These genes were enriched in cancer-related pathways and repressive chromatin features across various model cell lines, such as PRC2 binding and H3K27me3 marks. Substantial changes in methylation-expression correlation indicate that these genes are subject to epigenetic remodeling, where the differential activity of other factors breaks the expected relationship between both variables. Our findings suggest a complex regulatory landscape where a redistribution of local and large-scale chromatin repressive domains at differentially correlated genes (DCGs) creates epigenetic hotspots that modulate cancer-specific gene expression. PMID:26029238
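
    A gene is "differentially correlated" when its methylation-expression correlation differs between normal and tumor samples. The abstract does not state the test used; the sketch below assumes Pearson correlation compared through Fisher's z-transformation, which is one standard way to compare two independent correlations. The data and names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def fisher_z_difference(r1, n1, r2, n2):
    """z-score for the difference between two independent correlations
    (requires n1, n2 > 3 and |r| < 1)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se

# Hypothetical gene: promoter methylation (beta values) and expression.
methylation  = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
normal_expr  = [9.0, 8.2, 7.9, 6.5, 6.1, 5.0]  # expected repressive pattern
tumor_expr   = [5.2, 5.9, 6.1, 7.4, 7.6, 9.1]  # relationship inverted

r_normal = pearson(methylation, normal_expr)
r_tumor = pearson(methylation, tumor_expr)
z_score = fisher_z_difference(r_normal, len(normal_expr), r_tumor, len(tumor_expr))
```

    In this toy case the average methylation is identical in both groups, which is the situation the abstract highlights: the correlation changes even though the methylation pattern itself does not.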

  9. Influence of age, sex, and strength training on human muscle gene expression determined by microarray

    PubMed Central

    Roth, Stephen M.; Ferrell, Robert E.; Peters, David G.; Metter, E. Jeffrey; Hurley, Ben F.; Rogers, Marc A.

    2010-01-01

    The purpose of this study was to determine the influence of age, sex, and strength training (ST) on large-scale gene expression patterns in vastus lateralis muscle biopsies using high-density cDNA microarrays and quantitative PCR. Muscle samples from sedentary young (20–30 yr) and older (65–75 yr) men and women (5 per group) were obtained before and after a 9-wk unilateral heavy resistance ST program. RNA was hybridized to cDNA filter microarrays representing ~4,000 known human genes and comparisons were made among arrays to determine differential gene expression as a result of age and sex differences, and/or response to ST. Sex had the strongest influence on muscle gene expression, with differential expression (>1.7-fold) observed for ~200 genes between men and women (~75% with higher expression in men). Age contributed to differential expression as well, as ~50 genes were identified as differentially expressed (>1.7-fold) in relation to age, representing structural, metabolic, and regulatory gene classes. Sixty-nine genes were identified as being differentially expressed (>1.7-fold) in all groups in response to ST, and the majority of these were downregulated. Quantitative PCR was employed to validate expression levels for caldesmon, SWI/SNF (BAF60b), and four-and-a-half LIM domains 1. These significant differences suggest that in the analysis of skeletal muscle gene expression issues of sex, age, and habitual physical activity must be addressed, with sex being the most critical variable. PMID:12209020
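
    The >1.7-fold criterion used throughout this abstract amounts to a simple ratio filter applied in both directions. A minimal sketch follows, assuming positive linear-scale intensities per gene; the gene labels, values and function name are hypothetical.

```python
def differentially_expressed(group_a, group_b, min_fold=1.7):
    """Return {gene: signed fold change} for genes whose ratio exceeds
    min_fold in either direction. Positive means higher in group_a."""
    selected = {}
    for gene, a in group_a.items():
        b = group_b[gene]
        fold = a / b if a >= b else -(b / a)
        if abs(fold) > min_fold:
            selected[gene] = fold
    return selected

# Hypothetical mean array intensities for three genes.
men   = {"GENE_A": 340.0, "GENE_B": 100.0, "GENE_C": 50.0}
women = {"GENE_A": 100.0, "GENE_B": 90.0, "GENE_C": 100.0}

hits = differentially_expressed(men, women)
```

    A fixed fold-change cutoff ignores within-group variance, which is presumably why the study also validated selected genes by quantitative PCR.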

  10. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing

    PubMed Central

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; Martínez de la Vega, Octavio; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C.; Vielle-Calzada, Jean-Philippe

    2012-01-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies. PMID:22442422

  11. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing.

    PubMed

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; de la Vega, Octavio Martínez; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C; Vielle-Calzada, Jean-Philippe

    2012-06-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies.

  12. A high resolution atlas of gene expression in the domestic sheep (Ovis aries)

    PubMed Central

    Farquhar, Iseabail L.; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G.; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C. Bruce; Freeman, Tom C.; Archibald, Alan L.; Hume, David A.

    2017-01-01

    Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of ‘guilt by association’ was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages. PMID:28915238

  13. A high resolution atlas of gene expression in the domestic sheep (Ovis aries).

    PubMed

    Clark, Emily L; Bush, Stephen J; McCulloch, Mary E B; Farquhar, Iseabail L; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G; Wu, Chunlei; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C Bruce; Freeman, Tom C; Summers, Kim M; Archibald, Alan L; Hume, David A

    2017-09-01

    Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of 'guilt by association' was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages.
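
    The "guilt by association" principle can be illustrated with a small sketch: an uncharacterised gene inherits the most common annotation among genes whose expression profile across tissues correlates strongly with its own. The atlas itself used network-based cluster analysis; the plain correlation cutoff below is a simplification, and every gene name, profile and annotation is hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def coexpression_neighbors(profiles, query, min_r=0.9):
    """Genes whose profile correlates with the query at or above min_r."""
    return sorted(
        gene for gene, values in profiles.items()
        if gene != query and pearson(profiles[query], values) >= min_r
    )

def infer_function(profiles, annotations, query, min_r=0.9):
    """Most common annotation among co-expressed, annotated neighbours."""
    counts = {}
    for gene in coexpression_neighbors(profiles, query, min_r):
        label = annotations.get(gene)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get) if counts else None

# Hypothetical expression across five tissues.
profiles = {
    "UNKNOWN_1": [1.0, 2.0, 8.0, 9.0, 1.0],
    "IMMUNE_A":  [2.0, 3.0, 9.0, 10.0, 2.0],
    "IMMUNE_B":  [1.0, 3.0, 7.0, 9.0, 2.0],
    "MUSCLE_A":  [9.0, 8.0, 1.0, 2.0, 9.0],
}
annotations = {
    "IMMUNE_A": "innate immunity",
    "IMMUNE_B": "innate immunity",
    "MUSCLE_A": "muscle contraction",
}

neighbors = coexpression_neighbors(profiles, "UNKNOWN_1")
predicted = infer_function(profiles, annotations, "UNKNOWN_1")
```

    The cutoff of 0.9 is arbitrary here; in practice the threshold determines how dense the co-expression network is and therefore how specific the inferred annotations are.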

  14. Functional regression method for whole genome eQTL epistasis analysis with sequencing data.

    PubMed

    Xu, Kelin; Jin, Li; Xiong, Momiao

    2017-05-18

    Epistasis plays an essential role in understanding the regulation mechanisms and is an essential component of the genetic architecture of the gene expressions. However, interaction analysis of gene expressions remains fundamentally unexplored due to great computational challenges and data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position level read count curves. A single number for measuring gene expression which is widely used for microarray measured gene expression analysis is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using the RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pairwise SNPs, the FRGM takes a gene as a basic unit for epistasis analysis, which tests for the interaction of all possible pairs of genes and uses all the information that can be accessed to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genomes Project. The numbers of pairs of significantly interacting genes after Bonferroni correction identified using FRGM, RPKM and DESeq were 16,2361, 260 and 51, respectively, from the 350 European samples. The proposed FRGM for epistasis analysis of RNA-seq can capture isoform and position-level information and will have a broad application. Both simulations and real data analysis highlight the potential for the FRGM to be a good choice for epistatic analysis with sequencing data.

  15. Effects of Gene Duplication, Positive Selection, and Shifts in Gene Expression on the Evolution of the Venom Gland Transcriptome in Widow Spiders

    PubMed Central

    Haney, Robert A.; Clarke, Thomas H.; Gadgil, Rujuta; Fitzpatrick, Ryan; Hayashi, Cheryl Y.; Ayoub, Nadia A.; Garb, Jessica E.

    2016-01-01

    Gene duplication and positive selection can be important determinants of the evolution of venom, a protein-rich secretion used in prey capture and defense. In a typical model of venom evolution, gene duplicates switch to venom gland expression and change function under the action of positive selection, which together with further duplication produces large gene families encoding diverse toxins. Although these processes have been demonstrated for individual toxin families, high-throughput multitissue sequencing of closely related venomous species can provide insights into evolutionary dynamics at the scale of the entire venom gland transcriptome. By assembling and analyzing multitissue transcriptomes from the Western black widow spider and two closely related species with distinct venom toxicity phenotypes, we do not find that gene duplication and duplicate retention is greater in gene families with venom gland biased expression in comparison with broadly expressed families. Positive selection has acted on some venom toxin families, but does not appear to be in excess for families with venom gland biased expression. Moreover, we find 309 distinct gene families that have single transcripts with venom gland biased expression, suggesting that the switching of genes to venom gland expression in numerous unrelated gene families has been a dominant mode of evolution. We also find ample variation in protein sequences of venom gland–specific transcripts, lineage-specific family sizes, and ortholog expression among species. This variation might contribute to the variable venom toxicity of these species. PMID:26733576

  16. [Isolation and function of genes regulating aphB expression in Vibrio cholerae].

    PubMed

    Chen, Haili; Zhu, Zhaoqin; Zhong, Zengtao; Zhu, Jun; Kan, Biao

    2012-02-04

    We identified genes that regulate the expression of aphB, the gene encoding a key virulence regulator in Vibrio cholerae O1 El Tor C6706(-). We constructed a transposon library in a V. cholerae C6706 strain containing P(aphB)-luxCDABE and P(aphB)-lacZ transcriptional reporter plasmids. Using a chemiluminescence imager system, we rapidly measured aphB promoter expression on a large scale. We then sequenced the transposon insertion sites by arbitrary PCR and sequencing analysis. From approximately 40,000 transposon insertion mutants, we obtained two candidate mutants, T1 and T2, which displayed reduced aphB expression. Sequencing analysis showed that the transposon was inserted in the vc1585 reading frame in the T1 mutant and at the end of the coding sequence of vc1602 in the T2 mutant. By using a genetic screen, we identified two potential genes that may be involved in regulating the expression of the key virulence regulator AphB. This study lays groundwork for further investigation to fully understand V. cholerae virulence gene regulatory cascades.

  17. CellLineNavigator: a workbench for cancer cell line analysis

    PubMed Central

    Krupp, Markus; Itzel, Timo; Maass, Thorsten; Hildebrandt, Andreas; Galle, Peter R.; Teufel, Andreas

    2013-01-01

    The CellLineNavigator database, freely available at http://www.medicalgenomics.org/celllinenavigator, is a web-based workbench for large scale comparisons of a large collection of diverse cell lines. It aims to support experimental design in the fields of genomics, systems biology and translational biomedical research. Currently, this compendium holds genome wide expression profiles of 317 different cancer cell lines, categorized into 57 different pathological states and 28 individual tissues. To enlarge the scope of CellLineNavigator, the database was furthermore closely linked to commonly used bioinformatics databases and knowledge repositories. To ensure easy data access and searchability, a simple data interface and an intuitive querying interface were implemented. They allow the user to explore and filter gene expression, focusing on pathological or physiological conditions. For a more complex search, the advanced query interface may be used to query for (i) differentially expressed genes; (ii) pathological or physiological conditions; or (iii) gene names or functional attributes, such as Kyoto Encyclopaedia of Genes and Genomes pathway maps. These queries may also be combined. Finally, CellLineNavigator allows additional advanced analysis of differentially regulated genes by a direct link to the Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Resources. PMID:23118487

  18. A regulatory toolbox of MiniPromoters to drive selective expression in the brain.

    PubMed

    Portales-Casamar, Elodie; Swanson, Douglas J; Liu, Li; de Leeuw, Charles N; Banks, Kathleen G; Ho Sui, Shannan J; Fulton, Debra L; Ali, Johar; Amirabbasi, Mahsa; Arenillas, David J; Babyak, Nazar; Black, Sonia F; Bonaguro, Russell J; Brauer, Erich; Candido, Tara R; Castellarin, Mauro; Chen, Jing; Chen, Ying; Cheng, Jason C Y; Chopra, Vik; Docking, T Roderick; Dreolini, Lisa; D'Souza, Cletus A; Flynn, Erin K; Glenn, Randy; Hatakka, Kristi; Hearty, Taryn G; Imanian, Behzad; Jiang, Steven; Khorasan-zadeh, Shadi; Komljenovic, Ivana; Laprise, Stéphanie; Liao, Nancy Y; Lim, Jonathan S; Lithwick, Stuart; Liu, Flora; Liu, Jun; Lu, Meifen; McConechy, Melissa; McLeod, Andrea J; Milisavljevic, Marko; Mis, Jacek; O'Connor, Katie; Palma, Betty; Palmquist, Diana L; Schmouth, Jean-François; Swanson, Magdalena I; Tam, Bonny; Ticoll, Amy; Turner, Jenna L; Varhol, Richard; Vermeulen, Jenny; Watkins, Russell F; Wilson, Gary; Wong, Bibiana K Y; Wong, Siaw H; Wong, Tony Y T; Yang, George S; Ypsilanti, Athena R; Jones, Steven J M; Holt, Robert A; Goldowitz, Daniel; Wasserman, Wyeth W; Simpson, Elizabeth M

    2010-09-21

    The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination "knockins" in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5' of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type-specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies.

  19. Diversification of Root Hair Development Genes in Vascular Plants.

    PubMed

    Huang, Ling; Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui; Schiefelbein, John

    2017-07-01

    The molecular genetic program for root hair development has been studied intensively in Arabidopsis ( Arabidopsis thaliana ). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. © 2017 American Society of Plant Biologists. All Rights Reserved.

  20. Diversification of Root Hair Development Genes in Vascular Plants

    PubMed Central

    Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui

    2017-01-01

    The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. PMID:28487476

  1. Harnessing Diversity towards the Reconstructing of Large Scale Gene Regulatory Networks

    PubMed Central

    Yamanaka, Ryota; Kitano, Hiroaki

    2013-01-01

    Elucidating gene regulatory networks (GRNs) from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks. PMID:24278007

  2. Integrative Approach to Pain Genetics Identifies Pain Sensitivity Loci across Diseases

    PubMed Central

    Ruau, David; Dudley, Joel T.; Chen, Rong; Phillips, Nicholas G.; Swan, Gary E.; Lazzeroni, Laura C.; Clark, J. David

    2012-01-01

    Identifying human genes relevant for the processing of pain requires difficult-to-conduct and expensive large-scale clinical trials. Here, we examine a novel integrative paradigm for data-driven discovery of pain gene candidates, taking advantage of the vast amount of existing disease-related clinical literature and gene expression microarray data stored in large international repositories. First, thousands of diseases were ranked according to a disease-specific pain index (DSPI), derived from Medical Subject Heading (MESH) annotations in MEDLINE. Second, gene expression profiles of 121 of these human diseases were obtained from public sources. Third, genes with expression variation significantly correlated with DSPI across diseases were selected as candidate pain genes. Finally, selected candidate pain genes were genotyped in an independent human cohort and prospectively evaluated for significant association between variants and measures of pain sensitivity. The strongest signal was with rs4512126 (5q32, ABLIM3, P = 1.3×10^−10) for the sensitivity to cold pressor pain in males, but not in females. Significant associations were also observed with rs12548828, rs7826700 and rs1075791 on 8q22.2 within NCALD (P = 1.7×10^−4, 1.8×10^−4, and 2.2×10^−4, respectively). Our results demonstrate the utility of a novel paradigm that integrates publicly available disease-specific gene expression data with clinical data curated from MEDLINE to facilitate the discovery of pain-relevant genes. This data-derived list of pain gene candidates enables additional focused and efficient biological studies validating additional candidates. PMID:22685391

  3. A two-step hierarchical hypothesis set testing framework, with applications to gene expression data on ordered categories

    PubMed Central

    2014-01-01

    Background: In complex large-scale experiments, in addition to simultaneously considering a large number of features, multiple hypotheses are often being tested for each feature. This leads to a problem of multi-dimensional multiple testing. For example, in gene expression studies over ordered categories (such as time-course or dose-response experiments), interest is often in testing differential expression across several categories for each gene. In this paper, we consider a framework for testing multiple sets of hypotheses, which can be applied to a wide range of problems. Results: We adopt the concept of the overall false discovery rate (OFDR) for controlling false discoveries on the hypothesis set level. Based on an existing procedure for identifying differentially expressed gene sets, we discuss a general two-step hierarchical hypothesis set testing procedure, which controls the overall false discovery rate under independence across hypothesis sets. In addition, we discuss the concept of the mixed-directional false discovery rate (mdFDR), and extend the general procedure to enable directional decisions for two-sided alternatives. We applied the framework to the case of microarray time-course/dose-response experiments, and proposed three procedures for testing differential expression and making multiple directional decisions for each gene. Simulation studies confirm the control of the OFDR and mdFDR by the proposed procedures under independence and positive correlations across genes. Simulation results also show that two of our new procedures achieve higher power than previous methods. Finally, the proposed methodology is applied to a microarray dose-response study, to identify 17β-estradiol-sensitive genes in breast cancer cells that are induced at low concentrations. Conclusions: The framework we discuss provides a platform for multiple testing procedures covering situations involving two (or potentially more) sources of multiplicity. The framework is easy to use and adaptable to various practical settings that frequently occur in large-scale experiments. Procedures generated from the framework are shown to maintain control of the OFDR and mdFDR, quantities that are especially relevant in the case of multiple hypothesis set testing. The procedures work well in both simulations and real datasets, and are shown to have better power than existing methods. PMID:24731138
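
    The two-step idea in the abstract above (screen hypothesis sets first, then test within the selected sets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedures proposed in the paper: each gene's hypotheses are combined with a Simes p-value, genes are selected by Benjamini-Hochberg at level q, and the hypotheses inside each selected gene are tested by Bonferroni at the adjusted level R·q/m (R selected genes out of m). All function names are hypothetical.

```python
from typing import Dict, List


def simes(pvals: List[float]) -> float:
    """Simes combination: min over sorted p-values of n * p_(i) / i."""
    n = len(pvals)
    return min(n * p / (i + 1) for i, p in enumerate(sorted(pvals)))


def benjamini_hochberg(pvals: List[float], q: float) -> List[bool]:
    """Return one rejection flag per hypothesis at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value passes its step-up threshold
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * q / m:
            k = rank
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


def two_step(gene_pvals: Dict[str, List[float]], q: float = 0.05) -> Dict[str, List[bool]]:
    """Step 1: screen genes; step 2: test within the selected genes."""
    genes = list(gene_pvals)
    screening = [simes(gene_pvals[g]) for g in genes]
    flags = benjamini_hochberg(screening, q)
    selected = [g for g, keep in zip(genes, flags) if keep]
    # Assumed adjustment: level R * q / m for the within-gene tests.
    level = len(selected) * q / len(genes)
    return {
        g: [p <= level / len(gene_pvals[g]) for p in gene_pvals[g]]
        for g in selected
    }
```

    Calling `two_step` with a dictionary that maps each gene to its per-category p-values returns, for the selected genes only, which individual hypotheses are rejected.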

  4. Probabilistic representation of gene regulatory networks.

    PubMed

    Mao, Linyong; Resat, Haluk

    2004-09-22

    Recent experiments have established unambiguously that biological systems can have significant cell-to-cell variations in gene expression levels even in isogenic populations. Computational approaches to studying gene expression in cellular systems should capture such biological variations for a more realistic representation. In this paper, we present a new fully probabilistic approach to the modeling of gene regulatory networks that allows for fluctuations in the gene expression levels. The new algorithm uses a very simple representation for the genes, and accounts for the repression or induction of the genes and for the biological variations among isogenic populations simultaneously. Because of its simplicity, the introduced algorithm is a very promising approach to model large-scale gene regulatory networks. We have tested the new algorithm on the recently bioengineered synthetic gene network library. The good agreement between the computed and the experimental results for this library of networks, and additional tests, demonstrate that the new algorithm is robust and very successful in explaining the experimental data. The simulation software is available upon request. Supplementary material will be made available on the OUP server.

  5. Gene expression profiles of auxin metabolism in maturing apple fruit

    USDA-ARS's Scientific Manuscript database

    Variation exists among apple genotypes in fruit maturation and ripening patterns that influences at-harvest fruit firmness and postharvest storability. Based on the results from our previous large-scale transcriptome profiling on apple fruit maturation and well-documented auxin-ethylene crosstalk, t...

  6. The Effect of Gestational Age on Angiogenic Gene Expression in the Rat Placenta

    PubMed Central

    Vaswani, Kanchan; Hum, Melissa Wen-Ching; Chan, Hsiu-Wen; Ryan, Jennifer; Wood-Bradley, Ryan J.; Nitert, Marloes Dekker; Mitchell, Murray D.; Armitage, James A.; Rice, Gregory E.

    2013-01-01

    The placenta plays a central role in determining the outcome of pregnancy. It undergoes changes during gestation as the fetus develops and as demands for energy substrate transfer and gas exchange increase. The molecular mechanisms that coordinate these changes have yet to be fully elucidated. The study performed a large scale screen of the transcriptome of the rat placenta throughout mid-late gestation (E14.25–E20) with emphasis on characterizing gestational age associated changes in the expression of genes involved in angiogenic pathways. Sprague Dawley dams were sacrificed at E14.25, E15.25, E17.25 and E20 (n = 6 per group) and RNA was isolated from one placenta per dam. Changes in placental gene expression were identified using Illumina Rat Ref-12 Expression BeadChip Microarrays. Differentially expressed genes (>2-fold change, <1% false discovery rate, FDR) were functionally categorised by gene ontology pathway analysis. A subset of differentially expressed genes identified by microarrays was confirmed using Real-Time qPCR. The expression of thirty-one genes involved in the angiogenic pathway was shown to change over time, using microarray analysis (22 genes displayed increased and 9 genes decreased expression). Five genes (4 up regulated: Cd36, Mmp14, Rhob and Angpt4 and 1 down regulated: Foxm1) involved in angiogenesis and blood vessel morphogenesis were subjected to further validation. qPCR confirmed late gestational increased expression of Cd36, Mmp14, Rhob and Angpt4 and a decrease in expression of Foxm1 before labour onset (P<0.0001). The observed acute, pre-labour changes in the expression of the 31 genes during gestation warrant further investigation to elucidate their role in pregnancy. PMID:24391823
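
    The selection rule quoted in the abstract above (>2-fold change, <1% FDR) can be written as a small filter. The sketch below assumes that each gene already has a fold-change ratio and an FDR-adjusted p-value; the function names and the input layout are hypothetical and are not taken from the study's pipeline.

```python
import math
from typing import Dict, List, Tuple


def is_differentially_expressed(fold_change: float, fdr: float,
                                min_fold: float = 2.0,
                                max_fdr: float = 0.01) -> bool:
    """True if the change exceeds min_fold in either direction and FDR < max_fdr.

    fold_change is a positive ratio (later stage / earlier stage), so a
    ratio of 0.25 counts as a 4-fold decrease.
    """
    return abs(math.log2(fold_change)) > math.log2(min_fold) and fdr < max_fdr


def split_by_direction(table: Dict[str, Tuple[float, float]]) -> Tuple[List[str], List[str]]:
    """Return (increased, decreased) gene names from {gene: (fold_change, fdr)}."""
    up, down = [], []
    for gene, (fold_change, fdr) in table.items():
        if is_differentially_expressed(fold_change, fdr):
            (up if fold_change > 1 else down).append(gene)
    return sorted(up), sorted(down)
```

    Working on the log2 scale treats increases and decreases symmetrically, which is why a single threshold covers both directions.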

  7. Identification of reference genes for quantitative expression analysis using large-scale RNA-seq data of Arabidopsis thaliana and model crop plants.

    PubMed

    Kudo, Toru; Sasaki, Yohei; Terashima, Shin; Matsuda-Imai, Noriko; Takano, Tomoyuki; Saito, Misa; Kanno, Maasa; Ozaki, Soichi; Suwabe, Keita; Suzuki, Go; Watanabe, Masao; Matsuoka, Makoto; Takayama, Seiji; Yano, Kentaro

    2016-10-13

    In quantitative gene expression analysis, normalization using a reference gene as an internal control is frequently performed for appropriate interpretation of the results. Efforts have been devoted to exploring superior novel reference genes using microarray transcriptomic data and to evaluating commonly used reference genes by targeting analysis. However, because the number of specifically detectable genes is totally dependent on probe design in the microarray analysis, exploration using microarray data may miss some of the best choices for the reference genes. Recently emerging RNA sequencing (RNA-seq) provides an ideal resource for comprehensive exploration of reference genes since this method is capable of detecting all expressed genes, in principle including even unknown genes. We report the results of a comprehensive exploration of reference genes using public RNA-seq data from plants such as Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Solanum lycopersicum (tomato) and Oryza sativa (rice). To select reference genes suitable for the broadest experimental conditions possible, candidates were surveyed by the following four steps: (1) evaluation of the basal expression level of each gene in each experiment; (2) evaluation of the expression stability of each gene in each experiment; (3) evaluation of the expression stability of each gene across the experiments; and (4) selection of top-ranked genes, after ranking according to the number of experiments in which the gene was expressed stably. Employing this procedure, 13, 10, 12 and 21 top candidates for reference genes were proposed in Arabidopsis, soybean, tomato and rice, respectively. Microarray expression data confirmed that the expression of the proposed reference genes under broad experimental conditions was more stable than that of commonly used reference genes. 
These novel reference genes will be useful for analyzing gene expression profiles across experiments carried out under various experimental conditions.
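
    The four-step survey described in the abstract above can be sketched in a few lines. This is only an illustration under assumptions: basal expression is judged by the mean, stability by the coefficient of variation (CV), and the thresholds, the tie-breaking rule and all names are hypothetical rather than those used by the authors.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def cv(values: List[float]) -> float:
    """Coefficient of variation; infinite when the mean is zero."""
    m = mean(values)
    return float("inf") if m == 0 else pstdev(values) / m


def stable_in_experiment(values: List[float], min_mean: float = 10.0,
                         max_cv: float = 0.2) -> bool:
    """Steps 1 and 2: sufficiently expressed and stable within one experiment."""
    return mean(values) >= min_mean and cv(values) <= max_cv


def rank_reference_candidates(data: Dict[str, Dict[str, List[float]]],
                              top_n: int = 3) -> List[Tuple[str, int]]:
    """data maps gene -> experiment -> expression values.

    Step 3 uses the CV of per-experiment means as a tie-breaker;
    step 4 ranks genes by the number of experiments in which they are stable.
    """
    scored = []
    for gene, experiments in data.items():
        n_stable = sum(stable_in_experiment(v) for v in experiments.values())
        across = cv([mean(v) for v in experiments.values()])
        scored.append((gene, n_stable, across))
    scored.sort(key=lambda item: (-item[1], item[2], item[0]))
    return [(gene, n_stable) for gene, n_stable, _ in scored[:top_n]]
```

    Ranking by the count of stable experiments, rather than by one pooled variance, favours genes that behave well under the broadest range of conditions, which is the stated goal of the survey.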

  8. Analysis of the Nicotiana tabacum Stigma/Style Transcriptome Reveals Gene Expression Differences between Wet and Dry Stigma Species

    PubMed Central

    Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.

    2009-01-01

    The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150

  9. Identification of human circadian genes based on time course gene expression profiles by using a deep learning method.

    PubMed

    Cui, Peng; Zhong, Tingyan; Wang, Zhuo; Wang, Tao; Zhao, Hongyu; Liu, Chenglin; Lu, Hui

    2018-06-01

    Circadian genes express periodically in an approximate 24-h period and the identification and study of these genes can provide deep understanding of the circadian control which plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN) - a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. Firstly, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods including k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large scale circadian gene detection using the trained DNN model for the remaining transcription profiles. Comparing with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood. 
Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological control. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang. Copyright © 2017. Published by Elsevier B.V.
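
    The first step mentioned in the abstract above, turning a time course into categorical states that denote the changing trend, might look like the sketch below. The three-state encoding (1 for rising, 0 for flat, -1 for falling) and the tolerance parameter are assumptions made for illustration; the abstract does not specify the encoding the authors used.

```python
from typing import List


def to_trend_states(series: List[float], tol: float = 0.0) -> List[int]:
    """Encode each consecutive pair of time points as 1, 0 or -1.

    A change whose absolute value is at most tol is treated as flat (0).
    The result has one element fewer than the input series.
    """
    states = []
    for previous, current in zip(series, series[1:]):
        delta = current - previous
        if delta > tol:
            states.append(1)
        elif delta < -tol:
            states.append(-1)
        else:
            states.append(0)
    return states
```

    An encoding of this kind discards absolute expression levels and keeps only the shape of the profile, so genes with very different magnitudes can be compared on their rhythm alone.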

  10. Evolution of Sulfobacillus thermosulfidooxidans secreting alginate during bioleaching of chalcopyrite concentrate.

    PubMed

    Yu, R-L; Liu, A; Liu, Y; Yu, Z; Peng, T; Wu, X; Shen, L; Liu, Y; Li, J; Liu, X; Qiu, G; Chen, M; Zeng, W

    2017-06-01

    To explore the distribution pattern of alginate on the chalcopyrite concentrate surface during bioleaching. The evolution of Sulfobacillus thermosulfidooxidans secreting alginate during bioleaching of chalcopyrite concentrate was investigated through gas chromatography coupled with mass spectrometry (GC-MS) and confocal laser scanning microscope (CLSM), and the critical synthetic genes (algA, algC, algD) of alginate were analysed by real-time polymerase chain reaction (RT-PCR). The GC-MS analysis results indicated that a small amount of alginate formed on the mineral surface at the early stage, increased substantially to its maximum at the intermediate stage, and then remained stable at the end stage. The CLSM analysis of chalcopyrite slice showed the same variation trend of alginate content on the mineral surface. Furthermore, the RT-PCR results showed that during the early stage of bioleaching, the algA, algC and algD genes were all overexpressed. However, at the final stage, algD gene expression decreased substantially, and algA and algC expression decreased slightly. This expression pattern was attributed to the fact that the algA and algC genes were involved in several biosynthesis reactions, whereas the algD gene only participated in alginate biosynthesis and was considered the key gene controlling alginate synthesis. The content of alginate on the mineral surface increased substantially at the beginning of bioleaching, and remained stable at the end of bioleaching due to the restriction of algD gene expression. Our findings provide valuable information to explore the relationship between alginate formation and bioleaching of chalcopyrite. © 2017 The Society for Applied Microbiology.

  11. A combinatorial code for pattern formation in Drosophila oogenesis.

    PubMed

    Yakoby, Nir; Bristow, Christopher A; Gong, Danielle; Schafer, Xenia; Lembong, Jessica; Zartman, Jeremiah J; Halfon, Marc S; Schüpbach, Trudi; Shvartsman, Stanislav Y

    2008-11-01

    Two-dimensional patterning of the follicular epithelium in Drosophila oogenesis is required for the formation of three-dimensional eggshell structures. Our analysis of a large number of published gene expression patterns in the follicle cells suggests that they follow a simple combinatorial code based on six spatial building blocks and the operations of union, difference, intersection, and addition. The building blocks are related to the distribution of inductive signals, provided by the highly conserved epidermal growth factor receptor and bone morphogenetic protein signaling pathways. We demonstrate the validity of the code by testing it against a set of patterns obtained in a large-scale transcriptional profiling experiment. Using the proposed code, we distinguish 36 distinct patterns for 81 genes expressed in the follicular epithelium and characterize their joint dynamics over four stages of oogenesis. The proposed combinatorial framework allows systematic analysis of the diversity and dynamics of two-dimensional transcriptional patterns and guides future studies of gene regulation.
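    The combinatorial idea in this abstract can be illustrated with ordinary set operations. The sketch below is only a toy: the 4x4 grid, the three block names and their shapes are hypothetical stand-ins, not the six building blocks defined by the authors, and "addition" is assumed here to mean summing expression levels where blocks overlap.

```python
from collections import Counter

def rect(rows, cols):
    """Return the set of (row, col) cells covered by a rectangular block."""
    return {(r, c) for r in rows for c in cols}

# Hypothetical building blocks on a toy 4x4 epithelium grid.
anterior = rect(range(0, 2), range(4))   # top two rows (8 cells)
dorsal = rect(range(4), range(0, 2))     # left two columns (8 cells)
midline = rect(range(4), range(1, 2))    # one column (4 cells)

# Composite patterns built with the operations named in the abstract.
union_pattern = anterior | dorsal          # expressed in either block
intersection_pattern = anterior & dorsal   # expressed only where both overlap
difference_pattern = dorsal - midline      # dorsal block with the midline removed

# Assumed reading of "addition": per-cell levels add up where blocks overlap.
added_levels = Counter(anterior) + Counter(dorsal)

print(len(union_pattern), len(intersection_pattern), len(difference_pattern))
```

    Representing each pattern as a set of cells keeps the three Boolean operations one-liners, while the Counter gives a graded level per cell for the additive case.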

  12. Cloning and heterologous expression of genes from the kinamycin biosynthetic pathway of Streptomyces murayamaensis.

    PubMed

    Gould, S J; Hong, S T; Carney, J R

    1998-01-01

    The genes for most of the biosynthesis of the kinamycin antibiotics have been cloned and heterologously expressed. Genomic DNA of Streptomyces murayamaensis was partially digested with MboI and a library of approximately 40 kb fragments in E. coli XL1-BlueMR was prepared using the cosmid vector pOJ446. Hybridization with the actI probe from the actinorhodin polyketide synthase genes identified two clusters of polyketide genes. After transferal of these clusters to S. lividans ZX7, expression of one cluster was established by HPLC with photodiode array detection. Peaks were identified from the kin cluster for dehydrorabelomycin, kinobscurinone, and stealthin C, which are known intermediates in kinamycin biosynthesis. Two shunt metabolites, kinafluorenone and seongomycin were also identified. The structure of the latter was determined from a quantity obtained from large-scale fermentation of one of the clones.

  13. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    PubMed Central

    Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin

    2018-01-01

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
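    For readers unfamiliar with Markov Clustering, the following is a minimal single-machine sketch of the textbook MCL loop (expansion by matrix squaring, then inflation by element-wise powering and column renormalization). It is not HipMCL and does none of its distributed-memory work; the function names, the dense list-of-lists matrix, and the default inflation value of 2.0 are illustrative assumptions.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])

def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster a small undirected graph given as a dense 0/1 adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, as is customary for MCL, then make columns stochastic.
    m = normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that keeps non-zero mass defines a cluster: the columns it attracts.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)

# Two disconnected triangles: nodes 0-2 and nodes 3-5.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(graph))
```

    The dense representation costs O(n^3) time per expansion and O(n^2) memory, which is the kind of running-time and memory bottleneck the abstract says motivates a sparse, distributed implementation.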

  14. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    DOE PAGES

    Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...

    2018-01-05

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.

  15. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.

  16. High-Throughput Screening Using iPSC-Derived Neuronal Progenitors to Identify Compounds Counteracting Epigenetic Gene Silencing in Fragile X Syndrome.

    PubMed

    Kaufmann, Markus; Schuffenhauer, Ansgar; Fruh, Isabelle; Klein, Jessica; Thiemeyer, Anke; Rigo, Pierre; Gomez-Mancilla, Baltazar; Heidinger-Millot, Valerie; Bouwmeester, Tewis; Schopfer, Ulrich; Mueller, Matthias; Fodor, Barna D; Cobos-Correa, Amanda

    2015-10-01

    Fragile X syndrome (FXS) is the most common form of inherited mental retardation, and it is caused in most cases by epigenetic silencing of the Fmr1 gene. Today, no specific therapy exists for FXS, and current treatments are directed only at improving behavioral symptoms. Neuronal progenitors derived from FXS patient induced pluripotent stem cells (iPSCs) represent a unique model to study the disease and develop assays for large-scale drug discovery screens, since they retain the silenced Fmr1 gene within the disease context. We have established a high-content imaging assay to run a large-scale phenotypic screen aimed at identifying compounds that reactivate the silenced Fmr1 gene. A set of 50,000 compounds was tested, including modulators of several epigenetic targets. We describe an integrated drug discovery model comprising iPSC generation, culture scale-up, quality control, and screening with a very sensitive high-content imaging assay assisted by single-cell image analysis and multiparametric data analysis based on machine learning algorithms. The screening identified several compounds that induced weak expression of fragile X mental retardation protein (FMRP) and thus sets the basis for further large-scale screens to find candidate drugs or targets tackling the underlying mechanism of FXS with potential for therapeutic intervention. © 2015 Society for Laboratory Automation and Screening.

  17. Evolution of a Cellular Immune Response in Drosophila: A Phenotypic and Genomic Comparative Analysis

    PubMed Central

    Salazar-Jaramillo, Laura; Paspati, Angeliki; van de Zande, Louis; Vermeulen, Cornelis Joseph; Schwander, Tanja; Wertheim, Bregje

    2014-01-01

    Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila. PMID:24443439

  18. Evolution of a cellular immune response in Drosophila: a phenotypic and genomic comparative analysis.

    PubMed

    Salazar-Jaramillo, Laura; Paspati, Angeliki; van de Zande, Louis; Vermeulen, Cornelis Joseph; Schwander, Tanja; Wertheim, Bregje

    2014-02-01

    Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila.

  19. Evaluation of Bias-Variance Trade-Off for Commonly Used Post-Summarizing Normalization Procedures in Large-Scale Gene Expression Studies

    PubMed Central

    Qiu, Xing; Hu, Rui; Wu, Zhixin

    2014-01-01

    Normalization procedures are widely used in high-throughput genomic data analyses to remove various kinds of technological noise and variation. They are known to have a profound impact on the subsequent differential gene expression analysis. Although there has been some research evaluating different normalization procedures, few attempts have been made to systematically evaluate the gene detection performance of normalization procedures from the bias-variance trade-off point of view, especially with strong gene differentiation effects and large sample sizes. In this paper, we conduct a thorough study to evaluate the effects of normalization procedures combined with several commonly used statistical tests and multiple testing procedures (MTPs) under different configurations of effect size and sample size. We conduct a theoretical evaluation based on a random effect model, as well as simulation and biological data analyses to verify the results. Based on our findings, we provide some practical guidance for selecting a suitable normalization procedure under different scenarios. PMID:24941114

  20. Metadata Analysis of Phanerochaete chrysosporium Gene Expression Data Identified Common CAZymes Encoding Gene Expression Profiles Involved in Cellulose and Hemicellulose Degradation.

    PubMed

    Kameshwar, Ayyappa Kumar Sista; Qin, Wensheng

    2017-01-01

    In the literature, the popular wood-degrading white rot fungus Phanerochaete chrysosporium has been studied far more extensively for its lignin-degrading mechanisms than for its cellulose- and hemicellulose-degrading abilities. This study delineates cellulose- and hemicellulose-degrading mechanisms through large-scale metadata analysis of P. chrysosporium gene expression data (retrieved from NCBI GEO) to understand the common expression patterns of differentially expressed genes when the fungus is cultured on different growth substrates. Genes encoding glycoside hydrolase classes commonly expressed during breakdown of cellulose (GH-5,6,7,9,44,45,48) and hemicellulose (GH-2,8,10,11,26,30,43,47) were found to be highly expressed across varied growth conditions, including simple customized and complex natural plant biomass growth media. Genes encoding carbohydrate esterase class enzymes CE (1,4,8,9,15,16), polysaccharide lyase class enzymes PL-8 and PL-14, and glycosyl transferase classes GT (1,2,4,8,15,20,35,39,48) were differentially expressed in natural plant biomass growth media. Based on these results, P. chrysosporium grown on natural plant biomass substrates was found to express lignin- and hemicellulose-degrading enzymes more than cellulolytic enzymes, except GH-61 (LPMO) class enzymes, in the early stages. The P. chrysosporium transcriptome was observed to be significantly affected by the wood substrate provided. We believe that the gene expression findings of this study can play a crucial role in developing genetically efficient microbes with effective cellulose and hemicellulose degradation abilities.

  1. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies

    PubMed Central

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, open-source tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets. PMID:25937948
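    The figures quoted in this abstract support a quick back-of-envelope estimate. The sketch below assumes that the $3.50 average and the 6 to 8 hour range apply uniformly to all 178 samples, and that samples are processed in equal batches across a hypothetical number of parallel workers; the abstract reports neither the worker count nor the total bill.

```python
import math

SAMPLES = 178                  # from the abstract
COST_PER_SAMPLE_USD = 3.50     # average cost, from the abstract
HOURS_PER_SAMPLE = (6, 8)      # best and worst case, from the abstract

# Total cost, assuming the average holds for every sample.
total_cost_usd = SAMPLES * COST_PER_SAMPLE_USD

# Aggregate compute time if every sample took 6 h (low) or 8 h (high).
compute_hours = tuple(SAMPLES * h for h in HOURS_PER_SAMPLE)

def wall_clock_hours(workers):
    """Elapsed time when `workers` samples run at once (hypothetical setup)."""
    batches = math.ceil(SAMPLES / workers)
    return tuple(batches * h for h in HOURS_PER_SAMPLE)

print(total_cost_usd)            # 623.0
print(compute_hours)             # (1068, 1424)
print(wall_clock_hours(10))      # (108, 144)
print(wall_clock_hours(178))     # (6, 8)
```

    Under these assumptions the whole study costs about $623, and elapsed time depends almost entirely on how many samples run concurrently.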

  2. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies.

    PubMed

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, open-source tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets.

  3. Microbial forensics: predicting phenotypic characteristics and environmental conditions from large-scale gene expression profiles.

    PubMed

    Kim, Minseung; Zorraquino, Violeta; Tagkopoulos, Ilias

    2015-03-01

    A tantalizing question in cellular physiology is whether the cellular state and environmental conditions can be inferred from the expression signature of an organism. To investigate this relationship, we created an extensive normalized gene expression compendium for the bacterium Escherichia coli that was further enriched with meta-information through an iterative learning procedure. We then constructed an ensemble method to predict environmental and cellular state, including strain, growth phase, medium, oxygen level, antibiotic and carbon source presence. Results show that gene expression is an excellent predictor of environmental structure, with multi-class ensemble models achieving balanced accuracy between 70.0% (±3.5%) and 98.3% (±2.3%) for the various characteristics. Interestingly, this performance can be significantly boosted when environmental and strain characteristics are simultaneously considered, as a composite classifier that captures the inter-dependencies of three characteristics (medium, phase and strain) achieved 10.6% (±1.0%) higher performance than any individual model. Contrary to expectations, only 59% of the top informative genes were also identified as differentially expressed under the respective conditions. Functional analysis of the respective genetic signatures implicates a wide spectrum of Gene Ontology terms and KEGG pathways with condition-specific information content, including iron transport, transferases, and enterobactin synthesis. Further experimental phenotypic-to-genotypic mapping that we conducted for knock-out mutants argues for the information content of top-ranked genes. This work demonstrates the degree to which genome-scale transcriptional information can be predictive of latent, heterogeneous and seemingly disparate phenotypic and environmental characteristics, with far-reaching applications.
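    The abstract reports balanced accuracy without defining it. A minimal sketch of the standard definition (the mean of per-class recall, which is also what scikit-learn's balanced_accuracy_score computes by default) is given below; the function and the toy labels are illustrative and are not taken from the paper.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes weigh as much as common ones."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

# Toy labels: an imbalanced oxygen-level task where the model always says "aerobic".
y_true = ["aerobic"] * 8 + ["anaerobic"] * 2
y_pred = ["aerobic"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

    The toy case shows why the metric suits an imbalanced compendium: a classifier that ignores the minority class scores 0.8 on plain accuracy but only 0.5 on balanced accuracy.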

  4. The Use of Mouse Models to Study Epigenetics

    PubMed Central

    Blewitt, Marnie; Whitelaw, Emma

    2013-01-01

    Much of what we know about the role of epigenetics in the determination of phenotype has come from studies of inbred mice. Some unusual expression patterns arising from endogenous and transgenic murine alleles, such as the Agouti coat color alleles, have allowed the study of variegation, variable expressivity, transgenerational epigenetic inheritance, parent-of-origin effects, and position effects. These phenomena have taught us much about gene silencing and the probabilistic nature of epigenetic processes. Based on some of these alleles, large-scale mutagenesis screens have broadened our knowledge of epigenetic control by identifying and characterizing novel genes involved in these processes. PMID:24186070

  5. TLM-Quant: An Open-Source Pipeline for Visualization and Quantification of Gene Expression Heterogeneity in Growing Microbial Cells

    PubMed Central

    Piersma, Sjouke; Denham, Emma L.; Drulhe, Samuel; Tonk, Rudi H. J.; Schwikowski, Benno; van Dijl, Jan Maarten

    2013-01-01

    Gene expression heterogeneity is a key driver for microbial adaptation to fluctuating environmental conditions, cell differentiation and the evolution of species. This phenomenon has therefore enormous implications, not only for life in general, but also for biotechnological applications where unwanted subpopulations of non-producing cells can emerge in large-scale fermentations. Only time-lapse fluorescence microscopy allows real-time measurements of gene expression heterogeneity. A major limitation in the analysis of time-lapse microscopy data is the lack of fast, cost-effective, open, simple and adaptable protocols. Here we describe TLM-Quant, a semi-automatic pipeline for the analysis of time-lapse fluorescence microscopy data that enables the user to visualize and quantify gene expression heterogeneity. Importantly, our pipeline builds on the open-source packages ImageJ and R. To validate TLM-Quant, we selected three possible scenarios, namely homogeneous expression, highly ‘noisy’ heterogeneous expression, and bistable heterogeneous expression in the Gram-positive bacterium Bacillus subtilis. This bacterium is both a paradigm for systems-level studies on gene expression and a highly appreciated biotechnological ‘cell factory’. We conclude that the temporal resolution of such analyses with TLM-Quant is only limited by the numbers of recorded images. PMID:23874729

  6. Rapid Y degeneration and dosage compensation in plant sex chromosomes

    PubMed Central

    Papadopulos, Alexander S. T.; Chester, Michael; Ridout, Kate; Filatov, Dmitry A.

    2015-01-01

    The nonrecombining regions of animal Y chromosomes are known to undergo genetic degeneration, but previous work has failed to reveal large-scale gene degeneration on plant Y chromosomes. Here, we uncover rapid and extensive degeneration of Y-linked genes in a plant species, Silene latifolia, that evolved sex chromosomes de novo in the last 10 million years. Previous transcriptome-based studies of this species missed unexpressed, degenerate Y-linked genes. To identify sex-linked genes, regardless of their expression, we sequenced male and female genomes of S. latifolia and integrated the genomic contigs with a high-density genetic map. This revealed that 45% of Y-linked genes are not expressed, and 23% are interrupted by premature stop codons. This contrasts with X-linked genes, in which only 1.3% of genes contained stop codons and 4.3% of genes were not expressed in males. Loss of functional Y-linked genes is partly compensated for by gene-specific up-regulation of X-linked genes. Our results demonstrate that the rate of genetic degeneration of Y-linked genes in S. latifolia is as fast as in animals, and that the evolutionary trajectories of sex chromosomes are similar in the two kingdoms. PMID:26438872

  7. Large-scale identification of wheat genes resistant to cereal cyst nematode Heterodera avenae using comparative transcriptomic analysis.

    PubMed

    Kong, Ling-An; Wu, Du-Qing; Huang, Wen-Kun; Peng, Huan; Wang, Gao-Feng; Cui, Jiang-Kuan; Liu, Shi-Ming; Li, Zhi-Gang; Yang, Jun; Peng, De-Liang

    2015-10-16

    Cereal cyst nematode Heterodera avenae, an important soil-borne pathogen of wheat, causes substantial annual yield losses worldwide, and use of resistant cultivars is the best strategy for control. However, target genes are not readily available for breeding resistant cultivars. Therefore, comparative transcriptomic analyses were performed to identify more applicable resistance genes for cultivar breeding. The developing nematodes within roots were stained with acid fuchsin solution. Transcriptome assemblies and redundancy filtration were obtained by Trinity, TGI Clustering Tool and BLASTN, respectively. Gene Ontology annotation was generated with the Blast2GO program, and metabolic pathways of transcripts were analyzed with Path_finder. The ROS levels were determined by luminol-chemiluminescence assay. The transcriptional gene expression profiles were obtained by quantitative RT-PCR. RNA sequencing was performed using an incompatible wheat cultivar, VP1620, and a compatible control cultivar, WEN19, infected with H. avenae at 24 h, 3 d and 8 d. Infection assays showed that VP1620 failed to block penetration of H. avenae but disturbed the transition of developmental stages, leading to a significant reduction in cyst formation. Two types of expression profiles were established to predict candidate resistance genes after developing a novel strategy to generate clean RNA-seq data by removing the transcripts of H. avenae from the raw data before assembly. Using the uncoordinated expression profiles with transcript abundance as a standard, 424 candidate resistance genes were identified, including 302 overlapping genes and 122 VP1620-specific genes. Genes with similar expression patterns were further classified according to the scales of changed transcript abundances, and 182 genes were rescued as supplementary candidate resistance genes. Functional characterizations revealed that diverse defense-related pathways were responsible for wheat resistance against H. avenae. Moreover, phospholipase was involved in many defense-related pathways and localized in the connection position. Furthermore, strong bursts of reactive oxygen species (ROS) within VP1620 roots infected with H. avenae were induced at 24 h and 3 d, and eight ROS-producing genes were significantly upregulated, including three class III peroxidase and five lipoxygenase genes. Large-scale identification of wheat resistance genes was performed by comparative transcriptomic analysis. Functional characterization showed that phospholipases associated with ROS production played vital roles in early defense responses to H. avenae via involvement in diverse defense-related pathways as a hub switch. This study, the first to investigate the early defense responses of wheat against H. avenae, not only provides applicable candidate resistance genes for breeding novel wheat cultivars but also enables a better understanding of the defense mechanisms of wheat against H. avenae.

  8. Gene expression analysis of flax seed development

    PubMed Central

    2011-01-01

    Background Flax, Linum usitatissimum L., is an important crop whose seed oil and stem fiber have multiple industrial applications. Flax seeds are also well-known for their nutritional attributes, viz., omega-3 fatty acids in the oil and lignans and mucilage from the seed coat. In spite of the importance of this crop, there are few molecular resources that can be utilized toward improving seed traits. Here, we describe flax embryo and seed development and generation of comprehensive genomic resources for the flax seed. Results We describe a large-scale generation and analysis of expressed sequences in various tissues. Collectively, the 13 libraries we have used provide a broad representation of genes active in developing embryos (globular, heart, torpedo, cotyledon and mature stages) seed coats (globular and torpedo stages) and endosperm (pooled globular to torpedo stages) and genes expressed in flowers, etiolated seedlings, leaves, and stem tissue. A total of 261,272 expressed sequence tags (EST) (GenBank accessions LIBEST_026995 to LIBEST_027011) were generated. These EST libraries included transcription factor genes that are typically expressed at low levels, indicating that the depth is adequate for in silico expression analysis. Assembly of the ESTs resulted in 30,640 unigenes and 82% of these could be identified on the basis of homology to known and hypothetical genes from other plants. When compared with fully sequenced plant genomes, the flax unigenes resembled poplar and castor bean more than grape, sorghum, rice or Arabidopsis. Nearly one-fifth of these (5,152) had no homologs in sequences reported for any organism, suggesting that this category represents genes that are likely unique to flax. Digital analyses revealed gene expression dynamics for the biosynthesis of a number of important seed constituents during seed development. 
Conclusions We have developed a foundational database of expressed sequences and collection of plasmid clones that comprise even low-expressed genes such as those encoding transcription factors. This has allowed us to delineate the spatio-temporal aspects of gene expression underlying the biosynthesis of a number of important seed constituents in flax. Flax belongs to a taxonomic group of diverse plants and the large sequence database will allow for evolutionary studies as well. PMID:21529361

  9. Expression profiling reveals distinct sets of genes altered during induction and regression of cardiac hypertrophy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Friddle, Carl J; Koga, Teiichiro; Rubin, Edward M.

    2000-03-15

    While cardiac hypertrophy has been the subject of intensive investigation, regression of hypertrophy has been significantly less studied, precluding large-scale analysis of the relationship between these processes. In the present study, using pharmacological models of hypertrophy in mice, expression profiling was performed with fragments of more than 3,000 genes to characterize and contrast expression changes during induction and regression of hypertrophy. Administration of angiotensin II and isoproterenol by osmotic minipump produced increases in heart weight (15% and 40% respectively) that returned to pre-induction size following drug withdrawal. From multiple expression analyses of left ventricular RNA isolated at daily time-points during cardiac hypertrophy and regression, we identified sets of genes whose expression was altered at specific stages of this process. While confirming the participation of 25 genes or pathways previously known to be altered by hypertrophy, a larger set of 30 genes was identified whose expression had not previously been associated with cardiac hypertrophy or regression. Of the 55 genes that showed reproducible changes during the time course of induction and regression, 32 genes were altered only during induction and 8 were altered only during regression. This study identified both known and novel genes whose expression is affected at different stages of cardiac hypertrophy and regression and demonstrates that cardiac remodeling during regression utilizes a set of genes that are distinct from those used during induction of hypertrophy.
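    The gene counts in this abstract can be cross-checked with a few lines of arithmetic. The sketch assumes that each of the 55 genes falls into exactly one of three groups (induction only, regression only, or both); the abstract states the first two counts, and the third is inferred here rather than reported.

```python
# Counts stated in the abstract.
previously_known = 25
newly_associated = 30
induction_only = 32
regression_only = 8

# The two discovery categories should add up to the 55 reproducible genes.
total_genes = previously_known + newly_associated

# Inferred, not reported: genes altered during both induction and regression.
altered_in_both = total_genes - induction_only - regression_only

print(total_genes)      # 55
print(altered_in_both)  # 15
```

    Under that assumption, 15 of the 55 genes change during both phases, while the remaining 40 are phase-specific, which is consistent with the abstract's conclusion that regression uses a largely distinct gene set.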

  10. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

    PubMed Central

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

    2012-01-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606

  11. Adaptive Mutations in RNA Polymerase and the Transcriptional Terminator Rho Have Similar Effects on Escherichia coli Gene Expression.

    PubMed

    González-González, Andrea; Hug, Shaun M; Rodríguez-Verdugo, Alejandra; Patel, Jagdish Suresh; Gaut, Brandon S

    2017-11-01

    Modifications to transcriptional regulators play a major role in adaptation. Here, we compared the effects of multiple beneficial mutations within and between Escherichia coli rpoB, the gene encoding the RNA polymerase β subunit, and rho, which encodes a transcriptional terminator. These two genes have harbored adaptive mutations in numerous E. coli evolution experiments but particularly in our previous large-scale thermal stress experiment, where the two genes characterized alternative adaptive pathways. To compare the effects of beneficial mutations, we engineered four advantageous mutations into each of the two genes and measured their effects on fitness, growth, gene expression and transcriptional termination at 42.2 °C. Among the eight mutations, two rho mutations had no detectable effect on relative fitness, suggesting they were beneficial only in the context of epistatic interactions. The remaining six mutations had an average relative fitness benefit of ∼20%. The rpoB mutations affected the expression of ∼1,700 genes; rho mutations affected the expression of fewer genes but most (83%) were a subset of those altered by rpoB mutants. Across the eight mutants, relative fitness correlated with the degree to which a mutation restored gene expression back to the unstressed, 37.0 °C state. The beneficial mutations in the two genes did not have identical effects on fitness, growth or gene expression, but they caused parallel phenotypic effects on gene expression and genome-wide transcriptional termination. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. Transcriptome Analysis of the Differentially Expressed Genes in the Male and Female Shrub Willows (Salix suchowensis)

    PubMed Central

    Liu, Jingjing; Yin, Tongming; Ye, Ning; Chen, Yingnan; Yin, Tingting; Liu, Min; Hassani, Danial

    2013-01-01

    Background The dioecious system is relatively rare in plants. Shrub willow is an annual flowering dioecious woody plant, and possesses many characteristics that make it a good model for tracking the missing pieces of sex determination evolution. To gain a global view of the genes differentially expressed in the male and female shrub willows and to develop a database for further studies, we performed large-scale transcriptome sequencing of flower buds collected separately from the two sexes. Results In total, 1,201,931 high-quality reads were obtained, with an average length of 389 bp and a total length of 467.96 Mb. The ESTs were assembled into 29,048 contigs and 132,709 singletons. These unigenes were further functionally annotated by comparing their sequences to different protein and functional domain databases and were assigned Gene Ontology (GO) terms. A biochemical pathway database containing 291 predicted pathways was also created based on the annotations of the unigenes. Digital expression analysis identified 806 differentially expressed genes between the male and female flower buds. Of these, 33 are located on the incipient sex chromosome of Salicaceae, among which 12 genes might, on empirical grounds, be involved in plant sex determination. These genes merit special attention in future studies. Conclusions In this study, a large number of EST sequences were generated from the flower buds of a male and a female shrub willow. We also reported the differentially expressed genes between the two sex-type flowers. This work provides valuable information and sequence resources for uncovering the sex determining genes and for future functional genomics analysis of Salicaceae spp. PMID:23560075

  13. A Modified ABCDE Model of Flowering in Orchids Based on Gene Expression Profiling Studies of the Moth Orchid Phalaenopsis aphrodite

    PubMed Central

    Lee, Ann-Ying; Chen, Chun-Yi; Chang, Yao-Chien Alex; Chao, Ya-Ting; Shih, Ming-Che

    2013-01-01

    Previously we developed genomic resources for orchids, including transcriptomic analyses using next-generation sequencing techniques and construction of a web-based orchid genomic database. Here, we report a modified molecular model of flower development in the Orchidaceae based on functional analysis of gene expression profiles in Phalaenopsis aphrodite (a moth orchid) that revealed novel roles for the transcription factors involved in floral organ pattern formation. Phalaenopsis orchid floral organ-specific genes were identified by microarray analysis. Several critical transcription factors, including AP3, PI, AP1 and AGL6, displayed distinct spatial distribution patterns. Phylogenetic analysis of orchid MADS box genes was conducted to infer the evolutionary relationship among floral organ-specific genes. The results suggest that duplication of MADS box genes in orchids may have resulted in their gaining novel functions during evolution. Based on these analyses, a modified model of orchid flowering was proposed. Comparison of the expression profiles of flowers of a peloric mutant and wild-type Phalaenopsis orchid further identified genes associated with lip morphology and peloric effects. Large-scale investigation of gene expression profiles revealed that homeotic genes from the ABCDE model of flower development classes A and B in the Phalaenopsis orchid have novel functions due to evolutionary diversification, and display differential expression patterns. PMID:24265826

  14. Integrative analysis of RUNX1 downstream pathways and target genes

    PubMed Central

    Michaud, Joëlle; Simpson, Ken M; Escher, Robert; Buchet-Poyau, Karine; Beissbarth, Tim; Carmichael, Catherine; Ritchie, Matthew E; Schütz, Frédéric; Cannon, Ping; Liu, Marjorie; Shen, Xiaofeng; Ito, Yoshiaki; Raskind, Wendy H; Horwitz, Marshall S; Osato, Motomi; Turner, David R; Speed, Terence P; Kavallaris, Maria; Smyth, Gordon K; Scott, Hamish S

    2008-01-01

    Background The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. Results Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFβ, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFβ. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. Conclusion This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease progression in both familial and sporadic leukemia as well as therapeutic implications. PMID:18671852

  15. GECKO: a complete large-scale gene expression analysis platform.

    PubMed

    Theilhaber, Joachim; Ulyanov, Anatoly; Malanthara, Anish; Cole, Jack; Xu, Dapeng; Nahf, Robert; Heuer, Michael; Brockel, Christoph; Bushnell, Steven

    2004-12-10

    Gecko (Gene Expression: Computation and Knowledge Organization) is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community. Based on a client-server architecture, with a centralized repository of typically many tens of thousands of Affymetrix scans, Gecko includes automatic processing pipelines for uploading data from remote sites, a data base, a computational engine implementing approximately 50 different analysis tools, and a client application. Among available analysis tools are clustering methods, principal component analysis, supervised classification including feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. On account of its open architecture, Gecko also allows for the integration of new algorithms. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (actually, a directed acyclic graph), in which all successive results in ongoing analyses are saved. This approach has proven invaluable in allowing a large (approximately 100 users) and distributed community to share results, and to repeatedly return over a span of years to older and potentially very complex analyses of gene expression data. The Gecko system is being made publicly available as free software http://sourceforge.net/projects/geckoe. In totality or in parts, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.
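
    The Analysis Tree described in this abstract (successive results saved as nodes of a directed acyclic graph) can be illustrated with a minimal sketch. This is not Gecko's implementation: the class names, method names and result labels below are hypothetical, and the only idea taken from the abstract is that each saved result remembers which earlier results it was derived from.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisNode:
    """One saved result plus the names of the results it was computed from."""
    name: str
    result: object
    parents: list = field(default_factory=list)


class AnalysisGraph:
    """Hypothetical store of saved results forming a directed acyclic graph."""

    def __init__(self):
        self.nodes = {}

    def add(self, name, result, parents=()):
        # Parents must already be saved, so an edge can never point forward
        # and the graph stays acyclic by construction.
        if name in self.nodes:
            raise ValueError(f"result {name!r} is already saved")
        missing = [p for p in parents if p not in self.nodes]
        if missing:
            raise KeyError(f"unknown parent results: {missing}")
        self.nodes[name] = AnalysisNode(name, result, list(parents))

    def lineage(self, name):
        """Return the name of every ancestor of `name`, each exactly once."""
        seen, order = set(), []
        stack = list(self.nodes[name].parents)
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            order.append(current)
            stack.extend(self.nodes[current].parents)
        return order


# Illustrative labels only; the stored results are placeholder strings.
graph = AnalysisGraph()
graph.add("raw_scans", "uploaded scans")
graph.add("normalized", "normalized values", parents=["raw_scans"])
graph.add("clusters", "cluster labels", parents=["normalized"])
graph.add("anova", "ANOVA table", parents=["normalized"])
graph.add("report", "combined summary", parents=["clusters", "anova"])
print(sorted(graph.lineage("report")))
```

    Because a node may have more than one parent ("report" depends on both "clusters" and "anova"), the structure is a graph rather than a strict tree, which matches the abstract's own parenthetical remark.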

  16. Optimal Scaling of Digital Transcriptomes

    PubMed Central

    Glusman, Gustavo; Caballero, Juan; Robinson, Max; Kutlu, Burak; Hood, Leroy

    2013-01-01

    Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also define four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers. PMID:24223126
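
    The first evaluation metric in this abstract (counting "uniform" genes, i.e. genes whose normalized expression has a sufficiently low coefficient of variation) can be sketched as follows. The sketch assumes fixed-total scaling as the normalization, chosen only because it is the simplest to write down; the abstract reports that this method performed poorly. The function names, the target total of 1,000,000, the threshold of 0.1 and the toy counts are all assumptions, not values from the paper.

```python
from statistics import mean, pstdev


def scale_to_total(counts, target=1_000_000):
    """Scale one sample's transcript counts so that they sum to `target`."""
    total = sum(counts.values())
    return {gene: count * target / total for gene, count in counts.items()}


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else float("inf")


def count_uniform_genes(samples, cv_threshold=0.1):
    """Count genes whose CV across all samples is at most `cv_threshold`."""
    shared = set.intersection(*(set(sample) for sample in samples))
    return sum(
        1
        for gene in shared
        if coefficient_of_variation([s[gene] for s in samples]) <= cv_threshold
    )


# Toy data: the second sample is the first one sequenced at twice the depth.
raw_samples = [
    {"geneA": 100, "geneB": 50, "geneC": 850},
    {"geneA": 200, "geneB": 100, "geneC": 1700},
    {"geneA": 110, "geneB": 400, "geneC": 490},
]
normalized = [scale_to_total(sample) for sample in raw_samples]
n_uniform = count_uniform_genes(normalized)
print(n_uniform)
```

    Under this metric, a normalization is judged better when it leaves more genes uniform; comparing methods would mean swapping `scale_to_total` for another scaling function and comparing the resulting counts.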

  17. LINCS Canvas Browser: interactive web app to query, browse and interrogate LINCS L1000 gene expression signatures.

    PubMed

    Duan, Qiaonan; Flynn, Corey; Niepel, Mario; Hafner, Marc; Muhlich, Jeremy L; Fernandez, Nicolas F; Rouillard, Andrew D; Tan, Christopher M; Chen, Edward Y; Golub, Todd R; Sorger, Peter K; Subramanian, Aravind; Ma'ayan, Avi

    2014-07-01

    For the Library of Integrated Network-based Cellular Signatures (LINCS) project many gene expression signatures using the L1000 technology have been produced. The L1000 technology is a cost-effective method to profile gene expression in large scale. LINCS Canvas Browser (LCB) is an interactive HTML5 web-based software application that facilitates querying, browsing and interrogating many of the currently available LINCS L1000 data. LCB implements two compacted layered canvases, one to visualize clustered L1000 expression data, and the other to display enrichment analysis results using 30 different gene set libraries. Clicking on an experimental condition highlights gene-sets enriched for the differentially expressed genes from the selected experiment. A search interface allows users to input gene lists and query them against over 100 000 conditions to find the top matching experiments. The tool integrates many resources for an unprecedented potential for new discoveries in systems biology and systems pharmacology. The LCB application is available at http://www.maayanlab.net/LINCS/LCB. Customized versions will be made part of the http://lincscloud.org and http://lincs.hms.harvard.edu websites. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. The high-level expression of human tissue plasminogen activator in the milk of transgenic mice with hybrid gene locus strategy.

    PubMed

    Zhou, Yanrong; Lin, Yanli; Wu, Xiaojie; Xiong, Fuyin; Lv, Yuemeng; Zheng, Tao; Huang, Peitang; Chen, Hongxing

    2012-02-01

    Transgene expression for the mammary gland bioreactor aimed at producing recombinant proteins requires optimized expression vector construction. Previously we presented a hybrid gene locus strategy, which was originally tested with human lactoferrin (hLF) as the target transgene and achieved an extremely high expression level of rhLF, up to 29.8 g/l, in mouse milk. Here, to demonstrate the broad applicability of this strategy, another 38.4-kb mWAP-htPA hybrid gene locus was constructed, in which the 3-kb genomic coding sequence in the 24-kb mouse whey acidic protein (mWAP) gene locus was substituted by the 17.4-kb genomic coding sequence of human tissue plasminogen activator (htPA), exactly from the start codon to the end codon. Five corresponding transgenic mouse lines were generated, and the highest expression level of rhtPA in the milk reached 3.3 g/l. Our strategy will provide a universal way for the large-scale production of pharmaceutical proteins in the mammary gland of transgenic animals.

  19. Integrating Colon Cancer Microarray Data: Associating Locus-Specific Methylation Groups to Gene Expression-Based Classifications.

    PubMed

    Barat, Ana; Ruskin, Heather J; Byrne, Annette T; Prehn, Jochen H M

    2015-11-23

    Recently, considerable attention has been paid to gene expression-based classifications of colorectal cancers (CRC) and their association with patient prognosis. In addition to changes in gene expression, abnormal DNA-methylation is known to play an important role in cancer onset and development, and colon cancer is no exception to this rule. Large-scale technologies, such as methylation microarray assays and specific sequencing of methylated DNA, have been used to determine whole genome profiles of CpG island methylation in tissue samples. In this article, publicly available microarray-based gene expression and methylation data sets are used to characterize expression subtypes with respect to locus-specific methylation. A major objective was to determine whether integration of these data types improves previously characterized subtypes, or provides evidence for additional subtypes. We used unsupervised clustering techniques to determine methylation-based subgroups, which are subsequently annotated with three published expression-based classifications, comprising from three to six subtypes. Our results showed that, while methylation profiles provide a further basis for segregation of certain (Inflammatory and Goblet-like) finer-grained expression-based subtypes, they also suggest that other finer-grained subtypes are not distinctive and can be considered as a single subtype.

  20. Integrating Colon Cancer Microarray Data: Associating Locus-Specific Methylation Groups to Gene Expression-Based Classifications

    PubMed Central

    Barat, Ana; Ruskin, Heather J.; Byrne, Annette T.; Prehn, Jochen H. M.

    2015-01-01

    Recently, considerable attention has been paid to gene expression-based classifications of colorectal cancers (CRC) and their association with patient prognosis. In addition to changes in gene expression, abnormal DNA-methylation is known to play an important role in cancer onset and development, and colon cancer is no exception to this rule. Large-scale technologies, such as methylation microarray assays and specific sequencing of methylated DNA, have been used to determine whole genome profiles of CpG island methylation in tissue samples. In this article, publicly available microarray-based gene expression and methylation data sets are used to characterize expression subtypes with respect to locus-specific methylation. A major objective was to determine whether integration of these data types improves previously characterized subtypes, or provides evidence for additional subtypes. We used unsupervised clustering techniques to determine methylation-based subgroups, which are subsequently annotated with three published expression-based classifications, comprising from three to six subtypes. Our results showed that, while methylation profiles provide a further basis for segregation of certain (Inflammatory and Goblet-like) finer-grained expression-based subtypes, they also suggest that other finer-grained subtypes are not distinctive and can be considered as a single subtype. PMID:27600244

  1. Identifying spatially similar gene expression patterns in early stage fruit fly embryo images: binary feature versus invariant moment digital representations

    PubMed Central

    Gurunathan, Rajalakshmi; Van Emden, Bernard; Panchanathan, Sethuraman; Kumar, Sudhir

    2004-01-01

    Background Modern developmental biology relies heavily on the analysis of embryonic gene expression patterns. Investigators manually inspect hundreds or thousands of expression patterns to identify those that are spatially similar and to ultimately infer potential gene interactions. However, the rapid accumulation of gene expression pattern data over the last two decades, facilitated by high-throughput techniques, has produced a need for the development of efficient approaches for direct comparison of images, rather than their textual descriptions, to identify spatially similar expression patterns. Results The effectiveness of the Binary Feature Vector (BFV) and Invariant Moment Vector (IMV) based digital representations of the gene expression patterns in finding biologically meaningful patterns was compared for a small (226 images) and a large (1819 images) dataset. For each dataset, an ordered list of images, with respect to a query image, was generated to identify overlapping and similar gene expression patterns, in a manner comparable to what a developmental biologist might do. The results showed that the BFV representation consistently outperforms the IMV representation in finding biologically meaningful matches when spatial overlap of the gene expression pattern and the genes involved are considered. Furthermore, we explored the value of conducting image-content based searches in a dataset where individual expression components (or domains) of multi-domain expression patterns were also included separately. We found that this technique improves performance of both IMV and BFV based searches. Conclusions We conclude that the BFV representation consistently produces a more extensive and better list of biologically useful patterns than the IMV representation. The high quality of results obtained scales well as the search database becomes larger, which encourages efforts to build automated image query and retrieval systems for spatial gene expression patterns. PMID:15603586
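
    The query step described in this abstract (producing an ordered list of images with respect to a query image) can be sketched for binary feature vectors as follows. The abstract does not state which similarity measure the authors used, so the Jaccard overlap below is a stand-in assumption, and the function names, pattern names and six-element vectors are invented for illustration.

```python
def jaccard(a, b):
    """Overlap of two equal-length binary vectors (1 = expression present)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def rank_by_similarity(query, database):
    """Return (name, score) pairs ordered from most to least similar."""
    scored = [(name, jaccard(query, vector)) for name, vector in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy patterns along a six-bin axis; real images would give far longer vectors.
database = {
    "anterior_stripe": [1, 1, 0, 0, 0, 0],
    "posterior_stripe": [0, 0, 0, 0, 1, 1],
    "broad_anterior": [1, 1, 1, 0, 0, 0],
}
query = [1, 1, 0, 0, 0, 0]
ranking = rank_by_similarity(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

    A measure based on shared "on" positions rewards spatial overlap directly, which is the property the abstract emphasizes when it compares BFV with the moment-based representation.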

  2. Computerized image analysis for quantitative neuronal phenotyping in zebrafish.

    PubMed

    Liu, Tianming; Lu, Jianfeng; Wang, Ye; Campbell, William A; Huang, Ling; Zhu, Jinmin; Xia, Weiming; Wong, Stephen T C

    2006-06-15

    An integrated microscope image analysis pipeline is developed for automatic analysis and quantification of phenotypes in zebrafish with altered expression of Alzheimer's disease (AD)-linked genes. We hypothesize that a slight impairment of neuronal integrity in a large number of zebrafish carrying the mutant genotype can be detected through the computerized image analysis method. Key functionalities of our zebrafish image processing pipeline include quantification of neuron loss in zebrafish embryos due to knockdown of AD-linked genes, automatic detection of defective somites, and quantitative measurement of gene expression levels in zebrafish with altered expression of AD-linked genes or treatment with a chemical compound. These quantitative measurements enable the archival of analyzed results and relevant meta-data. The structured database is organized for statistical analysis and data modeling to better understand neuronal integrity and phenotypic changes of zebrafish under different perturbations. Our results show that the computerized analysis is comparable to manual counting with equivalent accuracy and improved efficacy and consistency. Development of such an automated data analysis pipeline represents a significant step forward to achieve accurate and reproducible quantification of neuronal phenotypes in large scale or high-throughput zebrafish imaging studies.

  3. The evolution of duplicate gene expression in mammalian organs

    PubMed Central

    Guschanski, Katerina; Warnefors, Maria; Kaessmann, Henrik

    2017-01-01

    Gene duplications generate genomic raw material that allows the emergence of novel functions, likely facilitating adaptive evolutionary innovations. However, global assessments of the functional and evolutionary relevance of duplicate genes in mammals were until recently limited by the lack of appropriate comparative data. Here, we report a large-scale study of the expression evolution of DNA-based functional gene duplicates in three major mammalian lineages (placental mammals, marsupials, egg-laying monotremes) and birds, on the basis of RNA sequencing (RNA-seq) data from nine species and eight organs. We observe dynamic changes in tissue expression preference of paralogs with different duplication ages, suggesting differential contribution of paralogs to specific organ functions during vertebrate evolution. Specifically, we show that paralogs that emerged in the common ancestor of bony vertebrates are enriched for genes with brain-specific expression and provide evidence for differential forces underlying the preferential emergence of young testis- and liver-specific expressed genes. Further analyses uncovered that the overall spatial expression profiles of gene families tend to be conserved, with several exceptions of pronounced tissue specificity shifts among lineage-specific gene family expansions. Finally, we trace new lineage-specific genes that may have contributed to the specific biology of mammalian organs, including the little-studied placenta. Overall, our study provides novel and taxonomically broad evidence for the differential contribution of duplicate genes to tissue-specific transcriptomes and for their importance for the phenotypic evolution of vertebrates. PMID:28743766

  4. Modulation of gene expression in heart and liver of hibernating black bears (Ursus americanus)

    PubMed Central

    2011-01-01

    Background Hibernation is an adaptive strategy to survive in highly seasonal or unpredictable environments. The molecular and genetic basis of hibernation physiology in mammals has only recently been studied using large-scale genomic approaches. We analyzed gene expression in the American black bear, Ursus americanus, using a custom 12,800 cDNA probe microarray to detect differences in expression that occur in heart and liver during winter hibernation in comparison to summer active animals. Results We identified 245 genes in heart and 319 genes in liver that were differentially expressed between winter and summer. The expression of 24 genes was significantly elevated during hibernation in both heart and liver. These genes are mostly involved in lipid catabolism and protein biosynthesis and include RNA binding protein motif 3 (Rbm3), which enhances protein synthesis at mildly hypothermic temperatures. Elevated expression of protein biosynthesis genes suggests induction of translation that may be related to adaptive mechanisms reducing cardiac and muscle atrophies over extended periods of low metabolism and immobility during hibernation in bears. Coordinated reduction of transcription of genes involved in amino acid catabolism suggests redirection of amino acids from catabolic pathways to protein biosynthesis. We identify transcriptional changes in the liver that are common to black bears and small mammalian hibernators, including induction of genes responsible for fatty acid β oxidation and carbohydrate synthesis and depression of genes involved in lipid biosynthesis, carbohydrate catabolism, cellular respiration and detoxification pathways. Conclusions Our findings show that modulation of gene expression during winter hibernation represents a molecular mechanism of adaptation to extreme environments. PMID:21453527

  5. Multi-tissue analysis of co-expression networks by higher-order generalized singular value decomposition identifies functionally coherent transcriptional modules.

    PubMed

    Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico

    2014-01-01

    Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods: it accurately recovered both common and condition-specific network-modules without requiring the ad-hoc input parameters needed by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interaction processes in ventricular zones during neocortex expansion and, further, we uncovered previously unappreciated pathways related to development of later cognitive functions in the cortical plate of the developing brain. Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (Bag3, Cryab, Kras, Emd, Plec), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.

  6. Large-scale atlas of microarray data reveals biological landscape of gene expression in Arabidopsis

    USDA-ARS's Scientific Manuscript database

    Transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metad...

  7. A transcriptional dynamic network during Arabidopsis thaliana pollen development.

    PubMed

    Wang, Jigang; Qiu, Xiaojie; Li, Yuhua; Deng, Youping; Shi, Tieliu

    2011-01-01

    To understand transcriptional regulatory networks (TRNs), especially the coordinated dynamic regulation between transcription factors (TFs) and their corresponding target genes during development, computational approaches would represent significant advances in genome-wide expression analysis. The major experimental challenges include monitoring time-specific TF activities and identifying the dynamic regulatory relationships between TFs and their target genes, neither of which is yet feasible at large scale. However, various methods have been proposed to estimate those activities and regulatory relationships computationally. During the past decade, significant progress has been made towards understanding pollen development at each developmental stage at the molecular level, yet the regulatory mechanisms that control the dynamic processes of pollen development remain largely unknown. Here, we adopt Network Component Analysis (NCA) to identify TF activities over a time course, and infer their regulatory relationships based on the coexpression of TFs and their target genes during pollen development. We carried out a meta-analysis by integrating several sets of gene expression data related to Arabidopsis thaliana pollen development (stages range from UNM, BCP, TCP, HP to 0.5 hr pollen tube and 4 hr pollen tube). We constructed a regulatory network including 19 TFs, 101 target genes and 319 regulatory interactions. The computationally estimated TF activities were well correlated with the expression of their coordinated genes during the development process. We clustered the expression of the target genes in the context of regulatory influences, and inferred new regulatory relationships between those TFs and their target genes; for example, transcription factor WRKY34, which was found to be specifically expressed in pollen, regulated several new target genes. Our findings facilitate a more biologically relevant interpretation of the expression patterns, since the clusters corresponding to the activity of a specific TF or a combination of TFs suggest coordinated regulation of target genes by those TFs. By integrating different resources, we constructed a dynamic regulatory network of Arabidopsis thaliana during pollen development using gene coexpression and NCA. The network illustrates the relationships between the TFs' activities and their target genes' expression, as well as the interactions between TFs, which provide new insight into the molecular mechanisms that control pollen development.
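
    The abstract above relies on NCA without stating its model. As a reading aid, the standard NCA decomposition can be written as follows. The symbols E, A, P, N, M and L are not taken from the abstract and are introduced here only for illustration; which NCA variant the authors used is not stated.

```latex
% E: expression of N genes across M samples (typically log-ratios)
% A: connectivity strengths between N genes and L TFs
% P: activities of L TFs across M samples
\[
E = A\,P, \qquad
E \in \mathbb{R}^{N \times M},\;
A \in \mathbb{R}^{N \times L},\;
P \in \mathbb{R}^{L \times M}
\]
\[
\min_{A,\,P}\; \lVert E - A P \rVert_F^2
\quad \text{subject to} \quad
A_{ij} = 0 \ \text{whenever TF } j \text{ is not assumed to regulate gene } i
\]
```

    The zero pattern imposed on A is what encodes the prior TF-to-target network, and the rows of the fitted P are the estimated TF activity profiles.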

  8. Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis

    PubMed Central

    Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng

    2014-01-01

    Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA), and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. mlDNA first used an ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, by learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154

  9. Integrated network analysis identifies fight-club nodes as a class of hubs encompassing key putative switch genes that induce major transcriptome reprogramming during grapevine development.

    PubMed

    Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

    2014-12-01

    We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.

  10. Improved ethanol production from cheese whey, whey powder, and sugar beet molasses by "Vitreoscilla hemoglobin expressing" Escherichia coli.

    PubMed

    Akbas, Meltem Yesilcimen; Sar, Taner; Ozcelik, Busra

    2014-01-01

    This work investigated the improvement of ethanol production by engineering ethanologenic Escherichia coli to express the hemoglobin from the bacterium Vitreoscilla (VHb). Ethanologenic E. coli strain FBR5 and FBR5 transformed with the VHb gene in two constructs (strains TS3 and TS4) were grown in cheese whey (CW) medium at small and large scales, at both high and low aeration, or with whey powder (WP) or sugar beet molasses hydrolysate (SBMH) media at large scale and low aeration. Culture pH, cell growth, VHb levels, and ethanol production were evaluated after 48 h. VHb expression in TS3 and TS4 enhanced their ethanol production in CW (21-419%), in WP (17-362%), or in SBMH (48-118%) media. This work extends the findings that "VHb technology" may be useful for improving the production of ethanol from waste and byproducts of various sources.

  11. Inferring causal genomic alterations in breast cancer using gene expression data

    PubMed Central

    2011-01-01

    Background: One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies. Results: We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments. Conclusions: To our knowledge, this is the first effort to systematically identify and validate drivers for expression-based CNV regions in breast cancer. The framework, in which wavelet analysis of expression-based copy number alteration is coupled with gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and other novel types of cancer data such as next-generation sequencing based expression (RNA-Seq) as well as CNV data. PMID:21806811

  12. A regulatory toolbox of MiniPromoters to drive selective expression in the brain

    PubMed Central

    Portales-Casamar, Elodie; Swanson, Douglas J.; Liu, Li; de Leeuw, Charles N.; Banks, Kathleen G.; Ho Sui, Shannan J.; Fulton, Debra L.; Ali, Johar; Amirabbasi, Mahsa; Arenillas, David J.; Babyak, Nazar; Black, Sonia F.; Bonaguro, Russell J.; Brauer, Erich; Candido, Tara R.; Castellarin, Mauro; Chen, Jing; Chen, Ying; Cheng, Jason C. Y.; Chopra, Vik; Docking, T. Roderick; Dreolini, Lisa; D'Souza, Cletus A.; Flynn, Erin K.; Glenn, Randy; Hatakka, Kristi; Hearty, Taryn G.; Imanian, Behzad; Jiang, Steven; Khorasan-zadeh, Shadi; Komljenovic, Ivana; Laprise, Stéphanie; Liao, Nancy Y.; Lim, Jonathan S.; Lithwick, Stuart; Liu, Flora; Liu, Jun; Lu, Meifen; McConechy, Melissa; McLeod, Andrea J.; Milisavljevic, Marko; Mis, Jacek; O'Connor, Katie; Palma, Betty; Palmquist, Diana L.; Schmouth, Jean-François; Swanson, Magdalena I.; Tam, Bonny; Ticoll, Amy; Turner, Jenna L.; Varhol, Richard; Vermeulen, Jenny; Watkins, Russell F.; Wilson, Gary; Wong, Bibiana K. Y.; Wong, Siaw H.; Wong, Tony Y. T.; Yang, George S.; Ypsilanti, Athena R.; Jones, Steven J. M.; Holt, Robert A.; Goldowitz, Daniel; Wasserman, Wyeth W.; Simpson, Elizabeth M.

    2010-01-01

    The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination “knockins” in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5′ of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type–specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies. PMID:20807748

  13. Differential gene expression during thermal stress and bleaching in the Caribbean coral Montastraea faveolata.

    PubMed

    DeSalvo, M K; Voolstra, C R; Sunagawa, S; Schwarz, J A; Stillman, J H; Coffroth, M A; Szmant, A M; Medina, M

    2008-09-01

    The declining health of coral reefs worldwide is likely to intensify in response to continued anthropogenic disturbance from coastal development, pollution, and climate change. In response to these stresses, reef-building corals may exhibit bleaching, which marks the breakdown in symbiosis between coral and zooxanthellae. Mass coral bleaching due to elevated water temperature can devastate coral reefs on a large geographical scale. In order to understand the molecular and cellular basis of bleaching in corals, we have measured gene expression changes associated with thermal stress and bleaching using a complementary DNA microarray containing 1310 genes of the Caribbean coral Montastraea faveolata. In a first experiment, we identified differentially expressed genes by comparing experimentally bleached M. faveolata fragments to control non-heat-stressed fragments. In a second experiment, we identified differentially expressed genes during a time course experiment with four time points across 9 days. Results suggest that thermal stress and bleaching in M. faveolata affect the following processes: oxidative stress, Ca(2+) homeostasis, cytoskeletal organization, cell death, calcification, metabolism, protein synthesis, heat shock protein activity, and transposon activity. These results represent the first medium-scale transcriptomic study focused on revealing the cellular foundation of thermal stress-induced coral bleaching. We postulate that oxidative stress in thermal-stressed corals causes a disruption of Ca(2+) homeostasis, which in turn leads to cytoskeletal and cell adhesion changes, decreased calcification, and the initiation of cell death via apoptosis and necrosis.

  14. High level expression of Acidothermus cellulolyticus β-1, 4-endoglucanase in transgenic rice enhances the hydrolysis of its straw by cultured cow gastric fluid

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chou, Hong L.; Dai, Ziyu; Hsieh, Chia W.

    Large-scale production of effective cellulose hydrolytic enzymes is the key to the bioconversion of agricultural residues to ethanol. The goal of this study was to develop a rice plant as a bioreactor for the large-scale production of cellulose hydrolytic enzymes via genetic transformation, and to simultaneously improve rice straw as an efficient biomass feedstock for conversion of cellulose to glucose. In this study, the cellulose hydrolytic enzyme β-1,4-endoglucanase (E1) from the thermophilic bacterium Acidothermus cellulolyticus was overexpressed in rice through Agrobacterium-mediated transformation. The expression of the bacterial gene in rice was driven by the constitutive Mac promoter, a hybrid promoter of the Ti plasmid mannopine synthetase promoter and the cauliflower mosaic virus 35S promoter enhancer, with the signal peptide of tobacco pathogenesis-related protein for targeting the protein to the apoplastic compartment for storage. A total of 52 transgenic rice plants from six independent lines expressing the bacterial enzyme were obtained, which expressed the gene at high levels with a normal phenotype. The specific activities of E1 in the leaves of the highest expressing transgenic rice lines were about 20-fold higher than those of various transgenic plants obtained in previous studies, and the protein amounts accounted for up to 6.1% of the total leaf soluble protein. Zymogram and temperature-dependent activity analyses demonstrated the thermostability of the enzyme and its substrate specificity against cellulose, and a simple heat treatment can be used to purify the protein. In addition, hydrolysis of transgenic rice straw with cultured cow gastric fluid yielded almost twice as many reducing sugars as wild-type straw. Taken together, these data suggest that transgenic rice can effectively serve as a bioreactor for large-scale production of active, thermostable cellulose hydrolytic enzymes. As a feedstock, direct expression of large amounts of cellulases in transgenic rice may also facilitate saccharification of cellulose in rice straw and significantly reduce the costs for hydrolytic enzymes.

  15. Plant Omics Data Center: An Integrated Web Repository for Interspecies Gene Expression Networks with NLP-Based Curation

    PubMed Central

    Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro

    2015-01-01

    Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. PMID:25505034

  16. Integrated Network Analysis Identifies Fight-Club Nodes as a Class of Hubs Encompassing Key Putative Switch Genes That Induce Major Transcriptome Reprogramming during Grapevine Development

    PubMed Central

    Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

    2014-01-01

    We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918
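
    The idea of a hub whose expression is negatively correlated with its network neighbors can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `fight_club_hubs`, the correlation cutoff `r_min`, the hub criterion `min_degree`, and the use of the mean neighbor correlation are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-constant profiles)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fight_club_hubs(expr, r_min=0.8, min_degree=2):
    """Return {gene: mean neighbor correlation} for hubs with a negative mean.

    expr maps gene name -> expression profile. Two genes are linked when
    |r| >= r_min (hypothetical cutoff); a hub has at least min_degree links.
    """
    genes = list(expr)
    neighbor_r = {g: [] for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expr[g], expr[h])
            if abs(r) >= r_min:
                # Record the signed correlation on both ends of the edge.
                neighbor_r[g].append(r)
                neighbor_r[h].append(r)
    hubs = {}
    for g, rs in neighbor_r.items():
        if len(rs) >= min_degree:
            mean_r = sum(rs) / len(rs)
            if mean_r < 0:
                hubs[g] = mean_r
    return hubs

# Toy data: gene A rises while B, C and D fall together.
toy = {
    "A": [1, 2, 3, 4, 5],
    "B": [5, 4, 3, 2, 1],
    "C": [5.1, 3.9, 3.2, 1.8, 1.0],
    "D": [4.8, 4.1, 2.9, 2.2, 0.9],
}
print(sorted(fight_club_hubs(toy)))
```

    In the toy data only A qualifies: B, C and D each have one strongly negative neighbor (A) but two strongly positive ones, so their mean neighbor correlation stays positive.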

  17. Genome-scale approaches to the epigenetics of common human disease

    PubMed Central

    2011-01-01

    Traditionally, the pathology of human disease has been focused on microscopic examination of affected tissues, chemical and biochemical analysis of biopsy samples, other available samples of convenience, such as blood, and noninvasive or invasive imaging of varying complexity, in order to classify disease and illuminate its mechanistic basis. The molecular age has complemented this armamentarium with gene expression arrays and selective analysis of individual genes. However, we are entering a new era of epigenomic profiling, i.e., genome-scale analysis of cell-heritable nonsequence genetic change, such as DNA methylation. The epigenome offers access to stable measurements of cellular state and to biobanked material for large-scale epidemiological studies. Some of these genome-scale technologies are beginning to be applied to create the new field of epigenetic epidemiology. PMID:19844740

  18. A genome-wide inducible phenotypic screen identifies antisense RNA constructs silencing Escherichia coli essential genes

    PubMed Central

    Meng, Jia; Kanzaki, Gregory; Meas, Diane; Lam, Christopher K.; Crummer, Heather; Tain, Justina; Xu, H. Howard

    2013-01-01

    Regulated antisense RNA (asRNA) expression has been employed successfully in Gram-positive bacteria for genome-wide essential gene identification and drug target determination. However, there have been no published reports describing the application of asRNA gene silencing for comprehensive analyses of essential genes in Gram-negative bacteria. In this study, we report the first genome-wide identification of asRNA constructs for essential genes in Escherichia coli. We screened 250,000 library transformants for conditional growth-inhibitory recombinant clones from two shot-gun genomic libraries of E. coli using a paired-termini expression vector (pHN678). After sequencing plasmid inserts of 675 confirmed inducer-sensitive cell clones, we identified 152 separate asRNA constructs of which 134 inserts came from essential genes while 18 originated from non-essential genes (but share operons with essential genes). Among the 79 individual essential genes silenced by these asRNA constructs, 61 genes (77%) engage in processes related to protein synthesis. The cell-based assays of an asRNA clone targeting fusA (encoding elongation factor G) showed that the induced cells were sensitized 12 fold to fusidic acid, a known specific inhibitor. Our results demonstrate the utility of the paired-termini expression vector and feasibility of large-scale gene silencing in E. coli using regulated asRNA expression. PMID:22268863

  19. Statistical Analysis of Big Data on Pharmacogenomics

    PubMed Central

    Fan, Jianqing; Liu, Han

    2013-01-01

    This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods: estimation of large covariance matrices for understanding correlation structure, estimation of inverse covariance matrices for network modeling, large-scale simultaneous tests for selecting significantly differentially expressed genes and proteins and genetic markers for complex diseases, and high-dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big Data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
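
    Large-scale simultaneous testing, one of the topics listed above, is often handled by controlling the false discovery rate. The sketch below shows the Benjamini-Hochberg step-up procedure as one common choice; the abstract does not say which procedures the paper reviews, and the function name and default `alpha` are assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, p-values sorted
    ascending) with p_(k) <= k / m * alpha, then reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# One p-value per gene; only the two smallest survive at alpha = 0.05.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.2]))
```

    Because the rule is step-up, a gene can be rejected even if its own p-value exceeds the threshold for rank 1, as long as a higher rank passes its threshold.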

  20. Mechanisms Underlying Adaptation to Life in Hydrogen Sulfide–Rich Environments

    PubMed Central

    Kelley, Joanna L.; Arias-Rodriguez, Lenin; Patacsil Martin, Dorrelyn; Yee, Muh-Ching; Bustamante, Carlos D.; Tobler, Michael

    2016-01-01

    Hydrogen sulfide (H2S) is a potent toxicant interfering with oxidative phosphorylation in mitochondria and creating extreme environmental conditions in aquatic ecosystems. The mechanistic basis of adaptation to perpetual exposure to H2S remains poorly understood. We investigated evolutionarily independent lineages of livebearing fishes that have colonized and adapted to springs rich in H2S and compared their genome-wide gene expression patterns with closely related lineages from adjacent, nonsulfidic streams. Significant differences in gene expression were uncovered between all sulfidic and nonsulfidic population pairs. Variation in the number of differentially expressed genes among population pairs corresponded to differences in divergence times and rates of gene flow, which is consistent with neutral drift driving a substantial portion of gene expression variation among populations. Accordingly, there was little evidence for convergent evolution shaping large-scale gene expression patterns among independent sulfide spring populations. Nonetheless, we identified a small number of genes that was consistently differentially expressed in the same direction in all sulfidic and nonsulfidic population pairs. Functional annotation of shared differentially expressed genes indicated upregulation of genes associated with enzymatic H2S detoxification and transport of oxidized sulfur species, oxidative phosphorylation, energy metabolism, and pathways involved in responses to oxidative stress. Overall, our results suggest that modification of processes associated with H2S detoxification and toxicity likely complement each other to mediate elevated H2S tolerance in sulfide spring fishes. Our analyses allow for the development of novel hypotheses about biochemical and physiological mechanisms of adaptation to extreme environments. PMID:26861137

  1. Genome-wide Mapping Reveals Conservation of Promoter DNA Methylation Following Chicken Domestication

    PubMed Central

    Li, Qinghe; Wang, Yuanyuan; Hu, Xiaoxiang; Zhao, Yaofeng; Li, Ning

    2015-01-01

    It is well known that environment influences DNA methylation; however, the extent of heritable DNA methylation variation following animal domestication remains largely unknown. Using meDIP-chip, we mapped the promoter methylomes for 23,316 genes in muscle tissues of ancestral and domestic chickens. We systematically examined the variation of promoter DNA methylation with respect to breed, differentially expressed genes, SNPs and genes undergoing selective sweeps. While considerable changes in DNA sequence and gene expression programs were prevalent, we found that inter-strain DNA methylation patterns in promoter regions were highly conserved between the wild and domestic chicken breeds. Our data suggest a global preservation of DNA methylation between the wild and domestic chicken breeds at both genome-wide and locus-specific scales in chick muscle tissues. PMID:25735894

  2. Transcriptional profiles of supragranular-enriched genes associate with corticocortical network architecture in the human brain

    PubMed Central

    Krienen, Fenna M.; Yeo, B. T. Thomas; Ge, Tian; Buckner, Randy L.; Sherwood, Chet C.

    2016-01-01

    The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute’s human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections. PMID:26739559

  3. Transcriptional profiles of supragranular-enriched genes associate with corticocortical network architecture in the human brain.

    PubMed

    Krienen, Fenna M; Yeo, B T Thomas; Ge, Tian; Buckner, Randy L; Sherwood, Chet C

    2016-01-26

    The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute's human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections.

  4. RNA sequencing: current and prospective uses in metabolic research.

    PubMed

    Vikman, Petter; Fadista, Joao; Oskolkov, Nikolay

    2014-10-01

    Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes at exon-level resolution from any tissue in any species without any a priori knowledge of which genes are being expressed, their splice patterns or their nucleotide sequences. In addition, RNA sequencing is a more sensitive technique than microarrays, with a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done at a cost that competes with that of a microarray, making RNA sequencing a technique available to most researchers. Therefore RNA sequencing has recently become the state of the art with regard to large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large amount of data produced, which together with the complexity of the data can make a researcher spend far more time on analysis than on performing the actual experiment. © 2014 Society for Endocrinology.

  5. Screening and identification of gastric adenocarcinoma metastasis-related genes by using cDNA microarray coupled to FDD-PCR.

    PubMed

    Wang, Jian-Hua; Chen, Shi-Shu

    2002-07-01

    To clone genes related to gastric adenocarcinoma metastasis, the RF-1 cell line (from the primary tumor of a gastric adenocarcinoma patient) and the RF-48 cell line (its metastatic counterpart) were used as a model for studying the molecular mechanism of tumor metastasis. Two fluorescent cDNA probes, labeled with Cy3 and Cy5 dyes, were prepared from RF-1 and RF-48 mRNA samples by reverse transcription. The two color probes were then mixed, hybridized to a cDNA chip carrying double dots of 4,096 human genes, and scanned at two wavelengths. The experiment was repeated twice. Differentially expressed genes between the two cell lines were identified computationally. In all, 138 genes (3.4%) showed differential expression in RF-48 cells compared with RF-1 cells: 81 (2.1%) genes were clearly up-regulated and 56 (1.3%) genes were down-regulated. 45 genes involved in gastric adenocarcinoma metastasis were cloned using fluorescent differential display-PCR (FDD-PCR), including 3 novel genes. Seven differentially expressed genes were detected by both methods. The possible roles of some differentially expressed genes that may be involved in the mechanism of tumor metastasis are discussed. The cDNA chip was used to analyze gene expression in a high-throughput, large-scale manner, in combination with FDD-PCR for cloning unknown novel genes. In conclusion, some genes related to metastasis were preliminarily screened, which should help to elucidate the molecular mechanism of gastric adenocarcinoma metastasis.
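
    Two-color array comparisons of this kind are commonly summarized as per-gene log ratios of the two channel intensities. The sketch below assumes background-corrected intensities, assumes Cy3 labels the reference sample and Cy5 the test sample, and uses a two-fold cutoff. None of these choices is stated in the abstract, and the function name is hypothetical.

```python
import math

def differential_genes(cy3, cy5, fold=2.0):
    """Split genes into up- and down-regulated lists by log2(Cy5 / Cy3).

    cy3, cy5: dicts mapping gene name -> positive, background-corrected
    intensity for the reference and test channel (assumed dye assignment).
    """
    threshold = math.log2(fold)
    up, down = [], []
    for gene in cy3:
        log_ratio = math.log2(cy5[gene] / cy3[gene])
        if log_ratio >= threshold:
            up.append(gene)
        elif log_ratio <= -threshold:
            down.append(gene)
    return up, down

# Made-up intensities for three genes.
reference = {"g1": 100.0, "g2": 100.0, "g3": 400.0}
test = {"g1": 450.0, "g2": 120.0, "g3": 100.0}
print(differential_genes(reference, test))
```

    In practice the ratios would usually be normalized across the array and averaged over the duplicate spots before any cutoff is applied; that step is omitted here.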

  6. Analysis of large-scale gene expression data.

    PubMed

    Sherlock, G

    2000-04-01

    The advent of cDNA and oligonucleotide microarray technologies has led to a paradigm shift in biological investigation, such that the bottleneck in research is shifting from data generation to data analysis. Hierarchical clustering, divisive clustering, self-organizing maps and k-means clustering have all been recently used to make sense of this mass of data.
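
    As a rough illustration of one of the clustering methods named above, the following is a minimal k-means sketch in plain Python. The toy expression profiles, the choice of k = 2, and the use of the first k genes as starting centroids are all assumptions made for the example; they are not taken from the reviewed article.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two expression profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_profile(rows):
    """Column-wise mean of a non-empty list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(profiles, k, iterations=20):
    """Cluster profiles into k groups; returns (labels, centroids).

    Assumption for this sketch: the first k profiles seed the centroids,
    which keeps the result deterministic.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = mean_profile(members)
    return labels, centroids


# Hypothetical expression values for four genes at three time points.
expression = [
    [0.1, 1.0, 2.0],  # gene 0: rising
    [2.0, 1.0, 0.1],  # gene 1: falling
    [0.2, 1.1, 2.2],  # gene 2: rising
    [2.1, 0.9, 0.0],  # gene 3: falling
]
labels, centroids = kmeans(expression, k=2)
```

    In this toy case the rising genes end up in one cluster and the falling genes in the other. Real analyses would normally rely on an established library and a more careful initialisation.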

  7. Laminar and dorsoventral molecular organization of the medial entorhinal cortex revealed by large-scale anatomical analysis of gene expression.

    PubMed

    Ramsden, Helen L; Sürmeli, Gülşen; McDonagh, Steven G; Nolan, Matthew F

    2015-01-01

    Neural circuits in the medial entorhinal cortex (MEC) encode an animal's position and orientation in space. Within the MEC spatial representations, including grid and directional firing fields, have a laminar and dorsoventral organization that corresponds to a similar topography of neuronal connectivity and cellular properties. Yet, in part due to the challenges of integrating anatomical data at the resolution of cortical layers and borders, we know little about the molecular components underlying this organization. To address this we develop a new computational pipeline for high-throughput analysis and comparison of in situ hybridization (ISH) images at laminar resolution. We apply this pipeline to ISH data for over 16,000 genes in the Allen Brain Atlas and validate our analysis with RNA sequencing of MEC tissue from adult mice. We find that differential gene expression delineates the borders of the MEC with neighboring brain structures and reveals its laminar and dorsoventral organization. We propose a new molecular basis for distinguishing the deep layers of the MEC and show that their similarity to corresponding layers of neocortex is greater than that of superficial layers. Our analysis identifies ion channel-, cell adhesion- and synapse-related genes as candidates for functional differentiation of MEC layers and for encoding of spatial information at different scales along the dorsoventral axis of the MEC. We also reveal laminar organization of genes related to disease pathology and suggest that a high metabolic demand predisposes layer II to neurodegenerative pathology. In principle, our computational pipeline can be applied to high-throughput analysis of many forms of neuroanatomical data. Our results support the hypothesis that differences in gene expression contribute to functional specialization of superficial layers of the MEC and dorsoventral organization of the scale of spatial representations.

  8. Laminar and Dorsoventral Molecular Organization of the Medial Entorhinal Cortex Revealed by Large-scale Anatomical Analysis of Gene Expression

    PubMed Central

    Ramsden, Helen L.; Sürmeli, Gülşen; McDonagh, Steven G.; Nolan, Matthew F.

    2015-01-01

    Neural circuits in the medial entorhinal cortex (MEC) encode an animal’s position and orientation in space. Within the MEC spatial representations, including grid and directional firing fields, have a laminar and dorsoventral organization that corresponds to a similar topography of neuronal connectivity and cellular properties. Yet, in part due to the challenges of integrating anatomical data at the resolution of cortical layers and borders, we know little about the molecular components underlying this organization. To address this we develop a new computational pipeline for high-throughput analysis and comparison of in situ hybridization (ISH) images at laminar resolution. We apply this pipeline to ISH data for over 16,000 genes in the Allen Brain Atlas and validate our analysis with RNA sequencing of MEC tissue from adult mice. We find that differential gene expression delineates the borders of the MEC with neighboring brain structures and reveals its laminar and dorsoventral organization. We propose a new molecular basis for distinguishing the deep layers of the MEC and show that their similarity to corresponding layers of neocortex is greater than that of superficial layers. Our analysis identifies ion channel-, cell adhesion- and synapse-related genes as candidates for functional differentiation of MEC layers and for encoding of spatial information at different scales along the dorsoventral axis of the MEC. We also reveal laminar organization of genes related to disease pathology and suggest that a high metabolic demand predisposes layer II to neurodegenerative pathology. In principle, our computational pipeline can be applied to high-throughput analysis of many forms of neuroanatomical data. Our results support the hypothesis that differences in gene expression contribute to functional specialization of superficial layers of the MEC and dorsoventral organization of the scale of spatial representations. PMID:25615592

  9. Partial least squares based identification of Duchenne muscular dystrophy specific genes.

    PubMed

    An, Hui-bo; Zheng, Hua-cheng; Zhang, Li; Ma, Lin; Liu, Zheng-yan

    2013-11-01

    Large-scale parallel gene expression analysis has provided a greater ease for investigating the underlying mechanisms of Duchenne muscular dystrophy (DMD). Previous studies typically implemented variance/regression analysis, which would be fundamentally flawed when unaccounted sources of variability in the arrays existed. Here we aim to identify genes that contribute to the pathology of DMD using partial least squares (PLS) based analysis. We carried out PLS-based analysis with two datasets downloaded from the Gene Expression Omnibus (GEO) database to identify genes contributing to the pathology of DMD. Except for the genes related to inflammation, muscle regeneration and extracellular matrix (ECM) modeling, we found some genes with high fold change, which have not been identified by previous studies, such as SRPX, GPNMB, SAT1, and LYZ. In addition, downregulation of the fatty acid metabolism pathway was found, which may be related to the progressive muscle wasting process. Our results provide a better understanding for the downstream mechanisms of DMD.

  10. Novel Genomic and Evolutionary Insight of WRKY Transcription Factors in Plant Lineage

    PubMed Central

    Mohanta, Tapan Kumar; Park, Yong-Hwan; Bae, Hanhong

    2016-01-01

    The evolutionarily conserved WRKY transcription factor (TF) regulates different aspects of gene expression in plants, and modulates growth, development, as well as biotic and abiotic stress responses. Therefore, understanding the details regarding WRKY TFs is very important. In this study, large-scale genomic analyses of the WRKY TF gene family from 43 plant species were conducted. The results of our study revealed that WRKY TFs could be grouped and specifically classified as those belonging to the monocot or dicot plant lineage. In this study, we identified several novel WRKY TFs. To our knowledge, this is the first report on a revised grouping system of the WRKY TF gene family in plants. The different forms of novel chimeric forms of WRKY TFs in the plant genome might play a crucial role in their evolution. Tissue-specific gene expression analyses in Glycine max and Phaseolus vulgaris showed that WRKY11-1, WRKY11-2 and WRKY11-3 were ubiquitously expressed in all tissue types, and WRKY15-2 was highly expressed in the stem, root, nodule and pod tissues in G. max and P. vulgaris. PMID:27853303

  11. Novel Genomic and Evolutionary Insight of WRKY Transcription Factors in Plant Lineage.

    PubMed

    Mohanta, Tapan Kumar; Park, Yong-Hwan; Bae, Hanhong

    2016-11-17

    The evolutionarily conserved WRKY transcription factor (TF) regulates different aspects of gene expression in plants, and modulates growth, development, as well as biotic and abiotic stress responses. Therefore, understanding the details regarding WRKY TFs is very important. In this study, large-scale genomic analyses of the WRKY TF gene family from 43 plant species were conducted. The results of our study revealed that WRKY TFs could be grouped and specifically classified as those belonging to the monocot or dicot plant lineage. In this study, we identified several novel WRKY TFs. To our knowledge, this is the first report on a revised grouping system of the WRKY TF gene family in plants. The different forms of novel chimeric forms of WRKY TFs in the plant genome might play a crucial role in their evolution. Tissue-specific gene expression analyses in Glycine max and Phaseolus vulgaris showed that WRKY11-1, WRKY11-2 and WRKY11-3 were ubiquitously expressed in all tissue types, and WRKY15-2 was highly expressed in the stem, root, nodule and pod tissues in G. max and P. vulgaris.

  12. Insights into the noncoding RNome of nitrogen-fixing endosymbiotic α-proteobacteria.

    PubMed

    Jiménez-Zurdo, José I; Valverde, Claudio; Becker, Anke

    2013-02-01

    Symbiotic chronic infection of legumes by rhizobia involves transition of invading bacteria from a free-living environment in soil to an intracellular state as differentiated nitrogen-fixing bacteroids within the nodules elicited in the host plant. The adaptive flexibility demanded by this complex lifestyle is likely facilitated by the large set of regulatory proteins encoded by rhizobial genomes. However, proteins are not the only relevant players in the regulation of gene expression in bacteria. Large-scale high-throughput analysis of prokaryotic genomes is evidencing the expression of an unexpected plethora of small untranslated transcripts (sRNAs) with housekeeping or regulatory roles. sRNAs mostly act in response to environmental cues as post-transcriptional regulators of gene expression through protein-assisted base-pairing interactions with target mRNAs. Riboregulation contributes to fine-tune a wide range of bacterial processes which, in intracellular animal pathogens, largely compromise virulence traits. Here, we summarize the incipient knowledge about the noncoding RNome structure of nitrogen-fixing endosymbiotic bacteria as inferred from genome-wide searches for sRNA genes in the alfalfa partner Sinorhizobium meliloti and further comparative genomics analysis. The biology of relevant S. meliloti RNA chaperones (e.g., Hfq) is also reviewed as a first global indicator of the impact of riboregulation in the establishment of the symbiotic interaction.

  13. Defining the Human Macula Transcriptome and Candidate Retinal Disease Genes Using EyeSAGE

    PubMed Central

    Rickman, Catherine Bowes; Ebright, Jessica N.; Zavodni, Zachary J.; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P.; Wistow, Graeme; Boon, Kathy; Hauser, Michael A.

    2009-01-01

    Purpose To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Methods Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Results Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. Conclusions The EyeSAGE database, combining three different gene-profiling platforms including the authors’ multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions. PMID:16723438

  14. Defining the human macula transcriptome and candidate retinal disease genes using EyeSAGE.

    PubMed

    Bowes Rickman, Catherine; Ebright, Jessica N; Zavodni, Zachary J; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P; Wistow, Graeme; Boon, Kathy; Hauser, Michael A

    2006-06-01

    To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. The EyeSAGE database, combining three different gene-profiling platforms including the authors' multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions.

  15. First Transcriptome and Digital Gene Expression Analysis in Neuroptera with an Emphasis on Chemoreception Genes in Chrysopa pallens (Rambur).

    PubMed

    Li, Zhao-Qun; Zhang, Shuai; Ma, Yan; Luo, Jun-Yu; Wang, Chun-Yi; Lv, Li-Min; Dong, Shuang-Lin; Cui, Jin-Jie

    2013-01-01

    Chrysopa pallens (Rambur) are the most important natural enemies and predators of various agricultural pests. Understanding the sophisticated olfactory system in insect antennae is crucial for studying the physiological bases of olfaction and also could lead to effective applications of C. pallens in integrated pest management. However no transcriptome information is available for Neuroptera, and sequence data for C. pallens are scarce, so obtaining more sequence data is a priority for researchers on this species. To facilitate identifying sets of genes involved in olfaction, a normalized transcriptome of C. pallens was sequenced. A total of 104,603 contigs were obtained and assembled into 10,662 clusters and 39,734 singletons; 20,524 were annotated based on BLASTX analyses. A large number of candidate chemosensory genes were identified, including 14 odorant-binding proteins (OBPs), 22 chemosensory proteins (CSPs), 16 ionotropic receptors, 14 odorant receptors, and genes potentially involved in olfactory modulation. To better understand the OBPs, CSPs and cytochrome P450s, phylogenetic trees were constructed. In addition, 10 digital gene expression libraries of different tissues were constructed and gene expression profiles were compared among different tissues in males and females. Our results provide a basis for exploring the mechanisms of chemoreception in C. pallens, as well as other insects. The evolutionary analyses in our study provide new insights into the differentiation and evolution of insect OBPs and CSPs. Our study provided large-scale sequence information for further studies in C. pallens.

  16. Bridging the Gap between Gene Expression and Metabolic Phenotype via Kinetic Models

    DTIC Science & Technology

    2013-07-22

    construction of large-scale kinetic models of metabolism, namely, the detailed definition of appropriate reaction rate expressions and the determination...mole of biomass precursor, and the summation included only the drain fluxes to biomass. Note that this definition of the biomass growth rate can... Figure 2 Metabolic network of the central carbon

  17. Network-directed cis-mediator analysis of normal prostate tissue expression profiles reveals downstream regulatory associations of prostate cancer susceptibility loci.

    PubMed

    Larson, Nicholas B; McDonnell, Shannon K; Fogarty, Zach; Larson, Melissa C; Cheville, John; Riska, Shaun; Baheti, Saurabh; Weber, Alexandra M; Nair, Asha A; Wang, Liang; O'Brien, Daniel; Davila, Jaime; Schaid, Daniel J; Thibodeau, Stephen N

    2017-10-17

    Large-scale genome-wide association studies have identified multiple single-nucleotide polymorphisms associated with risk of prostate cancer. Many of these genetic variants are presumed to be regulatory in nature; however, follow-up expression quantitative trait loci (eQTL) association studies have to-date been restricted largely to cis-acting associations due to study limitations. While trans-eQTL scans suffer from high testing dimensionality, recent evidence indicates most trans-eQTL associations are mediated by cis-regulated genes, such as transcription factors. Leveraging a data-driven gene co-expression network, we conducted a comprehensive cis-mediator analysis using RNA-Seq data from 471 normal prostate tissue samples to identify downstream regulatory associations of previously identified prostate cancer risk variants. We discovered multiple trans-eQTL associations that were significantly mediated by cis-regulated transcripts, four of which involved risk locus 17q12, proximal transcription factor HNF1B, and target trans-genes with known HNF response elements (MIA2, SRC, SEMA6A, KIF12). We additionally identified evidence of cis-acting down-regulation of MSMB via rs10993994 corresponding to reduced co-expression of NDRG1. The majority of these cis-mediator relationships demonstrated trans-eQTL replicability in 87 prostate tissue samples from the Gene-Tissue Expression Project. These findings provide further biological context to known risk loci and outline new hypotheses for investigation into the etiology of prostate cancer.

  18. Medium-throughput processing of whole mount in situ hybridisation experiments into gene expression domains.

    PubMed

    Crombach, Anton; Cicin-Sain, Damjan; Wotton, Karl R; Jaeger, Johannes

    2012-01-01

    Understanding the function and evolution of developmental regulatory networks requires the characterisation and quantification of spatio-temporal gene expression patterns across a range of systems and species. However, most high-throughput methods to measure the dynamics of gene expression do not preserve the detailed spatial information needed in this context. For this reason, quantification methods based on image bioinformatics have become increasingly important over the past few years. Most available approaches in this field either focus on the detailed and accurate quantification of a small set of gene expression patterns, or attempt high-throughput analysis of spatial expression through binary pattern extraction and large-scale analysis of the resulting datasets. Here we present a robust, "medium-throughput" pipeline to process in situ hybridisation patterns from embryos of different species of flies. It bridges the gap between high-resolution, and high-throughput image processing methods, enabling us to quantify graded expression patterns along the antero-posterior axis of the embryo in an efficient and straightforward manner. Our method is based on a robust enzymatic (colorimetric) in situ hybridisation protocol and rapid data acquisition through wide-field microscopy. Data processing consists of image segmentation, profile extraction, and determination of expression domain boundary positions using a spline approximation. It results in sets of measured boundaries sorted by gene and developmental time point, which are analysed in terms of expression variability or spatio-temporal dynamics. Our method yields integrated time series of spatial gene expression, which can be used to reverse-engineer developmental gene regulatory networks across species. It is easily adaptable to other processes and species, enabling the in silico reconstitution of gene regulatory networks in a wide range of developmental contexts.
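
    The boundary-determination step described above can be sketched in a simplified form. The abstract states that the authors use a spline approximation; the sketch below swaps in plain linear interpolation between neighbouring samples, and the half-maximum threshold, the function name and the toy profile are all assumptions made for illustration only.

```python
def boundary_positions(positions, intensities, fraction=0.5):
    """Return axis positions where the profile crosses fraction * max.

    Linear interpolation stands in for the spline approximation named in
    the abstract; the 0.5 default threshold is an assumption.
    """
    level = fraction * max(intensities)
    crossings = []
    for i in range(len(positions) - 1):
        y0 = intensities[i] - level
        y1 = intensities[i + 1] - level
        if y0 == 0:
            # Sample lies exactly on the threshold.
            crossings.append(positions[i])
        elif y0 * y1 < 0:
            # Sign change: interpolate the crossing between the samples.
            t = y0 / (y0 - y1)
            crossings.append(positions[i] + t * (positions[i + 1] - positions[i]))
    return crossings


# Hypothetical expression profile along the antero-posterior axis (% length).
positions = [0, 10, 20, 30, 40, 50, 60]
intensities = [0.0, 0.2, 1.4, 2.0, 1.8, 0.6, 0.0]
boundaries = boundary_positions(positions, intensities)
```

    For this toy profile the function reports one anterior and one posterior boundary of a single expression domain, at roughly 16.7% and 46.7% of axis length.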

  19. Transcriptional Analysis of Resistance to Low Temperatures in Bermudagrass Crown Tissues

    PubMed Central

    Melmaiee, Kalpalatha; Anderson, Michael; Elavarthi, Sathya; Guenzi, Arron; Canaan, Patricia

    2015-01-01

    Bermudagrass (Cynodon dactylon (L.) Pers.) is one of the most geographically adapted and utilized of the warm-season grasses. However, bermudagrass adaptation to the Northern USA is limited by freeze damage and winterkill. Our study provides the first large-scale analyses of gene expression in bermudagrass regenerative crown tissues during cold acclimation. We compared gene expression patterns in crown tissues from highly cold tolerant “MSU” and susceptible “Zebra” genotypes exposed to near-freezing temperatures. Suppressive subtractive hybridization was used to isolate putative cold responsive genes. Approximately 3845 transcript sequences enriched for cold acclimation were deposited in GenBank. A total of 4589 ESTs (3184 unigenes), including 744 ESTs associated with the bermudagrass disease spring dead spot, were printed on microarrays and hybridized with cold acclimated complementary deoxyribonucleic acid (cDNA). A total of 587 differentially expressed unigenes were identified in this study. Of these only 97 (17%) showed significant NCBI matches. The overall expression pattern revealed 40% more down- than up-regulated genes, which was particularly enhanced in MSU compared to Zebra. Among the up-regulated genes 68% were uniquely expressed in MSU (36%) or Zebra (32%). Among the down-regulated genes 40% were unique to MSU, while only 15% were unique to Zebra. Overall expression intensity was significantly higher in MSU than in Zebra (p value ≤ 0.001) and the overall number of genes expressed at 28 days was 2.7-fold greater than at 2 days. These changes in expression patterns reflect the strong genotypic and temporal response to cold temperatures. Additionally, differentially expressed genes from this study can be utilized for developing molecular markers in bermudagrass and other warm season grasses for enhancing cold hardiness. PMID:26348040

  20. Purification and properties of insulin receptor ectodomain from large-scale mammalian cell culture.

    PubMed

    Cosgrove, L; Lovrecz, G O; Verkuylen, A; Cavaleri, L; Black, L A; Bentley, J D; Howlett, G J; Gray, P P; Ward, C W; McKern, N M

    1995-12-01

    Ectodomain of the exon 11+ form of the human insulin receptor (hIR) was expressed in the mammalian cell secretion vector pEE6.HCMV-GS, containing the glutamine synthetase gene. Following transfection of the hIR ectodomain gene into Chinese hamster ovary (CHO-K1) cells, clones were isolated by selecting for glutamine synthetase expression with methionine sulphoximine. The expression levels of ectodomain were subsequently increased by gene amplification. Production was scaled up using a 40-liter airlift fermenter in which the transfected CHO-K1 cells were cultured on microcarrier beads, initially in medium containing 10% fetal calf serum (FCS). By continuous perfusion of serum-free medium into the bioreactor, cell viability was maintained during reduction of FCS, which enabled soluble hIR ectodomain to be harvested for at least 22 days. Harvests were concentrated 20-fold by anion-exchange chromatography. Optimal recovery of ectodomain from early harvests containing large quantities of serum proteins was achieved by insulin-affinity chromatography, whereas in later harvests purification was achieved by multistep chromatography. Analysis of the purified hIR ectodomain showed that it had a molecular weight by sedimentation equilibrium analysis of 269,500. Amino-terminal amino acid sequence analysis showed that the ectodomain was correctly processed to alpha and beta chains and that glycosylation characteristics were similar to those of native hIR. The integrity of the ectodomain was demonstrated by the recognition of conformation-dependent anti-hIR antibodies and by its binding of insulin (Kd approximately 2 x 10(-9) M). These results demonstrate the successful production and purification of hIR ectodomain by processes amenable to scale-up and in a form appropriate for structure/function studies of the ligand-binding domain of the receptor.

  1. Blood Gene Expression Predicts Bronchiolitis Obliterans Syndrome

    PubMed Central

    Danger, Richard; Royer, Pierre-Joseph; Reboulleau, Damien; Durand, Eugénie; Loy, Jennifer; Tissot, Adrien; Lacoste, Philippe; Roux, Antoine; Reynaud-Gaubert, Martine; Gomez, Carine; Kessler, Romain; Mussot, Sacha; Dromer, Claire; Brugière, Olivier; Mornex, Jean-François; Guillemain, Romain; Dahan, Marcel; Knoop, Christiane; Botturi, Karine; Foureau, Aurore; Pison, Christophe; Koutsokera, Angela; Nicod, Laurent P.; Brouard, Sophie; Magnan, Antoine; Jougon, J.

    2018-01-01

    Bronchiolitis obliterans syndrome (BOS), the main manifestation of chronic lung allograft dysfunction, leads to poor long-term survival after lung transplantation. Identifying predictors of BOS is essential to prevent the progression of dysfunction before irreversible damage occurs. By using a large set of 107 samples from lung recipients, we performed microarray gene expression profiling of whole blood to identify early biomarkers of BOS, including samples from 49 patients with stable function for at least 3 years, 32 samples collected at least 6 months before BOS diagnosis (prediction group), and 26 samples at or after BOS diagnosis (diagnosis group). An independent set from 25 lung recipients was used for validation by quantitative PCR (13 stables, 11 in the prediction group, and 8 in the diagnosis group). We identified 50 transcripts differentially expressed between stable and BOS recipients. Three genes, namely POU class 2 associating factor 1 (POU2AF1), T-cell leukemia/lymphoma protein 1A (TCL1A), and B cell lymphocyte kinase, were validated as predictive biomarkers of BOS more than 6 months before diagnosis, with areas under the curve of 0.83, 0.77, and 0.78 respectively. These genes allow stratification based on BOS risk (log-rank test p < 0.01) and are not associated with time posttransplantation. This is the first published large-scale gene expression analysis of blood after lung transplantation. The three-gene blood signature could provide clinicians with new tools to improve follow-up and adapt treatment of patients likely to develop BOS. PMID:29375549

  2. Gene targeting by TALEN-induced homologous recombination in goats directs production of β-lactoglobulin-free, high-human lactoferrin milk

    PubMed Central

    Cui, Chenchen; Song, Yujie; Liu, Jun; Ge, Hengtao; Li, Qian; Huang, Hui; Hu, Linyong; Zhu, Hongmei; Jin, Yaping; Zhang, Yong

    2015-01-01

    β-Lactoglobulin (BLG) is a major goat’s milk allergen that is absent in human milk. Engineered endonucleases, including transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases, enable targeted genetic modification in livestock. In this study, TALEN-mediated gene knockout followed by gene knock-in were used to generate BLG knockout goats as mammary gland bioreactors for large-scale production of human lactoferrin (hLF). We introduced precise genetic modifications in the goat genome at frequencies of approximately 13.6% and 6.09% for the first and second sequential targeting, respectively, by using targeting vectors that underwent TALEN-induced homologous recombination (HR). Analysis of milk from the cloned goats revealed large-scale hLF expression or/and decreased BLG levels in milk from heterozygous goats as well as the absence of BLG in milk from homozygous goats. Furthermore, the TALEN-mediated targeting events in somatic cells can be transmitted through the germline after SCNT. Our result suggests that gene targeting via TALEN-induced HR may expedite the production of genetically engineered livestock for agriculture and biomedicine. PMID:25994151

  3. Gene targeting by TALEN-induced homologous recombination in goats directs production of β-lactoglobulin-free, high-human lactoferrin milk.

    PubMed

    Cui, Chenchen; Song, Yujie; Liu, Jun; Ge, Hengtao; Li, Qian; Huang, Hui; Hu, Linyong; Zhu, Hongmei; Jin, Yaping; Zhang, Yong

    2015-05-21

    β-Lactoglobulin (BLG) is a major goat's milk allergen that is absent in human milk. Engineered endonucleases, including transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases, enable targeted genetic modification in livestock. In this study, TALEN-mediated gene knockout followed by gene knock-in were used to generate BLG knockout goats as mammary gland bioreactors for large-scale production of human lactoferrin (hLF). We introduced precise genetic modifications in the goat genome at frequencies of approximately 13.6% and 6.09% for the first and second sequential targeting, respectively, by using targeting vectors that underwent TALEN-induced homologous recombination (HR). Analysis of milk from the cloned goats revealed large-scale hLF expression or/and decreased BLG levels in milk from heterozygous goats as well as the absence of BLG in milk from homozygous goats. Furthermore, the TALEN-mediated targeting events in somatic cells can be transmitted through the germline after SCNT. Our result suggests that gene targeting via TALEN-induced HR may expedite the production of genetically engineered livestock for agriculture and biomedicine.

  4. Soybean defense responses to the soybean aphid.

    PubMed

    Li, Yan; Zou, Jijun; Li, Min; Bilgin, Damla D; Vodkin, Lila O; Hartman, Glen L; Clough, Steven J

    2008-01-01

    Transcript profiles in aphid (Aphis glycines)-resistant (cv. Dowling) and -susceptible (cv. Williams 82) soybean (Glycine max) cultivars were investigated using soybean cDNA microarrays. Large-scale soybean cDNA microarrays representing approx. 18 000 genes or c. 30% of the soybean genome were compared at 6 and 12 h post-application of aphids. In a separate experiment utilizing clip cages, expression of three defense-related genes was examined at 6, 12, 24, 48, and 72 h in both cultivars by quantitative real-time PCR. One hundred and forty genes showed specific responses for resistance; these included genes related to cell wall, defense, DNA/RNA, secondary metabolism, signaling and other processes. When an extended time period of sampling was investigated, earlier and greater induction of three defense-related genes was observed in the resistant cultivar; however, the induction declined after 24 or 48 h in the resistant cultivar but continued to increase in the susceptible cultivar after 24 h. Aphid-challenged resistant plants showed rapid differential gene expression patterns similar to the incompatible response induced by avirulent Pseudomonas syringae. Five genes were identified as differentially expressed between the two genotypes in the absence of aphids.

  5. Blazing Signature Filter: a library for fast pairwise similarity comparisons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Joon-Yong; Fujimoto, Grant M.; Wilson, Ryan

    Identifying similarities between datasets is a fundamental task in data mining and has become an integral part of modern scientific investigation. Whether the task is to identify co-expressed genes in large-scale expression surveys or to predict combinations of gene knockouts which would elicit a similar phenotype, the underlying computational task is often a multi-dimensional similarity test. As datasets continue to grow, improvements to the efficiency, sensitivity or specificity of such computation will have broad impacts, as they allow scientists to more completely explore the wealth of scientific data. A significant practical drawback of large-scale data mining is that the vast majority of pairwise comparisons are unlikely to be relevant, meaning that they do not share a signature of interest. It is therefore essential to identify these unproductive comparisons as rapidly as possible and exclude them from more time-intensive similarity calculations. The Blazing Signature Filter (BSF) is a highly efficient pairwise similarity algorithm which enables extensive data mining within a reasonable amount of time. The algorithm transforms datasets into binary metrics, allowing it to utilize computationally efficient bit operators and provide a coarse measure of similarity. As a result, the BSF can scale to high dimensionality and rapidly filter unproductive pairwise comparisons. Two bioinformatics applications of the tool are presented to demonstrate the ability to scale to billions of pairwise comparisons and the usefulness of this approach.
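
    The filtering idea in this abstract (binarize each profile, then compare pairs with bit operators) can be sketched in a few lines of Python. This is an illustrative sketch only, not the BSF library or its API: the function names, the fixed threshold, and the toy profiles are all assumptions made for the example.

```python
def to_signature(values, threshold):
    """Pack a numeric profile into an int: bit i is set when values[i] > threshold."""
    signature = 0
    for i, value in enumerate(values):
        if value > threshold:
            signature |= 1 << i
    return signature


def shared_bits(sig_a, sig_b):
    """Count positions set in both signatures (bitwise AND, then popcount)."""
    return bin(sig_a & sig_b).count("1")


def filter_pairs(profiles, threshold, min_shared):
    """Keep only pairs that share at least `min_shared` set bits.

    Pairs that fail this coarse test would be skipped by any slower,
    more precise similarity calculation that follows.
    """
    sigs = {name: to_signature(vals, threshold) for name, vals in profiles.items()}
    names = sorted(sigs)
    kept = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            n = shared_bits(sigs[a], sigs[b])
            if n >= min_shared:
                kept.append((a, b, n))
    return kept


# Hypothetical expression profiles (five conditions per gene).
profiles = {
    "geneA": [2.1, 0.1, 3.0, 0.0, 1.8],
    "geneB": [1.9, 0.2, 2.5, 0.1, 0.3],
    "geneC": [0.0, 2.2, 0.1, 1.7, 0.2],
}
pairs = filter_pairs(profiles, threshold=1.0, min_shared=2)
print(pairs)
```

    In this toy input only geneA and geneB are kept, because they are the only pair with two conditions above the threshold in common.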

  6. Exploration for the Salinity Tolerance-Related Genes from Xero-Halophyte Atriplex canescens Exploiting Yeast Functional Screening System

    PubMed Central

    Li, Jingtao; Sun, Xinhua; Liu, Yanzhi; Wang, Xueliang; Zhang, Hao; Pan, Hongyu

    2017-01-01

    Plant productivity is limited by salinity stress, both in natural and agricultural systems. Identification of salt stress-related genes from halophytes can provide insights into mechanisms of salt stress tolerance in plants. Atriplex canescens is a xero-halophyte that exhibits optimum growth in the presence of 400 mM NaCl. A cDNA library derived from highly salt-treated A. canescens plants was constructed based on a yeast expression system. A total of 53 transgenic yeast clones expressing enhanced salt tolerance were selected from 105 transformants. Their plasmids were sequenced and the gene characteristics were annotated using a BLASTX search. Retransformation of yeast cells with the selected plasmids conferred salt tolerance to the resulting transformants. The expression patterns of 28 of these stress-related genes were further investigated in A. canescens leaves by quantitative reverse transcription-PCR. In this study, we provided a rapid and robust assay system for large-scale screening of genes for varied abiotic stress tolerance with high efficiency in A. canescens. PMID:29149055

  7. Nonclassical Regulation of Transcription: Interchromosomal Interactions at the Malic enzyme Locus of Drosophila melanogaster

    PubMed Central

    Lum, Thomas E.; Merritt, Thomas J. S.

    2011-01-01

    Regulation of transcription can be a complex process in which many cis- and trans-interactions determine the final pattern of expression. Among these interactions are trans-interactions mediated by the pairing of homologous chromosomes. These trans-effects are wide ranging, affecting gene regulation in many species and creating complex possibilities in gene regulation. Here we describe a novel case of trans-interaction between alleles of the Malic enzyme (Men) locus in Drosophila melanogaster that results in allele-specific, non-additive gene expression. Using both empirical biochemical and predictive bioinformatic approaches, we show that the regulatory elements of one allele are capable of interacting in trans with, and modifying the expression of, the second allele. Furthermore, we show that nonlocal factors—different genetic backgrounds—are capable of significant interactions with individual Men alleles, suggesting that these trans-effects can be modified by both locally and distantly acting elements. In sum, these results emphasize the complexity of gene regulation and the need to understand both small- and large-scale interactions as more complete models of the role of trans-interactions in gene regulation are developed. PMID:21900270

  8. Temporal Expression-based Analysis of Metabolism

    PubMed Central

    Segrè, Daniel

    2012-01-01

    Metabolic flux is frequently rerouted through cellular metabolism in response to dynamic changes in the intra- and extra-cellular environment. Capturing the mechanisms underlying these metabolic transitions in quantitative and predictive models is a prominent challenge in systems biology. Progress in this regard has been made by integrating high-throughput gene expression data into genome-scale stoichiometric models of metabolism. Here, we extend previous approaches to perform a Temporal Expression-based Analysis of Metabolism (TEAM). We apply TEAM to understanding the complex metabolic dynamics of the respiratorily versatile bacterium Shewanella oneidensis grown under aerobic, lactate-limited conditions. TEAM predicts temporal metabolic flux distributions using time-series gene expression data. Increased predictive power is achieved by supplementing these data with a large reference compendium of gene expression, which allows us to take into account the unique character of the distribution of expression of each individual gene. We further propose a straightforward method for studying the sensitivity of TEAM to changes in its fundamental free threshold parameter θ, and reveal that discrete zones of distinct metabolic behavior arise as this parameter is changed. By comparing the qualitative characteristics of these zones to additional experimental data, we are able to constrain the range of θ to a small, well-defined interval. In parallel, the sensitivity analysis reveals the inherently difficult nature of dynamic metabolic flux modeling: small errors early in the simulation propagate to relatively large changes later in the simulation. We expect that handling such “history-dependent” sensitivities will be a major challenge in the future development of dynamic metabolic-modeling techniques. PMID:23209390

  9. Evidence for Alteration of Gene Regulatory Networks through MicroRNAs of the HIV-infected brain: novel analysis of retrospective cases.

    PubMed

    Tatro, Erick T; Scott, Erick R; Nguyen, Timothy B; Salaria, Shahid; Banerjee, Sugato; Moore, David J; Masliah, Eliezer; Achim, Cristian L; Everall, Ian P

    2010-04-26

    HIV infection disturbs the central nervous system (CNS) through inflammation and glial activation. Evidence suggests roles for microRNA (miRNA) in host defense and neuronal homeostasis, though little is known about miRNAs' role in HIV CNS infection. MiRNAs are non-coding RNAs that regulate gene translation through post-transcriptional mechanisms. Messenger-RNA profiling alone is insufficient to elucidate the dynamic dance of molecular expression of the genome. We sought to clarify RNA alterations in the frontal cortex (FC) of HIV-infected individuals and those concurrently infected and diagnosed with major depressive disorder (MDD). This report is the first published study of large-scale miRNA profiling from human HIV-infected FC. The goals of this study were to: 1. Identify changes in miRNA expression that occurred in the frontal cortex (FC) of HIV individuals, 2. Determine whether miRNA expression profiles of the FC could differentiate HIV from HIV/MDD, and 3. Adapt a method to meaningfully integrate gene expression data and miRNA expression data in clinical samples. We isolated RNA from the FC (n = 3) of three separate groups (uninfected controls, HIV, and HIV/MDD) and then pooled the RNA within each group for use in large-scale miRNA profiling. RNA from HIV and HIV/MDD patients (n = 4 per group) were also used for non-pooled mRNA analysis on Affymetrix U133 Plus 2.0 arrays. We then utilized a method for integrating the two datasets in a Target Bias Analysis. We found miRNAs of three types: A) Those with many dysregulated mRNA targets of less stringent statistical significance, B) Fewer dysregulated target-genes of highly stringent statistical significance, and C) unclear bias. In HIV/MDD, more miRNAs were downregulated than in HIV alone. Specific miRNA families at targeted chromosomal loci were dysregulated. The dysregulated miRNAs clustered on Chromosomes 14, 17, 19, and X. 
    A small subset of dysregulated genes had many 3' untranslated region (3'UTR) target-sites for dysregulated miRNAs. We provide evidence that certain miRNAs serve as key elements in gene regulatory networks in HIV-infected FC and may be implicated in neurobehavioral disorder. Finally, our data indicate that some genes may serve as hubs of miRNA activity.

  10. Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals

    PubMed Central

    Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J

    2006-01-01

    Background: The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results: Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags that occurred more than once. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin.
    Conclusion: This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958

  11. LASSIM-A network inference toolbox for genome-wide mechanistic modeling.

    PubMed

    Magnusson, Rasmus; Mariotti, Guido Pio; Köpsén, Mattias; Lövfors, William; Gawel, Danuta R; Jörnsten, Rebecka; Linde, Jörg; Nordling, Torbjörn E M; Nyman, Elin; Schulze, Sylvie; Nestor, Colm E; Zhang, Huan; Cedersund, Gunnar; Benson, Mikael; Tjärnberg, Andreas; Gustafsson, Mika

    2017-06-01

    Recent technological advancements have made time-resolved, quantitative, multi-omics data available for many model systems, which could be integrated for systems pharmacokinetic use. Here, we present large-scale simulation modeling (LASSIM), which is a novel mathematical tool for performing large-scale inference using mechanistically defined ordinary differential equations (ODE) for gene regulatory networks (GRNs). LASSIM integrates structural knowledge about regulatory interactions and non-linear equations with multiple steady state and dynamic response expression datasets. The rationale behind LASSIM is that biological GRNs can be simplified using a limited subset of core genes that are assumed to regulate all other gene transcription events in the network. The LASSIM method is implemented as a general-purpose toolbox using the PyGMO Python package to make the most of multicore computers and high performance clusters, and is available at https://gitlab.com/Gustafsson-lab/lassim. As a method, LASSIM works in two steps, where it first infers a non-linear ODE system of the pre-specified core gene expression. Second, LASSIM in parallel optimizes the parameters that model the regulation of peripheral genes by core system genes. We showed the usefulness of this method by applying LASSIM to infer a large-scale non-linear model of naïve Th2 cell differentiation, made possible by integrating Th2 specific bindings, time-series together with six public and six novel siRNA-mediated knock-down experiments. ChIP-seq showed significant overlap for all tested transcription factors. Next, we performed novel time-series measurements of total T-cells during differentiation towards Th2 and verified that our LASSIM model could monitor those data significantly better than comparable models that used the same Th2 bindings. 
    In summary, the LASSIM toolbox opens the door to a new type of model-based data analysis that combines the strengths of reliable mechanistic models with truly systems-level data. We demonstrate the power of this approach by inferring a mechanistically motivated, genome-wide model of the Th2 transcription regulatory system, which plays an important role in several immune related diseases.

  12. Exploring candidate biomarkers for lung and prostate cancers using gene expression and flux variability analysis.

    PubMed

    Asgari, Yazdan; Khosravi, Pegah; Zabihinpour, Zahra; Habibi, Mahnaz

    2018-02-19

    Genome-scale metabolic models have provided valuable resources for exploring changes in metabolism under normal and cancer conditions. However, metabolism itself is strongly linked to gene expression, so integration of gene expression data into metabolic models might improve the detection of genes involved in the control of tumor progression. Herein, we considered gene expression data as extra constraints to enhance the predictive powers of metabolic models. We reconstructed genome-scale metabolic models for lung and prostate, under normal and cancer conditions to detect the major genes associated with critical subsystems during tumor development. Furthermore, we utilized gene expression data in combination with an information theory-based approach to reconstruct co-expression networks of the human lung and prostate in both cohorts. Our results revealed 19 genes as candidate biomarkers for lung and prostate cancer cells. This study also revealed that the development of a complementary approach (integration of gene expression and metabolic profiles) could lead to proposing novel biomarkers and suggesting renovated cancer treatment strategies which have not been possible to detect using either of the methods alone.

  13. Reverse-engineering of gene networks for regulating early blood development from single-cell measurements.

    PubMed

    Wei, Jiangyong; Hu, Xiaohua; Zou, Xiufen; Tian, Tianhai

    2017-12-28

    Recent advances in omics technologies have raised great opportunities to study large-scale regulatory networks inside the cell. In addition, single-cell experiments have measured the gene and protein activities in a large number of cells under the same experimental conditions. However, a significant challenge in computational biology and bioinformatics is how to derive quantitative information from the single-cell observations and how to develop sophisticated mathematical models to describe the dynamic properties of regulatory networks using the derived quantitative information. This work designs an integrated approach to reverse-engineer gene networks for regulating early blood development based on single-cell experimental observations. The wanderlust algorithm is initially used to develop the pseudo-trajectory for the activities of a number of genes. Since the gene expression data in the developed pseudo-trajectory show large fluctuations, we then use Gaussian process regression methods to smooth the gene expression data in order to obtain pseudo-trajectories with much smaller fluctuations. The proposed integrated framework consists of both bioinformatics algorithms to reconstruct the regulatory network and mathematical models using differential equations to describe the dynamics of gene expression. The developed approach is applied to study the network regulating early blood cell development. A graphic model is constructed for a regulatory network with forty genes and a dynamic model using differential equations is developed for a network of nine genes. Numerical results suggest that the proposed model is able to match experimental data very well. We also examine the networks with more regulatory relations and numerical results show that more regulations may exist. We test the possibility of auto-regulation but numerical simulations do not support the positive auto-regulation.
    In addition, robustness is used as an important additional criterion to select candidate networks. The research results in this work show that the developed approach is an efficient and effective method to reverse-engineer gene networks using single-cell experimental observations.
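
    The smoothing step mentioned in this abstract (Gaussian process regression applied to a fluctuating pseudo-trajectory) can be illustrated with a small, dependency-free Python sketch. The abstract does not state the kernel, the hyperparameters, or the software used, so the squared-exponential kernel, the `length_scale` and `noise` values, and the synthetic data below are all assumptions chosen only for illustration.

```python
import math


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two pseudo-time points."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_smooth(times, values, query, length_scale=1.0, noise=0.1):
    """GP posterior mean at `query`, using a constant mean equal to the sample mean."""
    mean = sum(values) / len(values)
    centred = [v - mean for v in values]
    # Covariance of the observations: kernel plus noise variance on the diagonal.
    cov = [[rbf_kernel(ti, tj, length_scale) + (noise ** 2 if i == j else 0.0)
            for j, tj in enumerate(times)]
           for i, ti in enumerate(times)]
    alpha = solve(cov, centred)
    return [mean + sum(rbf_kernel(q, t, length_scale) * a
                       for t, a in zip(times, alpha))
            for q in query]


def roughness(series):
    """Sum of squared differences between neighbouring points."""
    return sum((b - a) ** 2 for a, b in zip(series, series[1:]))


# Synthetic pseudo-trajectory: a slow upward trend with alternating fluctuations.
times = [float(t) for t in range(10)]
raw = [0.1 * t + (0.3 if t % 2 == 0 else -0.3) for t in range(10)]
smooth = gp_smooth(times, raw, times, length_scale=3.0, noise=0.5)
print(roughness(raw), roughness(smooth))
```

    A larger `length_scale` or `noise` gives a smoother curve; in practice these values would be fitted to the data rather than fixed by hand as they are here.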

  14. Generation, Annotation, and Analysis of a Large-Scale Expressed Sequence Tag Library from Arabidopsis pumila to Explore Salt-Responsive Genes.

    PubMed

    Huang, Xianzhong; Yang, Lifei; Jin, Yuhuan; Lin, Jun; Liu, Fang

    2017-01-01

    Arabidopsis pumila is an ephemeral plant, and a close relative of the model plant Arabidopsis thaliana, but it possesses higher photosynthetic efficiency, higher propagation rate, and higher salinity tolerance compared to those of A. thaliana, thus providing a candidate plant system for gene mining for environmental adaptation and salt tolerance. However, A. pumila is an under-explored resource for understanding the genetic mechanisms underlying abiotic stress adaptation. To improve our understanding of the molecular and genetic mechanisms of salt stress adaptation, more than 19,900 clones randomly selected from a cDNA library constructed previously from leaf tissue exposed to high-salinity shock were sequenced. A total of 16,014 high-quality expressed sequence tags (ESTs) were generated, which have been deposited in the dbEST GenBank under accession numbers JZ932319 to JZ948332. Clustering and assembly of these ESTs resulted in the identification of 8,835 unique sequences, consisting of 2,469 contigs and 6,366 singletons. The blastx results revealed 8,011 unigenes with significant similarity to known genes, while only 425 unigenes remained uncharacterized. Functional classification demonstrated an abundance of unigenes involved in binding, catalytic, structural or transporter activities, and in pathways of energy, carbohydrate, amino acid, or lipid metabolism. At least seven main classes of genes were related to salt-tolerance among the 8,835 unigenes. Many previously reported salt tolerance genes were also represented in this library, for example VP1, H+-ATPase, NHX1, SOS2, SOS3, NAC, MYB, ERF, LEA, P5CS1. In addition, 251 transcription factors were identified from the library, classified into 42 families. Lastly, changes in expression of the 12 most abundant unigenes, 12 transcription factor genes, and 19 stress-related genes in the first 24 h of exposure to high-salinity stress conditions were monitored by qRT-PCR.
    The large-scale EST library obtained in this study provides first-hand information on gene sequences expressed in young leaves of A. pumila exposed to salt shock. The rapid discovery of known or unknown genes related to salinity stress response in A. pumila will facilitate the understanding of complex adaptive mechanisms for ephemerals.
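
    The counts quoted in this abstract can be cross-checked with simple arithmetic. The short Python sketch below assumes that the accession range JZ932319 to JZ948332 is contiguous and inclusive; the helper name `accession_count` is made up for this example.

```python
def accession_count(first, last, prefix="JZ"):
    """Number of accessions in an inclusive, contiguous range such as JZ932319..JZ948332."""
    return int(last[len(prefix):]) - int(first[len(prefix):]) + 1


# Figures taken from the abstract.
ests = accession_count("JZ932319", "JZ948332")
contigs, singletons = 2469, 6366
unique_sequences = contigs + singletons

print(ests)              # 16014, matching the reported 16,014 ESTs
print(unique_sequences)  # 8835, matching the reported 8,835 unique sequences

# Share of ESTs that collapsed into an already-seen sequence during assembly.
redundancy = 1 - unique_sequences / ests
print(round(redundancy, 3))
```

    Both reported totals are internally consistent under that assumption.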

  15. Versatile Gene-Specific Sequence Tags for Arabidopsis Functional Genomics: Transcript Profiling and Reverse Genetics Applications

    PubMed Central

    Hilson, Pierre; Allemeersch, Joke; Altmann, Thomas; Aubourg, Sébastien; Avon, Alexandra; Beynon, Jim; Bhalerao, Rishikesh P.; Bitton, Frédérique; Caboche, Michel; Cannoot, Bernard; Chardakov, Vasil; Cognet-Holliger, Cécile; Colot, Vincent; Crowe, Mark; Darimont, Caroline; Durinck, Steffen; Eickhoff, Holger; de Longevialle, Andéol Falcon; Farmer, Edward E.; Grant, Murray; Kuiper, Martin T.R.; Lehrach, Hans; Léon, Céline; Leyva, Antonio; Lundeberg, Joakim; Lurin, Claire; Moreau, Yves; Nietfeld, Wilfried; Paz-Ares, Javier; Reymond, Philippe; Rouzé, Pierre; Sandberg, Goran; Segura, Maria Dolores; Serizet, Carine; Tabrett, Alexandra; Taconnat, Ludivine; Thareau, Vincent; Van Hummelen, Paul; Vercruysse, Steven; Vuylsteke, Marnik; Weingartner, Magdalena; Weisbeek, Peter J.; Wirta, Valtteri; Wittink, Floyd R.A.; Zabeau, Marc; Small, Ian

    2004-01-01

    Microarray transcript profiling and RNA interference are two new technologies crucial for large-scale gene function studies in multicellular eukaryotes. Both rely on sequence-specific hybridization between complementary nucleic acid strands, which prompted us to create a collection of gene-specific sequence tags (GSTs) that represent at least 21,500 Arabidopsis genes and are compatible with both approaches. The GSTs were carefully selected to ensure that each of them shared no significant similarity with any other region in the Arabidopsis genome. They were synthesized by PCR amplification from genomic DNA. Spotted microarrays fabricated from the GSTs show good dynamic range, specificity, and sensitivity in transcript profiling experiments. The GSTs have also been transferred to bacterial plasmid vectors via recombinational cloning protocols. These cloned GSTs constitute the ideal starting point for a variety of functional approaches, including reverse genetics. We have subcloned GSTs on a large scale into vectors designed for gene silencing in plant cells. We show that in planta expression of GST hairpin RNA results in the expected phenotypes in silenced Arabidopsis lines. These versatile GST resources provide novel and powerful tools for functional genomics. PMID:15489341

  16. A genome-scale map of expression for a mouse brain section obtained using voxelation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chin, Mark H.; Geng, Alex B.; Khan, Arshad H.

    Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological diseases. We have reconstructed 2-dimensional images of gene expression for 20,000 genes in a coronal slice of the mouse brain at the level of the striatum by using microarrays in combination with voxelation at a resolution of 1 mm³. Good reliability of the microarray results was confirmed using multiple replicates, subsequent quantitative RT-PCR voxelation, mass spectrometry voxelation and publicly available in situ hybridization data. Known and novel genes were identified with expression patterns localized to defined substructures within the brain. In addition, genes with unexpected patterns were identified and cluster analysis identified a set of genes with a gradient of dorsal/ventral expression not restricted to known anatomical boundaries. The genome-scale maps of gene expression obtained using voxelation will be a valuable tool for the neuroscience community.

  17. Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks.

    PubMed

    Fogelmark, Karl; Peterson, Carsten; Troein, Carl

    2016-01-01

    Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks.

  18. Large-scale bioinformatic analysis of the regulation of the disease resistance NBS gene family by microRNAs in Poaceae.

    PubMed

    Habachi-Houimli, Yosra; Khalfallah, Yosra; Makni, Hanem; Makni, Mohamed; Bouktila, Dhia

    2016-01-01

    In the present study, we have screened 71, 713, 525, 119 and 241 mature miRNA variants from Hordeum vulgare, Oryza sativa, Brachypodium distachyon, Triticum aestivum, and Sorghum bicolor, respectively, and classified them with respect to their conservation status and expression levels. These Poaceae non-redundant miRNA species (1,669) were distributed over a total of 625 MIR families, among which only 54 were conserved across two or more plant species, confirming the relatively recent evolutionary differentiation of miRNAs in grasses. On the other hand, we have used 257 H. vulgare, 286 T. aestivum, 119 B. distachyon, 269 O. sativa, and 139 S. bicolor NBS domains, which were either mined directly from the annotated proteomes, or predicted from whole genome sequence assemblies. The hybridization potential between miRNAs and their putative NBS gene targets was analyzed, revealing that at least 454 NBS genes from all five Poaceae were potentially regulated by 265 distinct miRNA species, most of them expressed in leaves and predominantly co-expressed in additional tissues. Based on gene ontology, we could assign these probable miRNA target genes to 16 functional groups, including three conferring resistance to bacteria (Rpm1, Xa1 and Rps2) and 13 groups of resistance to fungi (Rpp8,13, Rp3, Tsn1, Lr10, Rps1-k-1, Pm3, Rpg5, and MLA1,6,10,12,13). The results of the present analysis provide a large-scale platform for a better understanding of biological control strategies of disease resistance genes in Poaceae, and will serve as an important starting point for enhancing crop disease resistance improvement by means of transgenic lines with artificial miRNAs. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
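
    The per-species figures in this abstract can be tallied to confirm the reported totals. The Python sketch below uses only numbers stated in the abstract; the variable names are arbitrary, and the NBS total is a derived sum that the abstract itself does not report.

```python
# Mature miRNA variants screened per species (from the abstract).
mirna_variants = {
    "Hordeum vulgare": 71,
    "Oryza sativa": 713,
    "Brachypodium distachyon": 525,
    "Triticum aestivum": 119,
    "Sorghum bicolor": 241,
}

# NBS domains used per species (from the abstract).
nbs_domains = {
    "Hordeum vulgare": 257,
    "Triticum aestivum": 286,
    "Brachypodium distachyon": 119,
    "Oryza sativa": 269,
    "Sorghum bicolor": 139,
}

total_mirnas = sum(mirna_variants.values())
total_nbs = sum(nbs_domains.values())

# Fraction of MIR families conserved across two or more species.
conserved_fraction = 54 / 625

print(total_mirnas)  # 1669, matching the reported 1,669 miRNA species
print(total_nbs)     # 1070 (derived here, not stated in the abstract)
print(f"{conserved_fraction:.1%}")
```

    Fewer than one in ten MIR families is shared between species, which is the basis for the abstract's remark about recent differentiation of miRNAs in grasses.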

  19. Plant Omics Data Center: an integrated web repository for interspecies gene expression networks with NLP-based curation.

    PubMed

    Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro

    2015-01-01

    Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  20. Large-scale gene-centric analysis identifies novel variants for coronary artery disease.

    PubMed

    2011-09-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3:p<10^-33; LPA:p<10^-19; 1p13.3:p<10^-17) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1:p<5×10^-7). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06-1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes.
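
    The per-allele odds ratios quoted in the abstract above can be made concrete with a small sketch. The study itself most likely estimated them from a regression model; the version below is a simplified stand-in that computes an allelic odds ratio from a 2x2 table of allele counts, with a 95% confidence interval from the standard log-based (Woolf) approximation. The function name and all counts are hypothetical and are not taken from the study.

```python
import math

def per_allele_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Allelic odds ratio and 95% Woolf confidence interval from allele counts."""
    # Odds of carrying the risk allele in cases divided by the same odds in controls.
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    log_or = math.log(odds_ratio)
    ci_lower = math.exp(log_or - 1.96 * se_log_or)
    ci_upper = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, ci_lower, ci_upper

# Hypothetical allele counts, for illustration only (not data from the abstract).
odds_ratio, ci_lower, ci_upper = per_allele_odds_ratio(1100, 900, 1000, 1000)
print(f"OR = {odds_ratio:.3f} (95% CI {ci_lower:.3f} to {ci_upper:.3f})")
```

    An odds ratio above 1 with a confidence interval that excludes 1 would indicate that the allele is more common among cases than among controls.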

  1. Micro-scale and meso-scale architectural cues cooperate and compete to direct aligned tissue formation

    PubMed Central

    Gilchrist, Christopher L.; Ruch, David S.; Little, Dianne; Guilak, Farshid

    2014-01-01

    Tissue and biomaterial microenvironments provide architectural cues that direct important cell behaviors including cell shape, alignment, migration, and resulting tissue formation. These architectural features may be presented to cells across multiple length scales, from nanometers to millimeters in size. In this study, we examined how architectural cues at two distinctly different length scales, “micro-scale” cues on the order of ~1–2 μm, and “meso-scale” cues several orders of magnitude larger (>100 μm), interact to direct aligned neo-tissue formation. Utilizing a micro-photopatterning (μPP) model system to precisely arrange cell-adhesive patterns, we examined the effects of substrate architecture at these length scales on human mesenchymal stem cell (hMSC) organization, gene expression, and fibrillar collagen deposition. Both micro- and meso-scale architectures directed cell alignment and resulting tissue organization, and when combined, meso cues could enhance or compete against micro-scale cues. As meso boundary aspect ratios were increased, meso-scale cues overrode micro-scale cues and controlled tissue alignment, with a characteristic critical width (~500 μm) similar to boundary dimensions that exist in vivo in highly aligned tissues. Meso-scale cues acted via both lateral confinement (in a cell-density-dependent manner) and by permitting end-to-end cell arrangements that yielded greater fibrillar collagen deposition. Despite large differences in fibrillar collagen content and organization between μPP architectural conditions, these changes did not correspond with changes in gene expression of key matrix or tendon-related genes. These findings highlight the complex interplay between geometric cues at multiple length scales and may have implications for tissue engineering strategies, where scaffold designs that incorporate cues at multiple length scales could improve neo-tissue organization and resulting functional outcomes. PMID:25263687

  2. A genome-wide inducible phenotypic screen identifies antisense RNA constructs silencing Escherichia coli essential genes.

    PubMed

    Meng, Jia; Kanzaki, Gregory; Meas, Diane; Lam, Christopher K; Crummer, Heather; Tain, Justina; Xu, H Howard

    2012-04-01

    Regulated antisense RNA (asRNA) expression has been employed successfully in Gram-positive bacteria for genome-wide essential gene identification and drug target determination. However, there have been no published reports describing the application of asRNA gene silencing for comprehensive analyses of essential genes in Gram-negative bacteria. In this study, we report the first genome-wide identification of asRNA constructs for essential genes in Escherichia coli. We screened 250 000 library transformants for conditional growth inhibitory recombinant clones from two shotgun genomic libraries of E. coli using a paired-termini expression vector (pHN678). After sequencing plasmid inserts of 675 confirmed inducer sensitive cell clones, we identified 152 separate asRNA constructs of which 134 inserts came from essential genes, while 18 originated from nonessential genes (but share operons with essential genes). Among the 79 individual essential genes silenced by these asRNA constructs, 61 genes (77%) engage in processes related to protein synthesis. The cell-based assays of an asRNA clone targeting fusA (encoding elongation factor G) showed that the induced cells were sensitized 12-fold to fusidic acid, a known specific inhibitor. Our results demonstrate the utility of the paired-termini expression vector and feasibility of large-scale gene silencing in E. coli using regulated asRNA expression. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  3. An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions.

    PubMed

    van Haaften, Rachel I M; Luceri, Cristina; van Erk, Arie; Evelo, Chris T A

    2009-06-01

    Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work pointed out the need for extensive bioinformatics analyses to assess array quality before and after gene expression clustering and pathway analysis. A study focused on the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact problem that was solved with an adapted normalisation. We propose a possible point-to-point standard analysis procedure, based on a combination of clustering and data visualization, for the analysis of microarray data.

  4. Structure and vascular tissue expression of duplicated TERMINAL EAR1-like paralogues in poplar.

    PubMed

    Charon, Céline; Vivancos, Julien; Mazubert, Christelle; Paquet, Nicolas; Pilate, Gilles; Dron, Michel

    2010-02-01

    TERMINAL EAR1-like (TEL) genes encode putative RNA-binding proteins only found in land plants. Previous studies suggested that they may regulate tissue and organ initiation in Poaceae. Two TEL genes were identified in both Populus trichocarpa and the hybrid aspen Populus tremula x P. alba, named, respectively, PoptrTEL1-2 and PtaTEL1-2. The analysis of the organisation around the PoptrTEL genes in the P. trichocarpa genome and the estimation of the synonymous substitution rate for PtaTEL1-2 genes indicate that the paralogous link between these two Populus TEL genes probably results from the Salicoid large-scale gene-duplication event. Phylogenetic analyses confirmed their orthology link with the other TEL genes. The expression pattern of both PtaTEL genes appeared to be restricted to the mother cells of the plant body: leaf founder cells, leaf primordia, axillary buds and root differentiating tissues, as well as to mother cells of vascular tissues. Most interestingly, PtaTEL1-2 transcripts were found in differentiating cells of secondary xylem and phloem, but probably not in the cambium itself. Taken together, these results indicate specific expression of the TEL genes in differentiating cells controlling tissue and organ development in Populus (and other Angiosperm species).

  5. De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum)

    PubMed Central

    2011-01-01

    Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations, Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141

  6. Identification of a novel Gig2 gene family specific to non-amniote vertebrates.

    PubMed

    Zhang, Yi-Bing; Liu, Ting-Kai; Jiang, Jun; Shi, Jun; Liu, Ying; Li, Shun; Gui, Jian-Fang

    2013-01-01

    Gig2 (grass carp reovirus (GCRV)-induced gene 2) was first identified as a novel fish interferon (IFN)-stimulated gene (ISG). Overexpression of a zebrafish Gig2 gene can protect cultured fish cells from virus infection. In the present study, we identify a novel gene family comprising genes homologous to the previously characterized Gig2. EST/GSS searches and in silico cloning identify 190 Gig2 homologous genes in 51 vertebrate species ranging from lampreys to amphibians. A further large-scale search of vertebrate and invertebrate genome databases indicates that the Gig2 gene family is specific to non-amniotes including lampreys, sharks/rays, ray-finned fishes and amphibians. Phylogenetic analysis and synteny analysis reveal lineage-specific expansion of the Gig2 gene family and also provide valuable evidence for the fish-specific genome duplication (FSGD) hypothesis. Although Gig2 family proteins exhibit no significant sequence similarity to any known proteins, a typical Gig2 protein appears to consist of two conserved parts: an N-terminus that bears very low homology to the catalytic domains of poly(ADP-ribose) polymerases (PARPs), and a novel C-terminal domain that is unique to this gene family. Expression profiling of zebrafish Gig2 family genes shows that some duplicate pairs have diverged in function via acquisition of novel spatial and/or temporal expression under stress. The specificity of this gene family to non-amniotes might contribute substantially to the distinct physiology of non-amniote vertebrates.

  7. Asymmetric author-topic model for knowledge discovering of big data in toxicogenomics.

    PubMed

    Chung, Ming-Hua; Wang, Yuping; Tang, Hailin; Zou, Wen; Basinger, John; Xu, Xiaowei; Tong, Weida

    2015-01-01

    The advancement of high-throughput screening technologies facilitates the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet, researchers still rely heavily on keyword search and/or literature review to navigate the databases, and analyses are often done on a rather small scale. As a result, the rich information of a database has not been fully utilized, particularly the information embedded in the interactions between data points, which is largely ignored and buried. For the past 10 years, probabilistic topic modeling has been recognized as an effective machine learning algorithm to annotate the hidden thematic structure of massive collections of documents. The analogy between a text corpus and large-scale genomic data enables the application of text mining tools, like probabilistic topic models, to explore hidden patterns of genomic data and, by extension, altered biological functions. In this paper, we developed a generalized probabilistic topic model to analyze a toxicogenomics dataset that consists of a large volume of gene expression data from rat livers treated with drugs at multiple doses and time points. We discovered the hidden patterns in gene expression associated with the effect of doses and time points of treatment. Finally, we illustrated the ability of our model to identify evidence for a potential reduction of animal use.

  8. Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

    PubMed Central

    Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

    2016-01-01

    ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including streptothricins, borrelidin, two novel lipopeptides, and one unknown antibiotic from Streptomyces rochei Sal35. The transfer, expression, and screening of the library were all performed in a high-throughput way, so that this approach is scalable and adaptable to industrial automation for next-generation antibiotic discovery. PMID:27451447

  9. Identification of rice genes associated with cosmic-ray response via co-expression gene network analysis.

    PubMed

    Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong

    2014-05-15

    To better understand the biological systems that are affected in response to cosmic rays (CR), we conducted weighted gene co-expression network analysis using the module detection method. Using Pearson's correlation coefficient (PCC), we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.
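
    The idea of grouping probes into modules by Pearson's correlation coefficient can be sketched as follows. This is a deliberately simplified stand-in, not the weighted co-expression procedure used in the study: it applies a hard correlation threshold and takes connected components of the resulting graph as modules. The probe names, expression values, threshold of 0.9, and function names are all hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def coexpression_modules(profiles, threshold=0.9):
    """Link probes with |PCC| >= threshold; return connected components."""
    neighbours = {probe: set() for probe in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)
    modules, seen = [], set()
    for start in profiles:
        if start in seen:
            continue
        # Depth-first traversal collects every probe reachable from `start`.
        stack, module = [start], set()
        while stack:
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(neighbours[node] - module)
        seen |= module
        modules.append(sorted(module))
    return modules

# Hypothetical expression profiles across four samples (illustration only).
profiles = {
    "probe_a": [1.0, 2.0, 3.0, 4.0],
    "probe_b": [2.1, 3.9, 6.2, 8.0],
    "probe_c": [5.0, 1.0, 4.0, 2.0],
    "probe_d": [9.8, 2.1, 8.2, 3.9],
}
modules = coexpression_modules(profiles)
print(modules)
```

    Weighted approaches replace the hard threshold with a soft power transform of the correlation and cluster on a topological overlap measure, which is less sensitive to the exact cut-off chosen here.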

  10. Biochemical Diversification through Foreign Gene Expression in Bdelloid Rotifers

    PubMed Central

    Eyres, Isobel; Wang-Koh, Yuan; Lubzens, Esther; Barraclough, Timothy G.; Micklem, Gos; Tunnacliffe, Alan

    2012-01-01

    Bdelloid rotifers are microinvertebrates with unique characteristics: they have survived tens of millions of years without sexual reproduction; they withstand extreme desiccation by undergoing anhydrobiosis; and they tolerate very high levels of ionizing radiation. Recent evidence suggests that subtelomeric regions of the bdelloid genome contain sequences originating from other organisms by horizontal gene transfer (HGT), of which some are known to be transcribed. However, the extent to which foreign gene expression plays a role in bdelloid physiology is unknown. We address this in the first large scale analysis of the transcriptome of the bdelloid Adineta ricciae: cDNA libraries from hydrated and desiccated bdelloids were subjected to massively parallel sequencing and assembled transcripts compared against the UniProtKB database by blastx to identify their putative products. Of ∼29,000 matched transcripts, ∼10% were inferred from blastx matches to be horizontally acquired, mainly from eubacteria but also from fungi, protists, and algae. After allowing for possible sources of error, the rate of HGT is at least 8%–9%, a level significantly higher than other invertebrates. We verified their foreign nature by phylogenetic analysis and by demonstrating linkage of foreign genes with metazoan genes in the bdelloid genome. Approximately 80% of horizontally acquired genes expressed in bdelloids code for enzymes, and these represent 39% of enzymes in identified pathways. Many enzymes encoded by foreign genes enhance biochemistry in bdelloids compared to other metazoans, for example, by potentiating toxin degradation or generation of antioxidants and key metabolites. They also supplement, and occasionally potentially replace, existing metazoan functions. Bdelloid rotifers therefore express horizontally acquired genes on a scale unprecedented in animals, and foreign genes make a profound contribution to their metabolism. This represents a potential mechanism for ancient asexuals to adapt rapidly to changing environments and thereby persist over long evolutionary time periods in the absence of sex. PMID:23166508

  11. Large-scale functional RNAi screen in C. elegans identifies genes that regulate the dysfunction of mutant polyglutamine neurons

    PubMed Central

    2012-01-01

    Background A central goal in Huntington's disease (HD) research is to identify and prioritize candidate targets for neuroprotective intervention, which requires genome-scale information on the modifiers of early-stage neuron injury in HD. Results Here, we performed a large-scale RNA interference screen in C. elegans strains that express N-terminal huntingtin (htt) in touch receptor neurons. These neurons control the response to light touch. Their function is strongly impaired by expanded polyglutamines (128Q) as shown by the nearly complete loss of touch response in adult animals, providing an in vivo model in which to manipulate the early phases of expanded-polyQ neurotoxicity. In total, 6034 genes were examined, revealing 662 gene inactivations that either reduce or aggravate defective touch response in 128Q animals. Several genes were previously implicated in HD or neurodegenerative disease, suggesting that this screen has effectively identified candidate targets for HD. Network-based analysis emphasized a subset of high-confidence modifier genes in pathways of interest in HD including metabolic, neurodevelopmental and pro-survival pathways. Finally, 49 modifiers of 128Q-neuron dysfunction that are dysregulated in the striatum of either R/2 or CHL2 HD mice, or both, were identified. Conclusions Collectively, these results highlight the relevance to HD pathogenesis, providing novel information on the potential therapeutic targets for neuroprotection in HD. PMID:22413862

  12. Large-scale functional RNAi screen in C. elegans identifies genes that regulate the dysfunction of mutant polyglutamine neurons.

    PubMed

    Lejeune, François-Xavier; Mesrob, Lilia; Parmentier, Frédéric; Bicep, Cedric; Vazquez-Manrique, Rafael P; Parker, J Alex; Vert, Jean-Philippe; Tourette, Cendrine; Neri, Christian

    2012-03-13

    A central goal in Huntington's disease (HD) research is to identify and prioritize candidate targets for neuroprotective intervention, which requires genome-scale information on the modifiers of early-stage neuron injury in HD. Here, we performed a large-scale RNA interference screen in C. elegans strains that express N-terminal huntingtin (htt) in touch receptor neurons. These neurons control the response to light touch. Their function is strongly impaired by expanded polyglutamines (128Q) as shown by the nearly complete loss of touch response in adult animals, providing an in vivo model in which to manipulate the early phases of expanded-polyQ neurotoxicity. In total, 6034 genes were examined, revealing 662 gene inactivations that either reduce or aggravate defective touch response in 128Q animals. Several genes were previously implicated in HD or neurodegenerative disease, suggesting that this screen has effectively identified candidate targets for HD. Network-based analysis emphasized a subset of high-confidence modifier genes in pathways of interest in HD including metabolic, neurodevelopmental and pro-survival pathways. Finally, 49 modifiers of 128Q-neuron dysfunction that are dysregulated in the striatum of either R/2 or CHL2 HD mice, or both, were identified. Collectively, these results highlight the relevance to HD pathogenesis, providing novel information on the potential therapeutic targets for neuroprotection in HD. © 2012 Lejeune et al; licensee BioMed Central Ltd.

  13. Expression profiling of chickpea genes differentially regulated during a resistance response to Ascochyta rabiei.

    PubMed

    Coram, Tristan E; Pang, Edwin C K

    2006-11-01

    Using microarray technology and a set of chickpea (Cicer arietinum L.) unigenes, grasspea (Lathyrus sativus L.) expressed sequence tags (ESTs) and lentil (Lens culinaris Med.) resistance gene analogues, the ascochyta blight (Ascochyta rabiei (Pass.) L.) resistance response was studied in four chickpea genotypes, including resistant, moderately resistant, susceptible and wild relative (Cicer echinospermum L.) genotypes. The experimental system minimized environmental effects and was conducted in a reference design, in which samples from mock-inoculated controls acted as reference against post-inoculation samples. Robust data quality was achieved through the use of three biological replicates (including a dye swap), the inclusion of negative controls and strict selection criteria for differentially expressed genes, including a fold change cut-off determined by self-self hybridizations, Student's t-test and multiple testing correction (P < 0.05). Microarray observations were also validated by quantitative reverse transcriptase-polymerase chain reaction (RT-PCR). Analysis of the time-course expression patterns of 756 microarray features revealed the differential expression of 97 genes in at least one genotype at one time point. k-means clustering grouped the genes into clusters of similar observations for each genotype, and comparisons between A. rabiei-resistant and A. rabiei-susceptible genotypes revealed potential gene 'signatures' predictive of effective A. rabiei resistance. These genes included several pathogenesis-related proteins, SNAKIN2 antimicrobial peptide, proline-rich protein, disease resistance response protein DRRG49-C, environmental stress-inducible protein, leucine-zipper protein, polymorphic antigen membrane protein, Ca-binding protein and several unknown proteins. The potential involvement of these genes and their pathways of induction are discussed. This study represents the first large-scale gene expression profiling in chickpea, and future work will focus on the functional validation of the genes of interest.
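
    The selection criteria described in the abstract above (a fold change cut-off combined with a t-test and multiple testing correction at P < 0.05) can be sketched as below. The abstract does not say which correction was used, so the Benjamini-Hochberg procedure here is an assumption, as are the log2 cut-off of 1.0, the feature names, and every number in the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_features(features, log2_cutoff=1.0, alpha=0.05):
    """Keep features passing both the fold-change and adjusted p-value filters."""
    adjusted = benjamini_hochberg([p for _, _, p in features])
    return [
        name
        for (name, log2_ratio, _), q in zip(features, adjusted)
        if abs(log2_ratio) >= log2_cutoff and q < alpha
    ]

# Hypothetical (name, log2 ratio, raw t-test p-value) triples, illustration only.
features = [
    ("feature_1", 2.3, 0.001),
    ("feature_2", 0.4, 0.002),
    ("feature_3", -1.8, 0.02),
    ("feature_4", 1.5, 0.30),
]
adjusted = benjamini_hochberg([p for _, _, p in features])
selected = select_features(features)
print(selected)
```

    Requiring both filters means that a feature with a small p-value but a negligible fold change (feature_2 here) is not reported, and neither is a large fold change that fails the corrected significance test (feature_4).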

  14. Still acting green: continued expression of photosynthetic genes in the heterotrophic Dinoflagellate Pfiesteria piscicida (Peridiniales, Alveolata).

    PubMed

    Kim, Gwang Hoon; Jeong, Hae Jin; Yoo, Yeong Du; Kim, Sunju; Han, Ji Hee; Han, Jong Won; Zuccarello, Giuseppe C

    2013-01-01

    The loss of photosynthetic function should lead to the cessation of expression and finally loss of photosynthetic genes in the new heterotroph. Dinoflagellates are known to have lost their photosynthetic ability several times. Dinoflagellates have also acquired photosynthesis from other organisms, either on a long-term basis or as "kleptoplastids", multiple times. The fate of photosynthetic gene expression in heterotrophs can be informative about the evolution of gene expression patterns after functional loss, and about the dinoflagellates' ability to acquire new photosynthetic function through additional endosymbiosis. To explore this, we analyzed a large-scale EST database consisting of 151,091 unique sequences (29,170 contigs, 120,921 singletons) obtained from 454 pyrosequencing of the heterotrophic dinoflagellate Pfiesteria piscicida. About 597 contigs from P. piscicida showed significant homology (E-value

  15. A data mining paradigm for identifying key factors in biological processes using gene expression data.

    PubMed

    Li, Jin; Zheng, Le; Uchiyama, Akihiko; Bin, Lianghua; Mauro, Theodora M; Elias, Peter M; Pawelczyk, Tadeusz; Sakowicz-Burkiewicz, Monika; Trzeciak, Magdalena; Leung, Donald Y M; Morasso, Maria I; Yu, Peng

    2018-06-13

    A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.

  16. Identification of transcription coactivator OCA-B-dependent genes involved in antigen-dependent B cell differentiation by cDNA array analyses.

    PubMed

    Kim, Unkyu; Siegel, Rachael; Ren, Xiaodi; Gunther, Cary S; Gaasterland, Terry; Roeder, Robert G

    2003-07-22

    The tissue-specific transcriptional coactivator OCA-B is required for antigen-dependent B cell differentiation events, including germinal center formation. However, the identity of OCA-B target genes involved in this process is unknown. This study has used large-scale cDNA arrays to monitor changes in gene expression patterns that accompany mature B cell differentiation. B cell receptor ligation alone induces many genes involved in B cell expansion, whereas B cell receptor and helper T cell costimulation induce genes associated with B cell effector function. OCA-B expression is induced by both B cell receptor ligation alone and helper T cell costimulation, suggesting that OCA-B is involved in B cell expansion as well as B cell function. Accordingly, several genes involved in cell proliferation and signaling, such as Lck, Kcnn4, Cdc37, cyclin D3, B4galt1, and Ms4a11, have been identified as OCA-B-dependent genes. Further studies on the roles played by these genes in B cells will contribute to an understanding of B cell differentiation.

  17. CPM Is a Useful Cell Surface Marker to Isolate Expandable Bi-Potential Liver Progenitor Cells Derived from Human iPS Cells.

    PubMed

    Kido, Taketomo; Koui, Yuta; Suzuki, Kaori; Kobayashi, Ayaka; Miura, Yasushi; Chern, Edward Y; Tanaka, Minoru; Miyajima, Atsushi

    2015-10-13

    To develop a culture system for large-scale production of mature hepatocytes, liver progenitor cells (LPCs) with a high proliferation potential would be advantageous. We have found that carboxypeptidase M (CPM) is highly expressed in embryonic LPCs, hepatoblasts, while its expression is decreased along with hepatic maturation. Consistently, CPM expression was transiently induced during hepatic specification from human-induced pluripotent stem cells (hiPSCs). CPM(+) cells isolated from differentiated hiPSCs at the immature hepatocyte stage proliferated extensively in vitro and expressed a set of genes that were typical of hepatoblasts. Moreover, the CPM(+) cells exhibited a mature hepatocyte phenotype after induction of hepatic maturation and also underwent cholangiocytic differentiation in a three-dimensional culture system. These results indicated that hiPSC-derived CPM(+) cells share the characteristics of LPCs, with the potential to proliferate and differentiate bi-directionally. Thus, CPM is a useful marker for isolating hiPSC-derived LPCs, which allows development of a large-scale culture system for producing hepatocytes and cholangiocytes. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  18. CPM Is a Useful Cell Surface Marker to Isolate Expandable Bi-Potential Liver Progenitor Cells Derived from Human iPS Cells

    PubMed Central

    Kido, Taketomo; Koui, Yuta; Suzuki, Kaori; Kobayashi, Ayaka; Miura, Yasushi; Chern, Edward Y.; Tanaka, Minoru; Miyajima, Atsushi

    2015-01-01

    Summary To develop a culture system for large-scale production of mature hepatocytes, liver progenitor cells (LPCs) with a high proliferation potential would be advantageous. We have found that carboxypeptidase M (CPM) is highly expressed in embryonic LPCs, hepatoblasts, while its expression is decreased along with hepatic maturation. Consistently, CPM expression was transiently induced during hepatic specification from human-induced pluripotent stem cells (hiPSCs). CPM+ cells isolated from differentiated hiPSCs at the immature hepatocyte stage proliferated extensively in vitro and expressed a set of genes that were typical of hepatoblasts. Moreover, the CPM+ cells exhibited a mature hepatocyte phenotype after induction of hepatic maturation and also underwent cholangiocytic differentiation in a three-dimensional culture system. These results indicated that hiPSC-derived CPM+ cells share the characteristics of LPCs, with the potential to proliferate and differentiate bi-directionally. Thus, CPM is a useful marker for isolating hiPSC-derived LPCs, which allows development of a large-scale culture system for producing hepatocytes and cholangiocytes. PMID:26365514

  19. Sequential and combinatorial roles of maf family genes define proper lens development.

    PubMed

    Reza, Hasan Mahmud; Urano, Atsuyo; Shimada, Naoko; Yasuda, Kunio

    2007-01-16

    Maf proteins have been shown to play pivotal roles in lens development in vertebrates. The developing chick lens expresses at least three large Maf proteins. However, the transcriptional relationship among the three large maf genes and their various roles in transactivating the downstream genes largely remain to be elucidated. Chick embryos were electroporated with wild-type L-maf, c-maf, and mafB by in ovo electroporation, and their effects on gene expression were determined by in situ hybridization using specific probes or by immunostaining. Endogenous gene expression was determined using nonelectroporated samples. A regulatory mechanism exists among the members of the maf gene family. An early-expressed member of this gene family typically stimulates the expression of later-expressed members. We also examined the regulation of various lens-expressing genes with a focus on the interaction between different Maf proteins. We found that the transcriptional ability of Maf proteins varies, even when the target is the same, in parallel with their discrete functions. L-Maf and c-Maf have no effect on E-cadherin expression, whereas MafB enhances its expression and thereby impedes lens vesicle formation. This study also revealed that Maf proteins can regulate the expression of gap junction genes, connexins, and their interacting partner, major intrinsic protein (MIP), during lens development. Misexpression of L-Maf and c-Maf induces ectopic expression of Cx43 and MIP; in contrast, MafB appears to have no effect on Cx43, but induces MIP significantly, as evidenced by our gain-of-function experiments. Our results indicate that large Maf function is indispensable for chick lens initiation and development. In addition, L-Maf positively regulates most of the essential genes in this program and directs a series of molecular events leading to proper formation of the lens.

  20. Coordinated Gene Expression of Neuroinflammatory and Cell Signaling Markers in Dorsolateral Prefrontal Cortex during Human Brain Development and Aging

    PubMed Central

    Primiani, Christopher T.; Ryan, Veronica H.; Rao, Jagadeesh S.; Cam, Margaret C.; Ahn, Kwangmi; Modi, Hiren R.; Rapoport, Stanley I.

    2014-01-01

    Background: Age changes in expression of inflammatory, synaptic, and neurotrophic genes are not well characterized during human brain development and senescence. Knowing these changes may elucidate structural, metabolic, and functional brain processes over the lifespan, as well as vulnerability to neurodevelopmental or neurodegenerative diseases. Hypothesis: Expression levels of inflammatory, synaptic, and neurotrophic genes in the human brain are coordinated over the lifespan and underlie changes in phenotypic networks or cascades. Methods: We used a large-scale microarray dataset from human prefrontal cortex, BrainCloud, to quantify age changes over the lifespan, divided into Development (0 to 21 years, 87 brains) and Aging (22 to 78 years, 144 brains) intervals, in transcription levels of 39 genes. Results: Gene expression levels followed different trajectories over the lifespan. Many changes were intercorrelated within three similar groups or clusters of genes during both Development and Aging, despite different roles of the gene products in the two intervals. During Development, changes were related to reported neuronal loss, dendritic growth and pruning, and microglial events; TLR4, IL1R1, NFKB1, MOBP, PLA2G4A, and PTGS2 expression increased in the first years of life, while expression of synaptic genes GAP43 and DBN1 decreased, before reaching plateaus. During Aging, expression was upregulated for potentially pro-inflammatory genes such as NFKB1, TRAF6, TLR4, IL1R1, TSPO, and GFAP, but downregulated for neurotrophic and synaptic integrity genes such as BDNF, NGF, PDGFA, SYN, and DBN1. Conclusions: Coordinated changes in gene transcription cascades underlie changes in synaptic, neurotrophic, and inflammatory phenotypic networks during brain Development and Aging. Early postnatal expression changes relate to neuronal, glial, and myelin growth and synaptic pruning events, while late Aging is associated with pro-inflammatory and synaptic loss changes. Thus, comparable transcriptional regulatory networks that operate throughout the lifespan underlie different phenotypic processes during Aging compared to Development. PMID:25329999

  1. Coordinated gene expression of neuroinflammatory and cell signaling markers in dorsolateral prefrontal cortex during human brain development and aging.

    PubMed

    Primiani, Christopher T; Ryan, Veronica H; Rao, Jagadeesh S; Cam, Margaret C; Ahn, Kwangmi; Modi, Hiren R; Rapoport, Stanley I

    2014-01-01

    Age changes in expression of inflammatory, synaptic, and neurotrophic genes are not well characterized during human brain development and senescence. Knowing these changes may elucidate structural, metabolic, and functional brain processes over the lifespan, as well as vulnerability to neurodevelopmental or neurodegenerative diseases. Expression levels of inflammatory, synaptic, and neurotrophic genes in the human brain are coordinated over the lifespan and underlie changes in phenotypic networks or cascades. We used a large-scale microarray dataset from human prefrontal cortex, BrainCloud, to quantify age changes over the lifespan, divided into Development (0 to 21 years, 87 brains) and Aging (22 to 78 years, 144 brains) intervals, in transcription levels of 39 genes. Gene expression levels followed different trajectories over the lifespan. Many changes were intercorrelated within three similar groups or clusters of genes during both Development and Aging, despite different roles of the gene products in the two intervals. During Development, changes were related to reported neuronal loss, dendritic growth and pruning, and microglial events; TLR4, IL1R1, NFKB1, MOBP, PLA2G4A, and PTGS2 expression increased in the first years of life, while expression of synaptic genes GAP43 and DBN1 decreased, before reaching plateaus. During Aging, expression was upregulated for potentially pro-inflammatory genes such as NFKB1, TRAF6, TLR4, IL1R1, TSPO, and GFAP, but downregulated for neurotrophic and synaptic integrity genes such as BDNF, NGF, PDGFA, SYN, and DBN1. Coordinated changes in gene transcription cascades underlie changes in synaptic, neurotrophic, and inflammatory phenotypic networks during brain Development and Aging. Early postnatal expression changes relate to neuronal, glial, and myelin growth and synaptic pruning events, while late Aging is associated with pro-inflammatory and synaptic loss changes. Thus, comparable transcriptional regulatory networks that operate throughout the lifespan underlie different phenotypic processes during Aging compared to Development.

  2. IL-1beta, but not BMP-7 leads to a dramatic change in the gene expression pattern of human adult articular chondrocytes--portraying the gene expression pattern in two donors.

    PubMed

    Saas, J; Haag, J; Rueger, D; Chubinskaya, S; Sohler, F; Zimmer, R; Bartnik, E; Aigner, T

    2006-10-01

    Anabolic and catabolic cytokines and growth factors such as BMP-7 and IL-1beta play a central role in controlling the balance between degradation and repair of normal and (osteo)arthritic articular cartilage matrix. In this report, we investigated the response of articular chondrocytes to IL-1beta and BMP-7 in terms of changes in gene expression levels. Large-scale analysis was performed on primary human adult articular chondrocytes isolated from two independent human donors and cultured in alginate beads (non-stimulated and stimulated with IL-1beta and BMP-7 for 48 h) using Affymetrix gene chips (oligo-arrays). Biostatistical and bioinformatic evaluation of the gene expression pattern was performed using the Resolver software (Rosetta). Part of the results were confirmed using real-time PCR. IL-1beta significantly modulated 909 of 3459 detectable genes, whereas BMP-7 influenced only 36 of 3440. BMP-7 induced mainly anabolic activation of chondrocytes, including classical target genes such as collagen type II and aggrecan, while IL-1beta significantly modulated the expression levels of numerous genes in both directions; namely, IL-1beta down-regulated the expression of anabolic genes and induced catabolic genes and mediators. Our data indicate that BMP-7 has only a limited effect on differentiated cells, whereas IL-1beta causes a dramatic change in gene expression pattern, i.e. it induced or repressed many more genes. This presumably reflects the fact that BMP-7 signaling is effected via one pathway only (i.e. the Smad pathway), whereas IL-1beta is able to signal via a broad variety of intracellular signaling cascades involving the JNK, p38, NFkB and Erk pathways and even influencing BMP signaling.

  3. Development of a versatile enrichment analysis tool reveals associations between the maternal brain and mental health disorders, including autism

    PubMed Central

    2013-01-01

    Background: A recent study of lateral septum (LS) suggested a large number of autism-related genes with altered expression in the postpartum state. However, formally testing the findings for enrichment of autism-associated genes proved to be problematic with existing software. Many gene-disease association databases have been curated which are not currently incorporated in popular, full-featured enrichment tools, and the use of custom gene lists in these programs can be difficult to perform and interpret. As a simple alternative, we have developed the Modular Single-set Enrichment Test (MSET), a minimal tool that enables one to easily evaluate expression data for enrichment of any conceivable gene list of interest. Results: The MSET approach was validated by testing several publicly available expression data sets for expected enrichment in areas of autism, attention deficit hyperactivity disorder (ADHD), and arthritis. Using nine independent, unique autism gene lists extracted from association databases and two recent publications, a striking consensus of enrichment was detected within gene expression changes in LS of postpartum mice. A network of 160 autism-related genes was identified, representing developmental processes such as synaptic plasticity, neuronal morphogenesis, and differentiation. Additionally, maternal LS displayed enrichment for genes associated with bipolar disorder, schizophrenia, ADHD, and depression. Conclusions: The transition to motherhood includes the most fundamental social bonding event in mammals and features naturally occurring changes in sociability. Some individuals with autism, schizophrenia, or other mental health disorders exhibit impaired social traits. Genes involved in these deficits may also contribute to elevated sociability in the maternal brain. To date, this is the first study to show a significant, quantitative link between the maternal brain and mental health disorders using large scale gene expression data. Thus, the postpartum brain may provide a novel and promising platform for understanding the complex genetics of improved sociability that may have direct relevance for multiple psychiatric illnesses. This study also provides an important new tool that fills a critical analysis gap and makes evaluation of enrichment using any database of interest possible with an emphasis on ease of use and methodological transparency. PMID:24245670

  4. Development of a versatile enrichment analysis tool reveals associations between the maternal brain and mental health disorders, including autism.

    PubMed

    Eisinger, Brian E; Saul, Michael C; Driessen, Terri M; Gammie, Stephen C

    2013-11-19

    A recent study of lateral septum (LS) suggested a large number of autism-related genes with altered expression in the postpartum state. However, formally testing the findings for enrichment of autism-associated genes proved to be problematic with existing software. Many gene-disease association databases have been curated which are not currently incorporated in popular, full-featured enrichment tools, and the use of custom gene lists in these programs can be difficult to perform and interpret. As a simple alternative, we have developed the Modular Single-set Enrichment Test (MSET), a minimal tool that enables one to easily evaluate expression data for enrichment of any conceivable gene list of interest. The MSET approach was validated by testing several publicly available expression data sets for expected enrichment in areas of autism, attention deficit hyperactivity disorder (ADHD), and arthritis. Using nine independent, unique autism gene lists extracted from association databases and two recent publications, a striking consensus of enrichment was detected within gene expression changes in LS of postpartum mice. A network of 160 autism-related genes was identified, representing developmental processes such as synaptic plasticity, neuronal morphogenesis, and differentiation. Additionally, maternal LS displayed enrichment for genes associated with bipolar disorder, schizophrenia, ADHD, and depression. The transition to motherhood includes the most fundamental social bonding event in mammals and features naturally occurring changes in sociability. Some individuals with autism, schizophrenia, or other mental health disorders exhibit impaired social traits. Genes involved in these deficits may also contribute to elevated sociability in the maternal brain. To date, this is the first study to show a significant, quantitative link between the maternal brain and mental health disorders using large scale gene expression data. Thus, the postpartum brain may provide a novel and promising platform for understanding the complex genetics of improved sociability that may have direct relevance for multiple psychiatric illnesses. This study also provides an important new tool that fills a critical analysis gap and makes evaluation of enrichment using any database of interest possible with an emphasis on ease of use and methodological transparency.
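
    The kind of single-set enrichment question described in this record can be illustrated with a small randomization test. The sketch below is not the MSET implementation; it assumes a simple design in which the overlap between a list of expression hits and a gene list of interest is compared with the overlap obtained from random draws of the same size from a background. The function name, parameters, and gene identifiers are all hypothetical.

```python
import random


def enrichment_p_value(expression_hits, background, gene_list, n_sim=2000, seed=0):
    """One-sided randomization test for over-representation of gene_list.

    Returns (observed_overlap, p_value). This is an illustrative stand-in,
    not the published MSET procedure.
    """
    rng = random.Random(seed)
    background = sorted(set(background))          # fixed order for reproducibility
    universe = set(background)
    gene_set = set(gene_list) & universe          # ignore genes outside the background
    hits = set(expression_hits) & universe
    observed = len(hits & gene_set)

    at_least_as_extreme = 0
    for _ in range(n_sim):
        # Draw a random "hit list" of the same size from the background.
        sample = rng.sample(background, len(hits))
        overlap = sum(1 for gene in sample if gene in gene_set)
        if overlap >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_sim + 1)


# Toy data with made-up gene identifiers (assumption: 1000-gene background).
background = [f"g{i}" for i in range(1000)]
gene_list = [f"g{i}" for i in range(50)]                                # list of interest
hits = [f"g{i}" for i in range(20)] + [f"g{i}" for i in range(500, 520)]

observed, p = enrichment_p_value(hits, background, gene_list)
print(observed, p)
```

    In this toy case 20 of the 40 hits fall in the list of interest, while only about 2 would be expected by chance (40 × 50 / 1000), so the estimated p-value is small.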

  5. Rethinking cell-cycle-dependent gene expression in Schizosaccharomyces pombe.

    PubMed

    Cooper, Stephen

    2017-11-01

    Three studies of gene expression during the division cycle of Schizosaccharomyces pombe led to the proposal that a large number of genes are expressed at particular times during the S. pombe cell cycle. Yet only a small fraction of genes proposed to be expressed in a cell-cycle-dependent manner are reproducible in all three published studies. In addition to reproducibility problems, questions about expression amplitudes, cell-cycle timing of expression, synchronization artifacts, and the problem with methods for synchronizing cells must be considered. These problems and complications prompt the idea that caution should be used before accepting the conclusion that there are a large number of genes expressed in a cell-cycle-dependent manner in S. pombe.

  6. Divergent evolution of arrested development in the dauer stage of Caenorhabditis elegans and the infective stage of Heterodera glycines

    PubMed Central

    Elling, Axel A; Mitreva, Makedonka; Recknor, Justin; Gai, Xiaowu; Martin, John; Maier, Thomas R; McDermott, Jeffrey P; Hewezi, Tarek; McK Bird, David; Davis, Eric L; Hussey, Richard S; Nettleton, Dan; McCarter, James P; Baum, Thomas J

    2007-01-01

    Background: The soybean cyst nematode Heterodera glycines is the most important parasite in soybean production worldwide. A comprehensive analysis of large-scale gene expression changes throughout the development of plant-parasitic nematodes has been lacking to date. Results: We report an extensive genomic analysis of H. glycines, beginning with the generation of 20,100 expressed sequence tags (ESTs). In-depth analysis of these ESTs plus approximately 1,900 previously published sequences predicted 6,860 unique H. glycines genes and allowed a classification by function using InterProScan. Expression profiling of all 6,860 genes throughout the H. glycines life cycle was undertaken using the Affymetrix Soybean Genome Array GeneChip. Our data sets and results represent a comprehensive resource for molecular studies of H. glycines. Demonstrating the power of this resource, we were able to address whether arrested development in the Caenorhabditis elegans dauer larva and the H. glycines infective second-stage juvenile (J2) exhibits shared gene expression profiles. We determined that the gene expression profiles associated with the C. elegans dauer pathway are not uniformly conserved in H. glycines and that the expression profiles of genes for metabolic enzymes of C. elegans dauer larvae and H. glycines infective J2 are dissimilar. Conclusion: Our results indicate that hallmark gene expression patterns and metabolism features are not shared in the developmentally arrested life stages of C. elegans and H. glycines, suggesting that developmental arrest in these two nematode species has undergone more divergent evolution than previously thought and pointing to the need for detailed genomic analyses of individual parasite species. PMID:17919324

  7. EgoNet: identification of human disease ego-network modules

    PubMed Central

    2014-01-01

    Background: Mining novel biomarkers from gene expression profiles for accurate disease classification is challenging due to small sample size and high noise in gene expression measurements. Several studies have proposed integrated analyses of microarray data and protein-protein interaction (PPI) networks to find diagnostic subnetwork markers. However, the neighborhood relationship among network member genes has not been fully considered by those methods, leaving many potential gene markers unidentified. The main idea of this study is to take full advantage of the biological observation that genes associated with the same or similar diseases commonly reside in the same neighborhood of molecular networks. Results: We present EgoNet, a novel method based on egocentric network-analysis techniques, to exhaustively search and prioritize disease subnetworks and gene markers from a large-scale biological network. When applied to a triple-negative breast cancer (TNBC) microarray dataset, the top selected modules contain both known gene markers in TNBC and novel candidates, such as RAD51 and DOK1, which play a central role in their respective ego-networks by connecting many differentially expressed genes. Conclusions: Our results suggest that EgoNet, which is based on the ego network concept, allows the identification of novel biomarkers and provides a deeper understanding of their roles in complex diseases. PMID:24773628
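
    The ego-network idea in this record (a central gene together with its network neighborhood) can be sketched in a few lines. The abstract does not give EgoNet's actual search or scoring procedure, so the code below is only an assumed simplification: it extracts each gene's neighborhood from an undirected interaction graph and ranks egos by how many differentially expressed neighbors they connect. The graph, gene labels, and function names are hypothetical.

```python
def ego_network(adjacency, ego, radius=1):
    """Return the set of nodes within `radius` hops of `ego`, including `ego`."""
    seen = {ego}
    frontier = {ego}
    for _ in range(radius):
        next_frontier = set()
        for node in frontier:
            for neighbor in adjacency.get(node, ()):
                if neighbor not in seen:
                    next_frontier.add(neighbor)
        seen |= next_frontier
        frontier = next_frontier
    return seen


def rank_egos(adjacency, de_genes, radius=1):
    """Rank egos by the number of differentially expressed (DE) neighbors.

    Simplified stand-in score, not the scoring used by EgoNet.
    Returns a list of (ego, n_de_neighbors, n_neighbors) tuples.
    """
    scores = []
    for ego in adjacency:
        neighbors = ego_network(adjacency, ego, radius) - {ego}
        scores.append((ego, len(neighbors & de_genes), len(neighbors)))
    # Most DE neighbors first; ties broken by smaller neighborhood, then name.
    scores.sort(key=lambda item: (-item[1], item[2], item[0]))
    return scores


# Toy undirected interaction graph with placeholder gene labels.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A", "E"},
    "D": {"A"},
    "E": {"C"},
}
de_genes = {"B", "C", "D"}   # assumed differentially expressed genes

ranking = rank_egos(adjacency, de_genes)
print(ranking[0])
```

    Here gene "A" ranks first because all three of its direct neighbors are differentially expressed, mirroring the abstract's description of candidates that are central in their ego-networks.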

  8. RD2-MolPack-Chim3, a packaging cell line for stable production of lentiviral vectors for anti-HIV gene therapy.

    PubMed

    Stornaiuolo, Anna; Piovani, Bianca Maria; Bossi, Sergio; Zucchelli, Eleonora; Corna, Stefano; Salvatori, Francesca; Mavilio, Fulvio; Bordignon, Claudio; Rizzardi, Gian Paolo; Bovolenta, Chiara

    2013-08-01

    Over the last two decades, several attempts to generate packaging cells for lentiviral vectors (LV) have been made. Despite different technologies, no packaging clone is currently employed in clinical trials. We developed a new strategy for stable LV production based on the HEK-293T progenitor cells; the sequential insertion of the viral genes by integrating vectors; the constitutive expression of the viral components; and the RD114-TR envelope pseudotyping. We generated the intermediate clone PK-7, which constitutively expresses the gag/pol and rev genes, and, by adding tat and rd114-tr genes, the stable packaging cell line RD2-MolPack, which can produce LV carrying any transfer vector (TV). Finally, we obtained the RD2-MolPack-Chim3 producer clone by transducing RD2-MolPack cells with the TV expressing the anti-HIV transgene Chim3. Remarkably, RD114-TR pseudovirions have much higher potency when produced by stable compared with transient technology. Most importantly, comparable transduction efficiency in hematopoietic stem cells (HSC) is obtained with 2 logs fewer physical particles than with VSV-G pseudovirions produced by transient transfection. Altogether, RD2-MolPack technology should be considered a valid option for large-scale production of LV to be used in gene therapy protocols employing HSC, resulting in the possibility of downsizing the manufacturing scale by about 10-fold with respect to transient technology.

  9. Gene expression-based detection of radiation exposure in mice after treatment with granulocyte colony-stimulating factor and lipopolysaccharide.

    PubMed

    Tucker, James D; Grever, William E; Joiner, Michael C; Konski, Andre A; Thomas, Robert A; Smolinski, Joseph M; Divine, George W; Auner, Gregory W

    2012-02-01

    In a large-scale nuclear incident, many thousands of people may be exposed to a wide range of radiation doses. Rapid biological dosimetry will be required on an individualized basis to estimate the exposures and to make treatment decisions. To ameliorate the adverse effects of exposure, victims may be treated with one or more cytokine growth factors, including granulocyte colony-stimulating factor (G-CSF), which has therapeutic efficacy for treating radiation-induced bone marrow ablation by stimulating granulopoiesis. The existence of infections and the administration of G-CSF each may confound the ability to achieve reliable dosimetry by gene expression analysis. In this study, C57BL/6 mice were used to determine the extent to which G-CSF and lipopolysaccharide (LPS, which simulates infection by gram-negative bacteria) alter the expression of genes that are either radiation-responsive or non-responsive, i.e., show potential for use as endogenous controls. Mice were acutely exposed to (60)Co γ rays at either 0 Gy or 6 Gy. Two hours later the animals were injected with either 0.1 mg/kg of G-CSF or 0.3 mg/kg of LPS. Expression levels of 96 different gene targets were evaluated in peripheral blood after an additional 4 or 24 h using real-time quantitative PCR. The results indicate that the expression levels of some genes are altered by LPS, but altered expression after G-CSF treatment was generally not observed. The expression levels of many genes therefore retain utility for biological dosimetry or as endogenous controls. These data suggest that PCR-based quantitative gene expression analyses may have utility in radiation biodosimetry in humans even in the presence of an infection or after treatment with G-CSF.

  10. Orthogonal control of expression mean and variance by epigenetic features at different genomic loci

    DOE PAGES

    Dey, Siddharth S.; Foley, Jonathan E.; Limsirichai, Prajit; ...

    2015-05-05

    While gene expression noise has been shown to drive dramatic phenotypic variations, the molecular basis for this variability in mammalian systems is not well understood. Gene expression has been shown to be regulated by promoter architecture and the associated chromatin environment. However, the exact contribution of these two factors in regulating expression noise has not been explored. Using a dual-reporter lentiviral model system, we deconvolved the influence of the promoter sequence to systematically study the contribution of the chromatin environment at different genomic locations in regulating expression noise. By integrating a large-scale analysis to quantify mRNA levels by smFISH and protein levels by flow cytometry in single cells, we found that mean expression and noise are uncorrelated across genomic locations. Furthermore, we showed that this independence could be explained by the orthogonal control of mean expression by the transcript burst size and noise by the burst frequency. Finally, we showed that genomic locations displaying higher expression noise are associated with more repressed chromatin, thereby indicating the contribution of the chromatin environment in regulating expression noise.

  11. Directed module detection in a large-scale expression compendium.

    PubMed

    Fu, Qiang; Lemmens, Karen; Sanchez-Rodriguez, Aminael; Thijs, Inge M; Meysman, Pieter; Sun, Hong; Fierro, Ana Carolina; Engelen, Kristof; Marchal, Kathleen

    2012-01-01

    Public online microarray databases contain tremendous amounts of expression data. Mining these data sources can provide a wealth of information on the underlying transcriptional networks. In this chapter, we illustrate how the web services COLOMBOS and DISTILLER can be used to identify condition-dependent coexpression modules by exploring compendia of public expression data. COLOMBOS is designed for user-specified query-driven analysis, whereas DISTILLER generates a global regulatory network overview. The user is guided through both web services by means of a case study in which condition-dependent coexpression modules comprising a gene of interest (i.e., "directed") are identified.

  12. First Transcriptome and Digital Gene Expression Analysis in Neuroptera with an Emphasis on Chemoreception Genes in Chrysopa pallens (Rambur)

    PubMed Central

    Li, Zhao-Qun; Zhang, Shuai; Ma, Yan; Luo, Jun-Yu; Wang, Chun-Yi; Lv, Li-Min; Dong, Shuang-Lin; Cui, Jin-Jie

    2013-01-01

    Background: Chrysopa pallens (Rambur) are the most important natural enemies and predators of various agricultural pests. Understanding the sophisticated olfactory system in insect antennae is crucial for studying the physiological bases of olfaction and also could lead to effective applications of C. pallens in integrated pest management. However, no transcriptome information is available for Neuroptera, and sequence data for C. pallens are scarce, so obtaining more sequence data is a priority for researchers on this species. Results: To facilitate identifying sets of genes involved in olfaction, a normalized transcriptome of C. pallens was sequenced. A total of 104,603 contigs were obtained and assembled into 10,662 clusters and 39,734 singletons; 20,524 were annotated based on BLASTX analyses. A large number of candidate chemosensory genes were identified, including 14 odorant-binding proteins (OBPs), 22 chemosensory proteins (CSPs), 16 ionotropic receptors, 14 odorant receptors, and genes potentially involved in olfactory modulation. To better understand the OBPs, CSPs and cytochrome P450s, phylogenetic trees were constructed. In addition, 10 digital gene expression libraries of different tissues were constructed and gene expression profiles were compared among different tissues in males and females. Conclusions: Our results provide a basis for exploring the mechanisms of chemoreception in C. pallens, as well as other insects. The evolutionary analyses in our study provide new insights into the differentiation and evolution of insect OBPs and CSPs. Our study provided large-scale sequence information for further studies in C. pallens. PMID:23826220

  13. Identification of Genes Uniquely Expressed in the Germ-Line Tissues of the Jewel Wasp Nasonia vitripennis

    PubMed Central

    Ferree, Patrick M.; Fang, Christopher; Mastrodimos, Mariah; Hay, Bruce A.; Amrhein, Henry; Akbari, Omar S.

    2015-01-01

    The jewel wasp Nasonia vitripennis is a rising model organism for the study of haplo-diploid reproduction characteristic of hymenopteran insects, which include all wasps, bees, and ants. We performed transcriptional profiling of the ovary, the female soma, and the male soma of N. vitripennis to complement a previously existing transcriptome of the wasp testis. These data were deposited into an open-access genome browser for visualization of transcripts relative to their gene models. We used these data to identify the assemblies of genes uniquely expressed in the germ-line tissues. We found that 156 protein-coding genes are expressed exclusively in the wasp testis compared with only 22 in the ovary. Of the testis-specific genes, eight are candidates for male-specific DNA packaging proteins known as protamines. We found very similar expression patterns of centrosome associated genes in the testis and ovary, arguing that de novo centrosome formation, a key process for development of unfertilized eggs into males, likely does not rely on large-scale transcriptional differences between these tissues. In contrast, a number of meiosis-related genes show a bias toward testis-specific expression, despite the lack of true meiosis in N. vitripennis males. These patterns may reflect an unexpected complexity of male gamete production in the haploid males of this organism. Broadly, these data add to the growing number of genomic and genetic tools available in N. vitripennis for addressing important biological questions in this rising insect model organism. PMID:26464360

  14. Fast and robust group-wise eQTL mapping using sparse graphical models.

    PubMed

    Cheng, Wei; Shi, Yu; Zhang, Xiang; Wang, Wei

    2015-01-16

    Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. The traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression traits. A major drawback of this approach is that it cannot model the joint effect of a set of SNPs on a set of genes, which may correspond to hidden biological pathways. We introduce a new approach to identify novel group-wise associations between sets of SNPs and sets of genes. Such associations are captured by hidden variables connecting SNPs and genes. Our model is a linear-Gaussian model and uses two types of hidden variables. One captures the set associations between SNPs and genes, and the other captures confounders. We develop an efficient optimization procedure which makes this approach suitable for large scale studies. Extensive experimental evaluations on both simulated and real datasets demonstrate that the proposed methods can effectively capture both individual and group-wise signals that cannot be identified by the state-of-the-art eQTL mapping methods. Considering group-wise associations significantly improves the accuracy of eQTL mapping, and the successful multi-layer regression model opens a new approach to understand how multiple SNPs interact with each other to jointly affect the expression level of a group of genes.
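
    For contrast with the group-wise model this record proposes, the "traditional" per-pair eQTL test it mentions can be sketched as a simple regression of one gene's expression on one SNP's genotype. This is only an assumed baseline illustration, not the authors' sparse graphical model; the additive 0/1/2 genotype coding, the function name, and the data values are made up for the example.

```python
from statistics import mean


def snp_gene_association(genotypes, expression):
    """Least-squares slope and Pearson correlation for one SNP-gene pair.

    genotypes: minor-allele counts per individual (0, 1, or 2; assumed coding).
    expression: expression level of one gene in the same individuals.
    """
    n = len(genotypes)
    if n != len(expression) or n < 3:
        raise ValueError("need matched samples from at least 3 individuals")

    g_bar, e_bar = mean(genotypes), mean(expression)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((e - e_bar) ** 2 for e in expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has no variation")

    slope = sxy / sxx                 # expression change per allele copy
    r = sxy / (sxx * syy) ** 0.5      # strength of the linear association
    return slope, r


# Made-up data for six individuals.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

slope, r = snp_gene_association(genotypes, expression)
print(round(slope, 3), round(r, 3))
```

    Repeating such a test over every SNP-gene pair treats each pair in isolation, which is the limitation the abstract identifies: joint effects of a set of SNPs on a set of genes are not modeled.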

  15. Prokaryotic Gene Clusters: A Rich Toolbox for Synthetic Biology

    PubMed Central

    Fischbach, Michael; Voigt, Christopher A.

    2014-01-01

    Bacteria construct elaborate nanostructures, obtain nutrients and energy from diverse sources, synthesize complex molecules, and implement signal processing to react to their environment. These complex phenotypes require the coordinated action of multiple genes, which are often encoded in a contiguous region of the genome, referred to as a gene cluster. Gene clusters sometimes contain all of the genes necessary and sufficient for a particular function. As an evolutionary mechanism, gene clusters facilitate the horizontal transfer of the complete function between species. Here, we review recent work on a number of clusters whose functions are relevant to biotechnology. Engineering these clusters has been hindered by their regulatory complexity, the need to balance the expression of many genes, and a lack of tools to design and manipulate DNA at this scale. Advances in synthetic biology will enable the large-scale bottom-up engineering of the clusters to optimize their functions, wake up cryptic clusters, or to transfer them between organisms. Understanding and manipulating gene clusters will move towards an era of genome engineering, where multiple functions can be “mixed-and-matched” to create a designer organism. PMID:21154668

  16. Titer improvement of iso-migrastatin in selected heterologous Streptomyces hosts and related analysis of mRNA expression by quantitative RT–PCR

    PubMed Central

    Yang, Dong; Zhu, Xiangcheng; Wu, Xueyun; Feng, Zhiyang; Huang, Lei; Shen, Ben; Xu, Zhinan

    2011-01-01

    iso-Migrastatin (iso-MGS) has been actively pursued recently as an outstanding candidate of antimetastasis agents. Having characterized the iso-MGS biosynthetic gene cluster from its native producer Streptomyces platensis NRRL 18993, we have recently succeeded in producing iso-MGS in five selected heterologous Streptomyces hosts, albeit the low titers failed to meet expectations and cast doubt on the utility of this novel technique for large-scale production. To further explore and capitalize on the production capacity of these hosts, a thorough investigation of these five engineered strains with three fermentation media for iso-MGS production was undertaken. Streptomyces albus J1074 and Streptomyces lividans K4-114 were found to be preferred heterologous hosts, and subsequent analysis of carbon and nitrogen sources revealed that sucrose and yeast extract were ideal for iso-MGS production. After the initial optimization, the titers of iso-MGS in all five hosts were considerably improved by 3–18-fold in the optimized R2YE medium. Furthermore, the iso-MGS titer of S. albus J1074 (pBS11001) was significantly improved to 186.7 mg/L by a hybrid medium strategy. Addition of NaHCO3 to the latter finally afforded an optimized iso-MGS titer of 213.8 mg/L, about 5-fold higher than the originally reported system. With S. albus J1074 (pBS11001) as a model host, the expression of iso-MGS gene cluster in four different media was systematically studied via the quantitative RT–PCR technology. The resultant comparison revealed the correlation of gene expression and iso-MGS production for the first time; synchronous expression of the whole gene cluster was crucial for optimal iso-MGS production. These results reveal new insights into the iso-MGS biosynthetic machinery in heterologous hosts and provide the primary data to realize large-scale production of iso-MGS for further preclinical studies. PMID:21132287

  17. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
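
    The high-quality SNP criterion in this abstract (a contig with at least four sequences, and the minor allele present in at least two of them) can be sketched as a simple filter. This is a minimal illustration and not the consortium's pipeline: the function name `is_high_quality_snp`, its parameters, and the representation of a putative SNP position as a list of base calls (one per EST in the contig) are assumptions made for the example.

```python
from collections import Counter


def is_high_quality_snp(base_calls, min_sequences=4, min_minor_count=2):
    """Return True if one putative SNP position passes both thresholds.

    base_calls: the base observed at the position, one entry per EST in
    the contig (hypothetical input format chosen for this sketch).
    """
    counts = Counter(base_calls)
    # Require enough sequences in the contig and at least two distinct alleles.
    if len(base_calls) < min_sequences or len(counts) < 2:
        return False
    # The minor allele is taken here as the second most frequent base.
    minor_count = counts.most_common(2)[1][1]
    return minor_count >= min_minor_count


print(is_high_quality_snp(list("AAGG")))  # minor allele G seen twice
print(is_high_quality_snp(list("AAAG")))  # minor allele G seen once
```

    Treating the second most frequent base as the minor allele is a simplification; a real pipeline would also need to handle sequencing errors and alignment gaps.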

  18. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.
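
    As a rough illustration of what "locating and identifying genes in DGE statistical test results" can involve, the sketch below filters a small result table by adjusted p-value and absolute log2 fold change. It does not reproduce DEIVA's code or interface: the field names (`gene`, `log2fc`, `padj`), the thresholds, and the example rows are all hypothetical.

```python
def select_genes(results, max_adjusted_p=0.05, min_abs_log2_fc=1.0):
    """Return names of genes passing both thresholds, most significant first.

    results: list of dicts with hypothetical keys 'gene', 'log2fc', 'padj'.
    """
    hits = [
        row for row in results
        if row["padj"] <= max_adjusted_p and abs(row["log2fc"]) >= min_abs_log2_fc
    ]
    # Sort the surviving rows by adjusted p-value, smallest first.
    return [row["gene"] for row in sorted(hits, key=lambda r: r["padj"])]


# Made-up rows for illustration only.
example = [
    {"gene": "geneA", "log2fc": 2.3, "padj": 0.001},
    {"gene": "geneB", "log2fc": -0.4, "padj": 0.0001},
    {"gene": "geneC", "log2fc": -1.8, "padj": 0.03},
    {"gene": "geneD", "log2fc": 3.0, "padj": 0.2},
]
print(select_genes(example))
```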

  19. mRNA-Seq Analysis of the Pseudoperonospora cubensis Transcriptome During Cucumber (Cucumis sativus L.) Infection

    PubMed Central

    Hamilton, John P.; Vaillancourt, Brieanne; Buell, C. Robin; Day, Brad

    2012-01-01

    Pseudoperonospora cubensis, an oomycete, is the causal agent of cucurbit downy mildew, and is responsible for significant losses on cucurbit crops worldwide. While other oomycete plant pathogens have been extensively studied at the molecular level, Ps. cubensis and the molecular basis of its interaction with cucurbit hosts have not been well examined. Here, we present the first large-scale global gene expression analysis of Ps. cubensis infection of a susceptible Cucumis sativus cultivar, ‘Vlaspik’, and identification of genes with putative roles in infection, growth, and pathogenicity. Using high throughput whole transcriptome sequencing, we captured differential expression of 2383 Ps. cubensis genes in sporangia and at 1, 2, 3, 4, 6, and 8 days post-inoculation (dpi). Additionally, comparison of Ps. cubensis expression profiles with expression profiles from an infection time course of the oomycete pathogen Phytophthora infestans on Solanum tuberosum revealed similarities in expression patterns of 1,576–6,806 orthologous genes, suggesting a substantial degree of overlap in molecular events in virulence between the biotrophic Ps. cubensis and the hemi-biotrophic P. infestans. Co-expression analyses identified distinct modules of Ps. cubensis genes that were representative of early, intermediate, and late infection stages. Collectively, these expression data have advanced our understanding of key molecular and genetic events in the virulence of Ps. cubensis and thus provide a foundation for identifying mechanism(s) by which to engineer or effect resistance in the host. PMID:22545137

  20. Modeling gene expression measurement error: a quasi-likelihood approach

    PubMed Central

    Strimmer, Korbinian

    2003-01-01

    Background Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions The quasi-likelihood framework provides a simple and versatile approach to analyze gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. In an example it also improved the power of tests to identify differential expression. PMID:12659637
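
    For orientation, the extended quasi-likelihood in its standard textbook form, for a single observation $y$ with mean $\mu$, dispersion $\phi$ and variance function $V$ (so that $\mathrm{Var}(y) = \phi V(\mu)$), is written out below. The quadratic variance function in the last line is only an illustrative form with assumed coefficients $a$ and $b$; the abstract does not state the exact parameterization used in the paper.

```latex
\begin{align}
Q^{+}(\mu, \phi; y) &= -\tfrac{1}{2}\log\{2\pi\phi\,V(y)\} - \frac{D(y;\mu)}{2\phi}, \\
D(y;\mu) &= 2\int_{\mu}^{y} \frac{y - t}{V(t)}\,dt, \\
V(\mu) &= a + b\,\mu^{2} \quad \text{(assumed quadratic form)}.
\end{align}
```

    Only the mean and the variance function enter this expression, which is why partial knowledge of the true distribution is enough to construct it.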

  1. Revealing complex function, process and pathway interactions with high-throughput expression and biological annotation data.

    PubMed

    Singh, Nitesh Kumar; Ernst, Mathias; Liebscher, Volkmar; Fuellen, Georg; Taher, Leila

    2016-10-20

    The biological relationships both between and within the functions, processes and pathways that operate within complex biological systems are only poorly characterized, making the interpretation of large scale gene expression datasets extremely challenging. Here, we present an approach that integrates gene expression and biological annotation data to identify and describe the interactions between biological functions, processes and pathways that govern a phenotype of interest. The product is a global, interconnected network, not of genes but of functions, processes and pathways, that represents the biological relationships within the system. We validated our approach on two high-throughput expression datasets describing organismal and organ development. Our findings are well supported by the available literature, confirming that developmental processes and apoptosis play key roles in cell differentiation. Furthermore, our results suggest that processes related to pluripotency and lineage commitment, which are known to be critical for development, interact mainly indirectly, through genes implicated in more general biological processes. Moreover, we provide evidence that supports the relevance of cell spatial organization in the developing liver for proper liver function. Our strategy can be viewed as an abstraction that is useful to interpret high-throughput data and devise further experiments.

  2. Nutritional and reproductive signaling revealed by comparative gene expression analysis in Chrysopa pallens (Rambur) at different nutritional statuses

    PubMed Central

    Han, Benfeng; Zhang, Shen; Zeng, Fanrong; Mao, Jianjun

    2017-01-01

    Background The green lacewing, Chrysopa pallens Rambur, is one of the most important natural predators because of its extensive spectrum of prey and wide distribution. However, little is known about the nutritional and reproductive physiology of this species. Results By cDNA amplification and Illumina short-read sequencing, we analyzed transcriptomes of C. pallens female adults under starved and fed conditions. In total, 71236 unigenes were obtained with an average length of 833 bp. Four vitellogenins, three insulin-like peptides and two insulin receptors were annotated. Comparison of gene expression profiles suggested that a total of 1501 genes were differentially expressed between the two nutritional statuses. KEGG orthology classification showed that these differentially expressed genes (DEGs) were mapped to 241 pathways. The top four pathways were ribosome, protein processing in endoplasmic reticulum, biosynthesis of amino acids and carbon metabolism, indicating a distinct difference in nutritional and reproductive signaling between the two feeding conditions. Conclusions Our study yielded large-scale molecular information relevant to C. pallens nutritional and reproductive signaling, which will contribute to mass rearing and commercial use of this predaceous insect species. PMID:28683101

  3. Nutritional and reproductive signaling revealed by comparative gene expression analysis in Chrysopa pallens (Rambur) at different nutritional statuses.

    PubMed

    Han, Benfeng; Zhang, Shen; Zeng, Fanrong; Mao, Jianjun

    2017-01-01

    The green lacewing, Chrysopa pallens Rambur, is one of the most important natural predators because of its extensive spectrum of prey and wide distribution. However, little is known about the nutritional and reproductive physiology of this species. By cDNA amplification and Illumina short-read sequencing, we analyzed transcriptomes of C. pallens female adults under starved and fed conditions. In total, 71236 unigenes were obtained with an average length of 833 bp. Four vitellogenins, three insulin-like peptides and two insulin receptors were annotated. Comparison of gene expression profiles suggested that a total of 1501 genes were differentially expressed between the two nutritional statuses. KEGG orthology classification showed that these differentially expressed genes (DEGs) were mapped to 241 pathways. The top four pathways were ribosome, protein processing in endoplasmic reticulum, biosynthesis of amino acids and carbon metabolism, indicating a distinct difference in nutritional and reproductive signaling between the two feeding conditions. Our study yielded large-scale molecular information relevant to C. pallens nutritional and reproductive signaling, which will contribute to mass rearing and commercial use of this predaceous insect species.

  4. Interdisciplinary Team Science in Cell Biology.

    PubMed

    Horwitz, Rick

    2016-11-01

    The cell is complex. With its multitude of components, spatial-temporal character, and gene expression diversity, it is challenging to comprehend the cell as an integrated system and to develop models that predict its behaviors. I suggest an approach to address this issue, involving system level data analysis, large scale team science, and philanthropy. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Representing high throughput expression profiles via perturbation barcodes reveals compound targets.

    PubMed

    Filzen, Tracey M; Kutchukian, Peter S; Hermes, Jeffrey D; Li, Jing; Tudor, Matthew

    2017-02-01

    High throughput mRNA expression profiling can be used to characterize the response of cell culture models to perturbations such as pharmacologic modulators and genetic perturbations. As profiling campaigns expand in scope, it is important to homogenize, summarize, and analyze the resulting data in a manner that captures significant biological signals in spite of various noise sources such as batch effects and stochastic variation. We used the L1000 platform for large-scale profiling of 978 representative genes across thousands of compound treatments. Here, a method is described that uses deep learning techniques to convert the expression changes of the landmark genes into a perturbation barcode that reveals important features of the underlying data, performing better than the raw data in revealing important biological insights. The barcode captures compound structure and target information, and predicts a compound's high throughput screening promiscuity, to a higher degree than the original data measurements, indicating that the approach uncovers underlying factors of the expression data that are otherwise entangled or masked by noise. Furthermore, we demonstrate that visualizations derived from the perturbation barcode can be used to more sensitively assign functions to unknown compounds through a guilt-by-association approach, which we use to predict and experimentally validate the activity of compounds on the MAPK pathway. The demonstrated application of deep metric learning to large-scale chemical genetics projects highlights the utility of this and related approaches to the extraction of insights and testable hypotheses from big, sometimes noisy data.

  6. Representing high throughput expression profiles via perturbation barcodes reveals compound targets

    PubMed Central

    Kutchukian, Peter S.; Li, Jing; Tudor, Matthew

    2017-01-01

    High throughput mRNA expression profiling can be used to characterize the response of cell culture models to perturbations such as pharmacologic modulators and genetic perturbations. As profiling campaigns expand in scope, it is important to homogenize, summarize, and analyze the resulting data in a manner that captures significant biological signals in spite of various noise sources such as batch effects and stochastic variation. We used the L1000 platform for large-scale profiling of 978 representative genes across thousands of compound treatments. Here, a method is described that uses deep learning techniques to convert the expression changes of the landmark genes into a perturbation barcode that reveals important features of the underlying data, performing better than the raw data in revealing important biological insights. The barcode captures compound structure and target information, and predicts a compound’s high throughput screening promiscuity, to a higher degree than the original data measurements, indicating that the approach uncovers underlying factors of the expression data that are otherwise entangled or masked by noise. Furthermore, we demonstrate that visualizations derived from the perturbation barcode can be used to more sensitively assign functions to unknown compounds through a guilt-by-association approach, which we use to predict and experimentally validate the activity of compounds on the MAPK pathway. The demonstrated application of deep metric learning to large-scale chemical genetics projects highlights the utility of this and related approaches to the extraction of insights and testable hypotheses from big, sometimes noisy data. PMID:28182661

  7. Transcriptomic Profiling of Central Nervous System Regions in Three Species of Honey Bee during Dance Communication Behavior

    PubMed Central

    Sen Sarma, Moushumi; Rodriguez-Zas, Sandra L.; Hong, Feng; Zhong, Sheng; Robinson, Gene E.

    2009-01-01

    Background We conducted a large-scale transcriptomic profiling of selected regions of the central nervous system (CNS) across three species of honey bees, in foragers that were performing dance behavior to communicate to their nestmates the location, direction and profitability of an attractive floral resource. We used microarrays to measure gene expression in bees from Apis mellifera, dorsata and florea, species that share major traits unique to the genus and also show striking differences in biology and dance communication. The goals of this study were to determine the extent of regional specialization in gene expression and to explore the molecular basis of dance communication. Principal Findings This “snapshot” of the honey bee CNS during dance behavior provides strong evidence for both species-consistent and species-specific differences in gene expression. Gene expression profiles in the mushroom bodies consistently showed the biggest differences relative to the other CNS regions. There were strong similarities in gene expression between the central brain and the second thoracic ganglion across all three species; many of the genes were related to metabolism and energy production. We also obtained gene expression differences between CNS regions that varied by species: A. mellifera differed the most, while dorsata and florea tended to be more similar. Significance Species differences in gene expression perhaps mirror known differences in nesting habit, ecology and dance behavior between mellifera, florea and dorsata. Species-specific differences in gene expression in selected CNS regions that relate to synaptic activity and motor control provide particularly attractive candidate genes to explain the differences in dance behavior exhibited by these three honey bee species. Similarities between central brain and thoracic ganglion provide a unique perspective on the potential coupling of these two motor-related regions during dance behavior and perhaps provide a snapshot of the energy intensive process of dance output generation. Mushroom body results reflect known roles for this region in the regulation of learning, memory and rhythmic behavior. PMID:19641619

  8. Transcriptomic profiling of central nervous system regions in three species of honey bee during dance communication behavior.

    PubMed

    Sen Sarma, Moushumi; Rodriguez-Zas, Sandra L; Hong, Feng; Zhong, Sheng; Robinson, Gene E

    2009-07-29

    We conducted a large-scale transcriptomic profiling of selected regions of the central nervous system (CNS) across three species of honey bees, in foragers that were performing dance behavior to communicate to their nestmates the location, direction and profitability of an attractive floral resource. We used microarrays to measure gene expression in bees from Apis mellifera, dorsata and florea, species that share major traits unique to the genus and also show striking differences in biology and dance communication. The goals of this study were to determine the extent of regional specialization in gene expression and to explore the molecular basis of dance communication. This "snapshot" of the honey bee CNS during dance behavior provides strong evidence for both species-consistent and species-specific differences in gene expression. Gene expression profiles in the mushroom bodies consistently showed the biggest differences relative to the other CNS regions. There were strong similarities in gene expression between the central brain and the second thoracic ganglion across all three species; many of the genes were related to metabolism and energy production. We also obtained gene expression differences between CNS regions that varied by species: A. mellifera differed the most, while dorsata and florea tended to be more similar. Species differences in gene expression perhaps mirror known differences in nesting habit, ecology and dance behavior between mellifera, florea and dorsata. Species-specific differences in gene expression in selected CNS regions that relate to synaptic activity and motor control provide particularly attractive candidate genes to explain the differences in dance behavior exhibited by these three honey bee species. Similarities between central brain and thoracic ganglion provide a unique perspective on the potential coupling of these two motor-related regions during dance behavior and perhaps provide a snapshot of the energy intensive process of dance output generation. Mushroom body results reflect known roles for this region in the regulation of learning, memory and rhythmic behavior.

  9. Genome-wide identification and characterization of WRKY gene family in Salix suchowensis.

    PubMed

    Bi, Changwei; Xu, Yiqing; Ye, Qiaolin; Yin, Tongming; Ye, Ning

    2016-01-01

    WRKY proteins are zinc finger transcription factors that were first identified in plants. They can specifically interact with the W-box, which can be found in the promoter region of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there was no large-scale study of WRKY genes in willow. With the whole genome sequencing of Salix suchowensis, we have the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them from SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Due to their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (group I-III), with five subgroups (IIa-IIe) in group II. With multiple sequence alignment and manual search, we found three variations of the WRKYGQK heptapeptide: WRKYGRK, WKKYGQK and WRKYGKK, and four variations of the normal zinc finger motif, which might execute some new biological functions. In addition, the SsWRKY genes from the same subgroup share similar exon-intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes from RNA sequencing data revealed diverse expression patterns among five tissues: tender roots, young leaves, vegetative buds, non-lignified stems and barks. These analyses of the WRKY gene family in willow not only help to complete the functional and annotation information of the WRKY gene family in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants.
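
    The search for the WRKYGQK heptapeptide and its three reported variants (WRKYGRK, WKKYGQK and WRKYGKK) can be sketched as a sliding-window scan over a protein sequence. This is a minimal illustration rather than the authors' procedure, which combined multiple sequence alignment with manual search; the function name `find_wrky_heptapeptides` and the toy sequence are hypothetical.

```python
CANONICAL = "WRKYGQK"
# The three variants listed in the abstract.
VARIANTS = {"WRKYGRK", "WKKYGQK", "WRKYGKK"}


def find_wrky_heptapeptides(protein):
    """Return (0-based position, motif, label) for every match in the sequence."""
    hits = []
    for i in range(len(protein) - 6):
        window = protein[i:i + 7]  # seven-residue window
        if window == CANONICAL:
            hits.append((i, window, "canonical"))
        elif window in VARIANTS:
            hits.append((i, window, "variant"))
    return hits


# Made-up sequence for illustration, not a real SsWRKY protein.
toy_protein = "MSDWRKYGQKAAAPWKKYGQKLL"
for hit in find_wrky_heptapeptides(toy_protein):
    print(hit)
```

    Matching against an explicit set, rather than a pattern such as `W[RK]KYG[QRK]K`, avoids reporting combinations that the abstract does not list.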

  10. Genome-wide identification and characterization of WRKY gene family in Salix suchowensis

    PubMed Central

    Ye, Qiaolin; Yin, Tongming

    2016-01-01

    WRKY proteins are zinc finger transcription factors that were first identified in plants. They can specifically interact with the W-box, which can be found in the promoter region of a large number of plant target genes, to regulate the expression of downstream target genes. They also participate in diverse physiological and growth processes in plants. Prior to this study, many WRKY genes had been identified and characterized in herbaceous species, but there was no large-scale study of WRKY genes in willow. With the whole genome sequencing of Salix suchowensis, we have the opportunity to conduct genome-wide research on the willow WRKY gene family. In this study, we identified 85 WRKY genes in the willow genome and renamed them from SsWRKY1 to SsWRKY85 on the basis of their specific distributions on chromosomes. Due to their diverse structural features, the 85 willow WRKY genes could be further classified into three main groups (group I–III), with five subgroups (IIa–IIe) in group II. With multiple sequence alignment and manual search, we found three variations of the WRKYGQK heptapeptide: WRKYGRK, WKKYGQK and WRKYGKK, and four variations of the normal zinc finger motif, which might execute some new biological functions. In addition, the SsWRKY genes from the same subgroup share similar exon–intron structures and conserved motif domains. Further studies of SsWRKY genes revealed that segmental duplication events (SDs) played a more prominent role in the expansion of SsWRKY genes. Expression profiles of SsWRKY genes from RNA sequencing data revealed diverse expression patterns among five tissues: tender roots, young leaves, vegetative buds, non-lignified stems and barks. These analyses of the WRKY gene family in willow not only help to complete the functional and annotation information of the WRKY gene family in woody plants, but also provide important references for investigating the expansion and evolution of this gene family in flowering plants. PMID:27651997

  11. A CRISPR-Based Toolbox for Studying T Cell Signal Transduction

    PubMed Central

    Chi, Shen; Weiss, Arthur; Wang, Haopeng

    2016-01-01

    The CRISPR/Cas9 system is a powerful technology for genome editing in a variety of cell types. To facilitate the application of Cas9 in mapping T cell signaling pathways, we generated a toolbox for large-scale genetic screens in human Jurkat T cells. The toolbox has three different Jurkat cell lines expressing distinct Cas9 variants, including wild-type Cas9, dCas9-KRAB, and sunCas9. We demonstrated that the toolbox allows us to rapidly disrupt endogenous gene expression at the DNA level and to efficiently repress or activate gene expression at the transcriptional level. The toolbox, in combination with multiple currently existing genome-wide sgRNA libraries, will be useful to systematically investigate T cell signal transduction using both loss-of-function and gain-of-function genetic screens. PMID:27057542

  12. Adaptation and evolution of deep-sea scale worms (Annelida: Polynoidae): insights from transcriptome comparison with a shallow-water species

    NASA Astrophysics Data System (ADS)

    Zhang, Yanjie; Sun, Jin; Chen, Chong; Watanabe, Hiromi K.; Feng, Dong; Zhang, Yu; Chiu, Jill M. Y.; Qian, Pei-Yuan; Qiu, Jian-Wen

    2017-04-01

    Polynoid scale worms (Polynoidae, Annelida) invaded deep-sea chemosynthesis-based ecosystems approximately 60 million years ago, but little is known about their genetic adaptation to the extreme deep-sea environment. In this study, we reported the first two transcriptomes of deep-sea polynoids (Branchipolynoe pettiboneae, Lepidonotopodium sp.) and compared them with the transcriptome of a shallow-water polynoid (Harmothoe imbricata). We determined codon and amino acid usage, positively selected genes, highly expressed genes and putative duplicated genes. Transcriptome assembly produced 98,806 to 225,709 contigs in the three species. There were more positively charged amino acids (i.e., histidine and arginine) and fewer negatively charged amino acids (i.e., aspartic acid and glutamic acid) in the deep-sea species. There were 120 genes showing clear evidence of positive selection. Among the 10% most highly expressed genes, there were more hemoglobin genes with high expression levels in both deep-sea species. The duplicated genes related to DNA recombination and metabolism, and gene expression were only enriched in deep-sea species. Deep-sea scale worms adopted two strategies of adaptation to hypoxia in the chemosynthesis-based habitats (i.e., rapid evolution of tetra-domain hemoglobin in Branchipolynoe or high expression of single-domain hemoglobin in Lepidonotopodium sp.).
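
    The amino acid usage comparison mentioned in this abstract can be illustrated with a small function that reports the fraction of positively and negatively charged residues in a protein sequence. This is only a sketch under stated assumptions: the residue sets follow the examples named in the abstract (histidine and arginine; aspartic acid and glutamic acid), and the function name `charged_fractions` and the test sequence are hypothetical.

```python
# Residue sets follow the examples given in the abstract; lysine (K) is also
# positively charged but is not named there, so it is left out of this sketch.
POSITIVE = set("HR")  # histidine, arginine
NEGATIVE = set("DE")  # aspartic acid, glutamic acid


def charged_fractions(protein):
    """Return (positive fraction, negative fraction) for a one-letter sequence."""
    length = len(protein)
    if length == 0:
        return 0.0, 0.0
    positive = sum(1 for residue in protein if residue in POSITIVE)
    negative = sum(1 for residue in protein if residue in NEGATIVE)
    return positive / length, negative / length


# Made-up sequence for illustration only.
print(charged_fractions("HHRRDAAA"))
```

    Comparing these fractions across orthologous proteins from deep-sea and shallow-water species would be one simple way to look for the kind of shift the abstract describes.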

  13. Adaptation and evolution of deep-sea scale worms (Annelida: Polynoidae): insights from transcriptome comparison with a shallow-water species

    PubMed Central

    Zhang, Yanjie; Sun, Jin; Chen, Chong; Watanabe, Hiromi K.; Feng, Dong; Zhang, Yu; Chiu, Jill M.Y.; Qian, Pei-Yuan; Qiu, Jian-Wen

    2017-01-01

    Polynoid scale worms (Polynoidae, Annelida) invaded deep-sea chemosynthesis-based ecosystems approximately 60 million years ago, but little is known about their genetic adaptation to the extreme deep-sea environment. In this study, we reported the first two transcriptomes of deep-sea polynoids (Branchipolynoe pettiboneae, Lepidonotopodium sp.) and compared them with the transcriptome of a shallow-water polynoid (Harmothoe imbricata). We determined codon and amino acid usage, positively selected genes, highly expressed genes and putative duplicated genes. Transcriptome assembly produced 98,806 to 225,709 contigs in the three species. There were more positively charged amino acids (i.e., histidine and arginine) and fewer negatively charged amino acids (i.e., aspartic acid and glutamic acid) in the deep-sea species. There were 120 genes showing clear evidence of positive selection. Among the 10% most highly expressed genes, there were more hemoglobin genes with high expression levels in both deep-sea species. The duplicated genes related to DNA recombination and metabolism, and gene expression were only enriched in deep-sea species. Deep-sea scale worms adopted two strategies of adaptation to hypoxia in the chemosynthesis-based habitats (i.e., rapid evolution of tetra-domain hemoglobin in Branchipolynoe or high expression of single-domain hemoglobin in Lepidonotopodium sp.). PMID:28397791

  14. ARHGAP18 is a novel gene under positive natural selection that influences HbF levels in β-thalassaemia.

    PubMed

    He, Yunyan; Luo, Jianming; Chen, Yang; Zhou, Xiaoheng; Yu, Shanjuan; Jin, Ling; Xiao, Xuan; Jia, Siyuan; Liu, Qiang

    2018-02-01

    Foetal haemoglobin (HbF) plays a dominant role in ameliorating the morbidity and mortality of β-thalassaemia. A better understanding of the loci and genes involved in HbF expression would be beneficial for the treatment of β-thalassaemia major. However, the genes associated with HbF expression remain largely unknown. In this study, we first explored large-scale data sets and examined the human genome for evidence of positive natural selection to screen out single nucleotide polymorphisms (SNPs). A genetic analysis of HbF levels was conducted in a Chinese cohort of patients with β-thalassaemia to confirm the bioinformatics results. A total of 1141 subjects with β-thalassaemia were recruited. The results showed that the SNP rs11759328 in the ARHGAP18 gene was significantly associated with HbF levels (P = 5.1 × 10⁻⁴). ARHGAP18 belongs to the RhoGAP family and controls angiogenesis, cellular morphology and motility. Second, after determining that ARHGAP18 was highly expressed in the human K562 cell line, we used lentiviral-mediated small interfering RNA to knock down ARHGAP18 expression and subsequently assessed cell proliferation and apoptosis using cell proliferation assays and flow cytometry, respectively. ARHGAP18 downregulation in K562 cells significantly increased HBG1/2 expression and apoptosis, but proliferation was not significantly affected in vitro. Our data suggest that ARHGAP18, which was located by the SNP rs11759328 via positive selection, plays a potential role in regulating HbF expression in β-thalassaemia and may be a promising therapeutic target. Knockout studies of ARHGAP18 warrant further investigation into its aetiology in HbF.

  15. Mechanisms of stable lipid loss in a social insect

    PubMed Central

    Ament, Seth A.; Chan, Queenie W.; Wheeler, Marsha M.; Nixon, Scott E.; Johnson, S. Peir; Rodriguez-Zas, Sandra L.; Foster, Leonard J.; Robinson, Gene E.

    2011-01-01

    SUMMARY Worker honey bees undergo a socially regulated, highly stable lipid loss as part of their behavioral maturation. We used large-scale transcriptomic and proteomic experiments, physiological experiments and RNA interference to explore the mechanistic basis for this lipid loss. Lipid loss was associated with thousands of gene expression changes in abdominal fat bodies. Many of these genes were also regulated in young bees by nutrition during an initial period of lipid gain. Surprisingly, in older bees, which is when maximum lipid loss occurs, diet played less of a role in regulating fat body gene expression for components of evolutionarily conserved nutrition-related endocrine systems involving insulin and juvenile hormone signaling. By contrast, fat body gene expression in older bees was regulated more strongly by evolutionarily novel regulatory factors, queen mandibular pheromone (a honey bee-specific social signal) and vitellogenin (a conserved yolk protein that has evolved novel, maturation-related functions in the bee), independent of nutrition. These results demonstrate that conserved molecular pathways can be manipulated to achieve stable lipid loss through evolutionarily novel regulatory processes. PMID:22031746

  16. Mechanisms of stable lipid loss in a social insect.

    PubMed

    Ament, Seth A; Chan, Queenie W; Wheeler, Marsha M; Nixon, Scott E; Johnson, S Peir; Rodriguez-Zas, Sandra L; Foster, Leonard J; Robinson, Gene E

    2011-11-15

    Worker honey bees undergo a socially regulated, highly stable lipid loss as part of their behavioral maturation. We used large-scale transcriptomic and proteomic experiments, physiological experiments and RNA interference to explore the mechanistic basis for this lipid loss. Lipid loss was associated with thousands of gene expression changes in abdominal fat bodies. Many of these genes were also regulated in young bees by nutrition during an initial period of lipid gain. Surprisingly, in older bees, which is when maximum lipid loss occurs, diet played less of a role in regulating fat body gene expression for components of evolutionarily conserved nutrition-related endocrine systems involving insulin and juvenile hormone signaling. By contrast, fat body gene expression in older bees was regulated more strongly by evolutionarily novel regulatory factors, queen mandibular pheromone (a honey bee-specific social signal) and vitellogenin (a conserved yolk protein that has evolved novel, maturation-related functions in the bee), independent of nutrition. These results demonstrate that conserved molecular pathways can be manipulated to achieve stable lipid loss through evolutionarily novel regulatory processes.

  17. The morphologies of breast cancer cell lines in three-dimensional assays correlate with their profiles of gene expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kenny, Paraic A.; Lee, Genee Y.; Myers, Connie A.

    2007-01-31

    3D cell cultures are rapidly becoming the method of choice for the physiologically relevant modeling of many aspects of non-malignant and malignant cell behavior ex vivo. Nevertheless, only a limited number of distinct cell types have been evaluated in this assay to date. Here we report the first large scale comparison of the transcriptional profiles and 3D cell culture phenotypes of a substantial panel of human breast cancer cell lines. Each cell line adopts a colony morphology of one of four main classes in 3D culture. These morphologies reflect, at least in part, the underlying gene expression profile and protein expression patterns of the cell lines, and distinct morphologies were also associated with tumor cell invasiveness and with cell lines originating from metastases. We further demonstrate that consistent differences in genes encoding signal transduction proteins emerge when even tumor cells are cultured in 3D microenvironments.

  18. DTWscore: differential expression and cell clustering analysis for time-series single-cell RNA-seq data.

    PubMed

    Wang, Zhuo; Jin, Shuilin; Liu, Guiyou; Zhang, Xiurui; Wang, Nan; Wu, Deliang; Hu, Yang; Zhang, Chiping; Jiang, Qinghua; Xu, Li; Wang, Yadong

    2017-05-23

    The development of single-cell RNA sequencing has enabled profound discoveries in biology, ranging from the dissection of the composition of complex tissues to the identification of novel cell types and dynamics in some specialized cellular environments. However, the large-scale generation of single-cell RNA-seq (scRNA-seq) data collected at multiple time points remains a challenge to the effective measurement of gene expression patterns in transcriptome analysis. We present an algorithm based on the Dynamic Time Warping score (DTWscore) combined with time-series data, which enables the detection of gene expression changes across scRNA-seq samples and the recovery of potential cell types from complex mixtures of multiple cell types. The DTWscore successfully classifies cells of different types using the most highly variable genes from time-series scRNA-seq data. The study was confined to methods that are implemented and available within the R framework. Sample datasets and R packages are available at https://github.com/xiaoxiaoxier/DTWscore .
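
    For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the textbook DTW distance between two expression time courses. It is not the DTWscore package, which the abstract says is implemented in R, and the abstract does not describe how the score itself is derived from the distance. The function name and the sample values are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],      # advance in a only
                                    cost[i][j - 1],      # advance in b only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical expression time courses for one gene in two cell groups.
    print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))
```

    Because the alignment may stretch or compress the time axis, two time courses with the same shape but different timing can have a distance of zero, which is the property that makes DTW attractive for asynchronous single-cell time series.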

  19. Improved polysaccharide production in a submerged culture of Ganoderma lucidum by the heterologous expression of Vitreoscilla hemoglobin gene.

    PubMed

    Li, Huan-Jun; Zhang, De-Huai; Yue, Tong-Hui; Jiang, Lu-Xi; Yu, Xuya; Zhao, Peng; Li, Tao; Xu, Jun-Wei

    2016-01-10

    Expression of Vitreoscilla hemoglobin (VHb) gene was used to improve polysaccharide production in Ganoderma lucidum. The VHb gene, vgb, under the control of the constitutive glyceraldehyde-3-phosphate dehydrogenase gene promoter was introduced into G. lucidum. The activity of expressed VHb was confirmed by the observation of VHb specific CO-difference spectrum with a maximal absorption at 419 nm for the transformant. The effects of VHb expression on intracellular polysaccharide (IPS) content, extracellular polysaccharide (EPS) production and transcription levels of three genes encoding the enzymes involved in polysaccharide biosynthesis, including phosphoglucomutase (PGM), uridine diphosphate glucose pyrophosphorylase (UGP), and β-1,3-glucan synthase (GLS), were investigated. The maximum IPS content and EPS production in the vgb-bearing G. lucidum were 26.4 mg/100mg dry weight and 0.83 g/L, respectively, which were higher by 30.5% and 88.2% than those of the wild-type strain. The transcription levels of PGM, UGP and GLS were up-regulated by 1.51-, 1.55- and 3.83-fold, respectively, in the vgb-bearing G. lucidum. This work highlights the potential of VHb to enhance G. lucidum polysaccharide production by large scale fermentation. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Unique differentiation profile of mouse embryonic stem cells in rotary and stirred tank bioreactors.

    PubMed

    Fridley, Krista M; Fernandez, Irina; Li, Mon-Tzu Alice; Kettlewell, Robert B; Roy, Krishnendu

    2010-11-01

    Embryonic stem (ES)-cell-derived lineage-specific stem cells, for example, hematopoietic stem cells, could provide a potentially unlimited source for transplantable cells, especially for cell-based therapies. However, reproducible methods must be developed to maximize and scale-up ES cell differentiation to produce clinically relevant numbers of therapeutic cells. Bioreactor-based dynamic culture conditions are amenable to large-scale cell production, but few studies have evaluated how various bioreactor types and culture parameters influence ES cell differentiation, especially hematopoiesis. Our results indicate that cell seeding density and bioreactor speed significantly affect embryoid body formation and subsequent generation of hematopoietic stem and progenitor cells in both stirred tank (spinner flask) and rotary microgravity (Synthecon™) type bioreactors. In general, high percentages of hematopoietic stem and progenitor cells were generated in both bioreactors, especially at high cell densities. In addition, Synthecon bioreactors produced more sca-1(+) progenitors and spinner flasks generated more c-Kit(+) progenitors, demonstrating their unique differentiation profiles. cDNA microarray analysis of genes involved in pluripotency, germ layer formation, and hematopoietic differentiation showed that on day 7 of differentiation, embryoid bodies from both bioreactors consisted of all three germ layers of embryonic development. However, unique gene expression profiles were observed in the two bioreactors; for example, expression of specific hematopoietic genes was significantly more upregulated in the Synthecon cultures than in spinner flasks. We conclude that bioreactor type and culture parameters can be used to control ES cell differentiation, enhance unique progenitor cell populations, and provide means for large-scale production of transplantable therapeutic cells.

  1. Mechanisms of macroevolution: polyphagous plasticity in butterfly larvae revealed by RNA-Seq.

    PubMed

    de la Paz Celorio-Mancera, Maria; Wheat, Christopher W; Vogel, Heiko; Söderlind, Lina; Janz, Niklas; Nylin, Sören

    2013-10-01

    Transcriptome studies of insect herbivory are still rare, yet studies in model systems have uncovered patterns of transcript regulation that appear to provide insights into how insect herbivores attain polyphagy, such as a general increase in expression breadth and regulation of ribosomal, digestion- and detoxification-related genes. We investigated the potential generality of these emerging patterns in the Swedish comma, Polygonia c-album, which is a polyphagous, widely distributed butterfly. Urtica dioica and Ribes uva-crispa are hosts of P. c-album, but Ribes represents a recent evolutionary shift onto a very divergent host. Utilizing the assembled transcriptome for read mapping, we assessed gene expression, finding that caterpillar life-history (i.e. 2nd vs. 4th-instar regulation) had a limited influence on gene expression plasticity. In contrast, differential expression in response to host-plant identified genes encoding serine-type endopeptidases, membrane-associated proteins and transporters. Differential regulation of genes involved in nucleic acid binding was also observed, suggesting that polyphagy involves large scale transcriptional changes. Additionally, transcripts coding for structural constituents of the cuticle were differentially expressed in caterpillars in response to their diet, indicating that the insect cuticle may be a target for plant defence. Our results indicate that emerging patterns of transcript regulation from model species appear relevant in species when placed in an evolutionary context. © 2013 John Wiley & Sons Ltd.

  2. Dynamic Visualization of Co-expression in Systems Genetics Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    New, Joshua Ryan; Huang, Jian; Chesler, Elissa J

    2008-01-01

    Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biological networks and discover genes which reside in critical positions in networks and pathways. By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized b-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.

  3. MetaRanker 2.0: a web server for prioritization of genetic variation data

    PubMed Central

    Pers, Tune H.; Dworzyński, Piotr; Thomas, Cecilia Engel; Lage, Kasper; Brunak, Søren

    2013-01-01

    MetaRanker 2.0 is a web server for prioritization of common and rare frequency genetic variation data. Based on heterogeneous data sets including genetic association data, protein–protein interactions, large-scale text-mining data, copy number variation data and gene expression experiments, MetaRanker 2.0 prioritizes the protein-coding part of the human genome to shortlist candidate genes for targeted follow-up studies. MetaRanker 2.0 is made freely available at www.cbs.dtu.dk/services/MetaRanker-2.0. PMID:23703204

  4. MetaRanker 2.0: a web server for prioritization of genetic variation data.

    PubMed

    Pers, Tune H; Dworzyński, Piotr; Thomas, Cecilia Engel; Lage, Kasper; Brunak, Søren

    2013-07-01

    MetaRanker 2.0 is a web server for prioritization of common and rare frequency genetic variation data. Based on heterogeneous data sets including genetic association data, protein-protein interactions, large-scale text-mining data, copy number variation data and gene expression experiments, MetaRanker 2.0 prioritizes the protein-coding part of the human genome to shortlist candidate genes for targeted follow-up studies. MetaRanker 2.0 is made freely available at www.cbs.dtu.dk/services/MetaRanker-2.0.

  5. Molecular biology of bladder cancer.

    PubMed

    Martin-Doyle, William; Kwiatkowski, David J

    2015-04-01

    Classic as well as more recent large-scale genomic analyses have uncovered multiple genes and pathways important for bladder cancer development. Genes involved in cell-cycle control, chromatin regulation, and receptor tyrosine and PI3 kinase-mammalian target of rapamycin signaling pathways are commonly mutated in muscle-invasive bladder cancer. Expression-based analyses have identified distinct types of bladder cancer that are similar to subsets of breast cancer, and have prognostic and therapeutic significance. These observations are leading to novel therapeutic approaches in bladder cancer, providing optimism for therapeutic progress. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. Complex Genetics of Behavior: BXDs in the Automated Home-Cage.

    PubMed

    Loos, Maarten; Verhage, Matthijs; Spijker, Sabine; Smit, August B

    2017-01-01

    This chapter describes a use case for the genetic dissection and automated analysis of complex behavioral traits using the genetically diverse panel of BXD mouse recombinant inbred strains. Strains of the BXD resource differ widely in terms of gene and protein expression in the brain, as well as in their behavioral repertoire. A large mouse resource opens the possibility for gene finding studies underlying distinct behavioral phenotypes; however, such a resource poses a challenge in behavioral phenotyping. To address the specifics of large-scale screening, we describe: (1) how to assess mouse behavior systematically across a large genetic cohort, (2) how to dissect automation-derived longitudinal mouse behavior into quantitative parameters, and (3) how to map these quantitative traits to the genome, deriving loci underlying aspects of behavior.

  7. Regulatory heterochronies and loose temporal scaling between sea star and sea urchin regulatory circuits.

    PubMed

    Gildor, Tsvia; Hinman, Veronica; Ben-Tabou-De-Leon, Smadar

    2017-01-01

    It has long been argued that heterochrony, a change in relative timing of a developmental process, is a major source of evolutionary innovation. Heterochronic changes of regulatory gene activation could be the underlying molecular mechanism driving heterochronic changes through evolution. Here, we compare the temporal expression profiles of key regulatory circuits between sea urchin and sea star, representative of two classes of Echinoderms that shared a common ancestor about 500 million years ago. The morphologies of the sea urchin and sea star embryos are largely comparable, yet, differences in certain mesodermal cell types and ectodermal patterning result in distinct larval body plans. We generated high resolution temporal profiles of 17 mesodermally-, endodermally- and ectodermally-expressed regulatory genes in the sea star, Patiria miniata, and compared these to their orthologs in the Mediterranean sea urchin, Paracentrotus lividus. We found that the maternal to zygotic transition is delayed in the sea star compared to the sea urchin, in agreement with the longer cleavage stage in the sea star. Interestingly, the order of gene activation shows the highest variation in the relatively diverged mesodermal circuit, while the correlations of expression dynamics are the highest in the strongly conserved endodermal circuit. We detected loose scaling of the developmental rates of these species and observed interspecies heterochronies within all studied regulatory circuits. Thus, after 500 million years of parallel evolution, mild heterochronies between the species are frequently observed and the tight temporal scaling observed for closely related species no longer holds.

  8. Alternative Splicing of CHEK2 and Codeletion with NF2 Promote Chromosomal Instability in Meningioma

    PubMed Central

    Yang, Hong Wei; Kim, Tae-Min; Song, Sydney S; Shrinath, Nihal; Park, Richard; Kalamarides, Michel; Park, Peter J; Black, Peter M; Carroll, Rona S; Johnson, Mark D

    2012-01-01

    Mutations of the NF2 gene on chromosome 22q are thought to initiate tumorigenesis in nearly 50% of meningiomas, and 22q deletion is the earliest and most frequent large-scale chromosomal abnormality observed in these tumors. In aggressive meningiomas, 22q deletions are generally accompanied by the presence of large-scale segmental abnormalities involving other chromosomes, but the reasons for this association are unknown. We find that large-scale chromosomal alterations accumulate during meningioma progression primarily in tumors harboring 22q deletions, suggesting 22q-associated chromosomal instability. Here we show frequent codeletion of the DNA repair and tumor suppressor gene, CHEK2, in combination with NF2 on chromosome 22q in a majority of aggressive meningiomas. In addition, tumor-specific splicing of CHEK2 in meningioma leads to decreased functional Chk2 protein expression. We show that enforced Chk2 knockdown in meningioma cells decreases DNA repair. Furthermore, Chk2 depletion increases centrosome amplification, thereby promoting chromosomal instability. Taken together, these data indicate that alternative splicing and frequent codeletion of CHEK2 and NF2 contribute to the genomic instability and associated development of aggressive biologic behavior in meningiomas. PMID:22355270

  9. Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks

    PubMed Central

    Fogelmark, Karl; Peterson, Carsten; Troein, Carl

    2016-01-01

    Background Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. Methodology To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. Principal Findings We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks. PMID:26927540

  10. Predicting Response to Histone Deacetylase Inhibitors Using High-Throughput Genomics.

    PubMed

    Geeleher, Paul; Loboda, Andrey; Lenkala, Divya; Wang, Fan; LaCroix, Bonnie; Karovic, Sanja; Wang, Jacqueline; Nebozhyn, Michael; Chisamore, Michael; Hardwick, James; Maitland, Michael L; Huang, R Stephanie

    2015-11-01

    Many disparate biomarkers have been proposed as predictors of response to histone deacetylase inhibitors (HDI); however, all have failed when applied clinically. Rather than this being entirely an issue of reproducibility, response to the HDI vorinostat may be determined by the additive effect of multiple molecular factors, many of which have previously been demonstrated. We conducted a large-scale gene expression analysis using the Cancer Genome Project for discovery and generated another large independent cancer cell line dataset across different cancers for validation. We compared different approaches in terms of how accurately vorinostat response can be predicted on an independent out-of-batch set of samples and applied the polygenic marker prediction principles in a clinical trial. Using machine learning, the small effects that aggregate, resulting in sensitivity or resistance, can be recovered from gene expression data in a large panel of cancer cell lines. This approach can predict vorinostat response accurately, whereas single gene or pathway markers cannot. Our analyses recapitulated and contextualized many previous findings and suggest an important role for processes such as chromatin remodeling, autophagy, and apoptosis. As a proof of concept, we also discovered a novel causative role for CHD4, a helicase involved in the histone deacetylase complex that is associated with poor clinical outcome. As a clinical validation, we demonstrated that a common dose-limiting toxicity of vorinostat, thrombocytopenia, can be predicted (r = 0.55, P = .004) several days before it is detected clinically. Our work suggests a paradigm shift from single-gene/pathway evaluation to simultaneously evaluating multiple independent high-throughput gene expression datasets, which can be easily extended to other investigational compounds where similar issues are hampering clinical adoption. © The Author 2015. Published by Oxford University Press. All rights reserved.

  11. An Integrated Analysis of MicroRNA and mRNA Expression Profiles to Identify RNA Expression Signatures in Lambskin Hair Follicles in Hu Sheep

    PubMed Central

    Lv, Xiaoyang; Sun, Wei; Yin, Jinfeng; Ni, Rong; Su, Rui; Wang, Qingzeng; Gao, Wen; Bao, Jianjun; Yu, Jiarui; Wang, Lihong; Chen, Ling

    2016-01-01

    Wave patterns in lambskin hair follicles are an important factor determining the quality of sheep’s wool. Hair follicles in lambskin from Hu sheep, a breed unique to China, have 3 types of waves, designated as large, medium, and small. The quality of wool from small wave follicles is excellent, while the quality of large waves is considered poor. Because no molecular and biological studies on hair follicles of these sheep have been conducted to date, the molecular mechanisms underlying the formation of different wave patterns are currently unknown. The aim of this article was to screen the candidate microRNAs (miRNA) and genes for the development of hair follicles in Hu sheep. Two-day-old Hu lambs were selected from full-sib individuals that showed large, medium, and small waves. Integrated analysis of microRNA and mRNA expression profiles employed high-throughput sequencing technology. Approximately 13, 24, and 18 differentially expressed miRNAs were found between small and large waves, small and medium waves, and medium and large waves, respectively. A total of 54, 190, and 81 differentially expressed genes were found between small and large waves, small and medium waves, and medium and large waves, respectively, by RNA sequencing (RNA-seq) analysis. Differentially expressed genes were classified using gene ontology and pathway analyses. They were found to be mainly involved in cell differentiation, proliferation, apoptosis, growth, immune response, and ion transport, and were associated with MAPK and the Notch signaling pathway. Reverse transcription-polymerase chain reaction (RT-PCR) analyses of differentially-expressed miRNA and genes were consistent with sequencing results. Integrated analysis of miRNA and mRNA expression indicated that, compared to small waves, large waves included 4 downregulated miRNAs that had regulatory effects on 8 upregulated genes and 3 upregulated miRNAs, which in turn influenced 13 downregulated genes. Compared to small waves, medium waves included 13 downregulated miRNAs that had regulatory effects on 64 upregulated genes and 4 upregulated miRNAs, which in turn had regulatory effects on 22 downregulated genes. Compared to medium waves, large waves consisted of 13 upregulated miRNAs that had regulatory effects on 48 downregulated genes. These differentially expressed miRNAs and genes may play a significant role in forming different patterns, and provide evidence for the molecular mechanisms underlying the formation of hair follicles of varying patterns. PMID:27404636

  12. Chromatin organization and global regulation of Hox gene clusters

    PubMed Central

    Montavon, Thomas; Duboule, Denis

    2013-01-01

    During development, the properly coordinated expression of Hox genes within their different genomic clusters is critical for patterning the body plans of many animals with a bilateral symmetry. The fascinating correspondence between the topological organization of Hox clusters and their transcriptional activation in space and time has served as a paradigm for understanding the relationships between genome structure and function. Here, we review some recent observations, which revealed highly dynamic changes in the structure of chromatin at Hox clusters, in parallel with their activation during embryonic development. We discuss the relevance of these findings for our understanding of large-scale gene regulation. PMID:23650639

  13. Evaluation of RNAi and CRISPR technologies by large-scale gene expression profiling in the Connectivity Map.

    PubMed

    Smith, Ian; Greenside, Peyton G; Natoli, Ted; Lahr, David L; Wadden, David; Tirosh, Itay; Narayan, Rajiv; Root, David E; Golub, Todd R; Subramanian, Aravind; Doench, John G

    2017-11-01

    The application of RNA interference (RNAi) to mammalian cells has provided the means to perform phenotypic screens to determine the functions of genes. Although RNAi has revolutionized loss-of-function genetic experiments, it has been difficult to systematically assess the prevalence and consequences of off-target effects. The Connectivity Map (CMAP) represents an unprecedented resource to study the gene expression consequences of expressing short hairpin RNAs (shRNAs). Analysis of signatures for over 13,000 shRNAs applied in 9 cell lines revealed that microRNA (miRNA)-like off-target effects of RNAi are far stronger and more pervasive than generally appreciated. We show that mitigating off-target effects is feasible in these datasets via computational methodologies to produce a consensus gene signature (CGS). In addition, we compared RNAi technology to clustered regularly interspaced short palindromic repeat (CRISPR)-based knockout by analysis of 373 single guide RNAs (sgRNAs) in 6 cell lines and show that the on-target efficacies are comparable, but CRISPR technology is far less susceptible to systematic off-target effects. These results will help guide the proper use and analysis of loss-of-function reagents for the determination of gene function.
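
    The idea of a consensus gene signature can be illustrated with a minimal sketch. The abstract does not specify how the CGS is computed, so the per-gene median used here is an assumption chosen for illustration: it combines signatures from several shRNAs that target the same gene, so that an off-target effect seen with only one hairpin is suppressed. The function name, gene labels and values are all hypothetical.

```python
from statistics import median


def consensus_signature(signatures):
    """Combine several {gene: score} signatures into one per-gene median.

    Assumption: a plain median stands in for the consensus step; the
    abstract does not describe the actual CGS computation.
    """
    if not signatures:
        raise ValueError("need at least one signature")
    # Keep only genes measured in every signature.
    genes = set(signatures[0])
    for sig in signatures[1:]:
        genes &= set(sig)
    return {g: median(sig[g] for sig in signatures) for g in sorted(genes)}


if __name__ == "__main__":
    # Three hypothetical shRNAs against the same target; the second one
    # shows an outlying (possibly off-target) response for gene "B".
    hairpins = [
        {"A": 1.0, "B": -2.0},
        {"A": 1.2, "B": 3.0},
        {"A": 0.8, "B": -2.2},
    ]
    print(consensus_signature(hairpins))
```

    A median is used in the sketch because it is robust to a single outlying hairpin; a weighted average would be another reasonable choice under different assumptions.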

  14. A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries.

    PubMed

    Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

    2000-06-30

    For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.

  15. Finding gene regulatory network candidates using the gene expression knowledge base.

    PubMed

    Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

    2014-12-10

    Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

  16. Massive activation of archaeal defense genes during viral infection.

    PubMed

    Quax, Tessa E F; Voet, Marleen; Sismeiro, Odile; Dillies, Marie-Agnes; Jagla, Bernd; Coppée, Jean-Yves; Sezonov, Guennadi; Forterre, Patrick; van der Oost, John; Lavigne, Rob; Prangishvili, David

    2013-08-01

    Archaeal viruses display unusually high genetic and morphological diversity. Studies of these viruses proved to be instrumental for the expansion of knowledge on viral diversity and evolution. The Sulfolobus islandicus rod-shaped virus 2 (SIRV2) is a model to study virus-host interactions in Archaea. It is a lytic virus that exploits a unique egress mechanism based on the formation of remarkable pyramidal structures on the host cell envelope. Using whole-transcriptome sequencing, we present here a global map defining host and viral gene expression during the infection cycle of SIRV2 in its hyperthermophilic host S. islandicus LAL14/1. This information was used, in combination with a yeast two-hybrid analysis of SIRV2 protein interactions, to advance current understanding of viral gene functions. As a consequence of SIRV2 infection, transcription of more than one-third of S. islandicus genes was differentially regulated. While expression of genes involved in cell division decreased, those genes playing a role in antiviral defense were activated on a large scale. Expression of genes belonging to toxin-antitoxin and clustered regularly interspaced short palindromic repeat (CRISPR)-Cas systems was specifically pronounced. The observed different degree of activation of various CRISPR-Cas systems highlights the specialized functions they perform. The information on individual gene expression and activation of antiviral defense systems is expected to aid future studies aimed at detailed understanding of the functions and interplay of these systems in vivo.

  17. Genetic and pharmacological reactivation of the mammalian inactive X chromosome

    PubMed Central

    Bhatnagar, Sanchita; Zhu, Xiaochun; Ou, Jianhong; Lin, Ling; Chamberlain, Lynn; Zhu, Lihua J.; Wajapeyee, Narendra; Green, Michael R.

    2014-01-01

    X-chromosome inactivation (XCI), the random transcriptional silencing of one X chromosome in somatic cells of female mammals, is a mechanism that ensures equal expression of X-linked genes in both sexes. XCI is initiated in cis by the noncoding Xist RNA, which coats the inactive X chromosome (Xi) from which it is produced. However, trans-acting factors that mediate XCI remain largely unknown. Here, we perform a large-scale RNA interference screen to identify trans-acting XCI factors (XCIFs) that comprise regulators of cell signaling and transcription, including the DNA methyltransferase, DNMT1. The expression pattern of the XCIFs explains the selective onset of XCI following differentiation. The XCIFs function, at least in part, by promoting expression and/or localization of Xist to the Xi. Surprisingly, we find that DNMT1, which is generally a transcriptional repressor, is an activator of Xist transcription. Small-molecule inhibitors of two of the XCIFs can reversibly reactivate the Xi, which has implications for treatment of Rett syndrome and other dominant X-linked diseases. A homozygous mouse knockout of one of the XCIFs, stanniocalcin 1 (STC1), has an expected XCI defect but surprisingly is phenotypically normal. Remarkably, X-linked genes are not overexpressed in female Stc1−/− mice, revealing the existence of a mechanism(s) that can compensate for a persistent XCI deficiency to regulate X-linked gene expression. PMID:25136103

  18. Transcription through the eye of a needle: daily and annual cyclic gene expression variation in Douglas-fir needles.

    PubMed

    Cronn, Richard; Dolan, Peter C; Jogdeo, Sanjuro; Wegrzyn, Jill L; Neale, David B; St Clair, J Bradley; Denver, Dee R

    2017-07-24

    Perennial growth in plants is the product of interdependent cycles of daily and annual stimuli that induce cycles of growth and dormancy. In conifers, needles are the key perennial organ that integrates daily and seasonal signals from light, temperature, and water availability. To understand the relationship between seasonal cycles and seasonal gene expression responses in conifers, we examined needle mRNA accumulation in Douglas-fir (Pseudotsuga menziesii) at diurnal and circannual scales. Using mRNA sequencing, we sampled 6.1 × 10^9 reads from 19 trees and constructed a de novo pan-transcriptome reference that includes 173,882 tree-derived transcripts. Using this reference, we mapped RNA-Seq reads from 179 samples that capture daily and annual variation. We identified 12,042 diurnally-cyclic transcripts, 9299 of which showed homology to annotated genes from other plant genomes, including angiosperm core clock genes. Annual analysis revealed 21,225 circannual transcripts, 17,335 of which showed homology to annotated genes from other plant genomes. The timing of maximum gene expression is associated with light intensity at diurnal scales and photoperiod at annual scales, with approximately half of transcripts reaching maximum expression +/- 2 h from sunrise and sunset, and +/- 20 days from winter and summer solstices. Comparisons with published studies from other conifers show congruent behavior in clock genes with Japanese cedar (Cryptomeria), and a significant preservation of gene expression patterns for 2278 putative orthologs from Douglas-fir during the summer growing season, and 760 putative orthologs from spruce (Picea) during the transition from fall to winter. Our study highlights the extensive diurnal and circannual transcriptome variability demonstrated in conifer needles. At these temporal scales, 29% of expressed transcripts show a significant diurnal cycle, and 58.7% show a significant circannual cycle. Remarkably, thousands of genes reach their annual peak activity during winter dormancy. Our study establishes the fine-scale timing of daily and annual maximum gene expression for diverse needle genes in Douglas-fir, and it highlights the potential for using this information for evaluating hypotheses concerning the daily or seasonal timing of gene activity in temperate-zone conifers, and for identifying cyclic transcriptome components in other conifer species.

  19. The transcriptional response of Escherichia coli to recombinant protein insolubility.

    PubMed

    Smith, Harold E

    2007-03-01

    Bacterial production of recombinant proteins offers several advantages over alternative expression methods and remains the system of choice for many structural genomics projects. However, a large percentage of targets accumulate as insoluble inclusion bodies rather than soluble protein, creating a significant bottleneck in the protein production pipeline. Numerous strategies have been reported that can improve in vivo protein solubility, but most do not scale easily for high-throughput expression screening. To better understand the host cell response to the accumulation of insoluble protein, we determined genome-wide changes in bacterial gene expression upon induction of either soluble or insoluble target proteins. By comparing transcriptional profiles for multiple examples from the soluble or insoluble class, we identified a pattern of gene expression that correlates strongly with protein solubility. Direct targets of the sigma32 heat shock sigma factor, which include genes involved in protein folding and degradation, were highly expressed in response to induction of insoluble protein. This same group of genes was also upregulated by insoluble protein accumulation under a different growth regime, indicating that sigma32-mediated gene expression is a general response to protein insolubility. This knowledge provides a starting point for the rational design of growth parameters and host strains with improved protein solubility characteristics. Summary Problems with protein solubility are frequently encountered when recombinant proteins are expressed in E. coli. The bacterial host responds to this problem by increasing expression of the protein folding machinery via the heat shock sigma factor sigma32. Manipulation of the sigma32 regulon might provide a general mechanism for improving recombinant protein solubility.

  20. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

    PubMed Central

    2010-01-01

    Background Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results We have identified a total of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. From the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expression of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes was confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. First we confirmed that CYP93C5 (an isoflavone synthase gene) is co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions The genome scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474
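
    The co-expression idea in the abstract above (genes with similar expression patterns across conditions may be functionally related) can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure and a cutoff of 0.9; the abstract states neither, and the gene names and profiles below are hypothetical toy values, not data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpressed(bait, profiles, threshold=0.9):
    """Rank genes whose profile correlates with the bait gene at or above the threshold."""
    hits = [(gene, pearson(profiles[bait], p))
            for gene, p in profiles.items() if gene != bait]
    # Keep only strong positive correlations, strongest first.
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])

# Hypothetical expression values across five conditions (assumption, for illustration).
profiles = {
    "bait_P450": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_A": [2.1, 4.0, 6.2, 7.9, 10.0],   # rises with the bait
    "gene_B": [5.0, 4.0, 3.0, 2.0, 1.0],    # mirrors the bait
    "gene_C": [3.0, 1.0, 4.0, 1.0, 5.0],    # weakly related
}
partners = coexpressed("bait_P450", profiles)
```

    In this toy setup only gene_A passes the cutoff; a real analysis would also need normalization and multiple-testing control, which the sketch omits.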

  1. Meta-Profiles of Gene Expression during Aging: Limited Similarities between Mouse and Human and an Unexpectedly Decreased Inflammatory Signature

    PubMed Central

    Swindell, William R.; Johnston, Andrew; Sun, Liou; Xing, Xianying; Fisher, Gary J.; Bulyk, Martha L.; Elder, James T.; Gudjonsson, Johann E.

    2012-01-01

    Background Skin aging is associated with intrinsic processes that compromise the structure of the extracellular matrix while promoting loss of functional and regenerative capacity. These processes are accompanied by a large-scale shift in gene expression, but underlying mechanisms are not understood and conservation of these mechanisms between humans and mice is uncertain. Results We used genome-wide expression profiling to investigate the aging skin transcriptome. In humans, age-related shifts in gene expression were sex-specific. In females, aging increased expression of transcripts associated with T-cells, B-cells and dendritic cells, and decreased expression of genes in regions with elevated Zeb1, AP-2 and YY1 motif density. In males, however, these effects were contrasting or absent. When age-associated gene expression patterns in human skin were compared to those in tail skin from CB6F1 mice, overall human-mouse correspondence was weak. Moreover, inflammatory gene expression patterns were not induced with aging of mouse tail skin, and well-known aging biomarkers were in fact decreased (e.g., Clec7a, Lyz1 and Lyz2). These unexpected patterns and weak human-mouse correspondence may be due to decreased abundance of antigen presenting cells in mouse tail skin with age. Conclusions Aging is generally associated with a pro-inflammatory state, but we have identified an exception to this pattern with aging of CB6F1 mouse tail skin. Aging therefore does not uniformly heighten inflammatory status across all mouse tissues. Furthermore, we identified both intercellular and intracellular mechanisms of transcriptome aging, including those that are sex- and species-specific. PMID:22413003

  2. AREB1, AREB2, and ABF3 are master transcription factors that cooperatively regulate ABRE-dependent ABA signaling involved in drought stress tolerance and require ABA for full activation.

    PubMed

    Yoshida, Takuya; Fujita, Yasunari; Sayama, Hiroko; Kidokoro, Satoshi; Maruyama, Kyonoshin; Mizoi, Junya; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

    2010-02-01

    A myriad of drought stress-inducible genes have been reported, and many of these are activated by abscisic acid (ABA). In the promoter regions of such ABA-regulated genes, conserved cis-elements, designated ABA-responsive elements (ABREs), control gene expression via bZIP-type AREB/ABF transcription factors. Although all three members of the AREB/ABF subfamily, AREB1, AREB2, and ABF3, are upregulated by ABA and water stress, it remains unclear whether these are functional homologs. Here, we report that all three AREB/ABF transcription factors require ABA for full activation, can form hetero- or homodimers to function in nuclei, and can interact with SRK2D/SnRK2.2, an SnRK2 protein kinase that was identified as a regulator of AREB1. Along with the tissue-specific expression patterns of these genes and the subcellular localization of their encoded proteins, these findings clearly indicate that AREB1, AREB2, and ABF3 have largely overlapping functions. To elucidate the role of these AREB/ABF transcription factors, we generated an areb1 areb2 abf3 triple mutant. Large-scale transcriptome analysis, which showed that stress-responsive gene expression is remarkably impaired in the triple mutant, revealed novel AREB/ABF downstream genes in response to water stress, including many LEA class and group-Ab PP2C genes and transcription factors. The areb1 areb2 abf3 triple mutant is more resistant to ABA than are the other single and double mutants with respect to primary root growth, and it displays reduced drought tolerance. Thus, these results indicate that AREB1, AREB2, and ABF3 are master transcription factors that cooperatively regulate ABRE-dependent gene expression for ABA signaling under conditions of water stress.

  3. CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

    PubMed Central

    Baumbach, Jan

    2007-01-01

    Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. Like the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service based access to up to date gene annotation data from GenDB. Conclusion The release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of procaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at . PMID:17986320

  4. Annotating novel genes by integrating synthetic lethals and genomic information

    PubMed Central

    Schöner, Daniel; Kalisch, Markus; Leisner, Christian; Meier, Lukas; Sohrmann, Marc; Faty, Mahamadou; Barral, Yves; Peter, Matthias; Gruissem, Wilhelm; Bühlmann, Peter

    2008-01-01

    Background Large scale screening for synthetic lethality serves as a common tool in yeast genetics to systematically search for genes that play a role in specific biological processes. Often the amounts of data resulting from a single large scale screen far exceed the capacities of experimental characterization of every identified target. Thus, there is a need for computational tools that select promising candidate genes in order to reduce the number of follow-up experiments to a manageable size. Results We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, in order to identify novel members in this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Unlike existing methods that require large amounts of synthetic lethal data, our method relies merely on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assign the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates and we present she1 (YBL031W) as a novel gene involved in spindle migration. We also applied the statistical methodology to TOR2 signaling as another example. Conclusion We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process. PMID:18194531
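
    The two-group assignment described above can be sketched with a much simpler stand-in: a univariate, two-component Gaussian mixture fitted by expectation-maximization. This is not the authors' multivariate model and it performs no feature-subset selection; the gene labels and scores are hypothetical toy values chosen only to show how a mixture separates a small candidate group from the background.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return exp(-(x - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)

def fit_two_gaussians(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, responsibilities)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                                   # start at the extremes
    variances = [((hi - lo) / 4) ** 2 or 1.0] * 2      # fall back to 1.0 if all equal
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)              # floor avoids a collapsed component
    return weights, means, variances, resp

# Hypothetical per-gene integrated scores (assumption, for illustration only).
genes = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
scores = [0.1, 0.2, 0.15, 0.3, 0.25, 2.9, 3.1, 3.0]
weights, means, variances, resp = fit_two_gaussians(scores)
# Genes assigned to the high-scoring component are the candidates.
candidates = [g for g, r in zip(genes, resp) if r[1] > 0.5]
```

    Initializing the two means at the minimum and maximum keeps the component order stable, so index 1 is always the high-scoring group in this sketch.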

  5. Reconstruction of an Integrated Genome-Scale Co-Expression Network Reveals Key Modules Involved in Lung Adenocarcinoma

    PubMed Central

    Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali

    2013-01-01

    The goal of this study was to reconstruct a “genome-scale co-expression network” and find important modules in lung adenocarcinoma so that we could identify the genes involved in lung adenocarcinoma. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes a co-expression network was reconstructed from the co-expression data. The reconstructed network was named “genome-scale co-expression network”. As the next step, 23 key modules were disclosed through clustering. In this study a number of genes have been identified for the first time to be implicated in lung adenocarcinoma by analyzing the modules. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of the phenomena, namely cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. In a few modules, genes such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9 are present that play a crucial role in cell cycle progression. In addition to the mentioned genes, there are some other genes (e.g., DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, DOK2 and FUS) in the modules. PMID:23874428

  6. Reconstruction of an integrated genome-scale co-expression network reveals key modules involved in lung adenocarcinoma.

    PubMed

    Bidkhori, Gholamreza; Narimani, Zahra; Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali

    2013-01-01

    The goal of this study was to reconstruct a "genome-scale co-expression network" and find important modules in lung adenocarcinoma so that we could identify the genes involved in lung adenocarcinoma. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes a co-expression network was reconstructed from the co-expression data. The reconstructed network was named "genome-scale co-expression network". As the next step, 23 key modules were disclosed through clustering. In this study a number of genes have been identified for the first time to be implicated in lung adenocarcinoma by analyzing the modules. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of the phenomena, namely cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. In a few modules, genes such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9 are present that play a crucial role in cell cycle progression. In addition to the mentioned genes, there are some other genes (e.g., DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, DOK2 and FUS) in the modules.

  7. Screening and identification of gastric adenocarcinoma metastasis-related genes using cDNA microarray coupled to FDD-PCR.

    PubMed

    Wang, Jianhua; Chen, Shishu

    2002-10-01

    To identify certain gastric adenocarcinoma metastasis-related genes, an RF-1 cell line (primary tumor from a gastric adenocarcinoma patient) and an RF-48 cell line (its metastatic counterpart) were used as a model for studying the molecular mechanism of tumor metastasis. Two fluorescent cDNA probes, labeled with Cy3 and Cy5 dyes, were prepared from RF-1 and RF-48 mRNA samples by the reverse transcription method. The two color probes were then mixed and hybridized to a cDNA chip constructed with double-dots from 4,096 human genes, and scanned at two wavelengths. The experiment was repeated twice. Differentially expressed genes from the above two cell lines were analyzed by computer. Of the total genes, 138 (3.4%) revealed differential expression in RF-48 cells compared with RF-1 cells: 81 (2.1%) genes revealed apparent up-regulation, and 56 (1.3%) genes revealed down-regulation. Forty-five genes involved in gastric adenocarcinoma metastasis were cloned using fluorescent differential display-PCR (FDD-PCR), including three novel genes. There were seven differentially expressed genes that presented the same behaviour under both detection methods. The possible roles of some differentially expressed genes, which may be involved in the mechanism of tumor metastasis, were discussed. The cDNA chip was used to analyze gene expression in a high-throughput and large-scale manner in combination with FDD-PCR for cloning unknown novel genes. Some genes related to metastasis were preliminarily screened, which would help disclose the molecular mechanism of gastric adenocarcinoma metastasis and provide new targets for therapeutic intervention.

  8. Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis.

    PubMed

    Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich

    2015-12-16

    Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variations. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
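
    Three of the concordance measures named above (Pearson's r, RMSD and the Matthews correlation coefficient) can be computed on paired log2 fold changes as in the following minimal sketch. The fold-change values and the |log2FC| >= 1 cutoff for calling a gene differentially expressed are assumptions made for illustration; the abstract does not give the thresholds or data used in the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def rmsd(x, y):
    """Root-mean-square deviation between paired values."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def mcc(truth, calls):
    """Matthews correlation coefficient for paired boolean labels."""
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical log2 fold changes for six genes on two platforms (toy values).
reference_lfc = [2.0, -1.5, 0.1, 0.0, 3.0, -2.2]
targeted_lfc = [1.8, -1.2, 0.3, -0.1, 2.7, -2.5]

CUTOFF = 1.0  # assumed |log2FC| threshold for a differential-expression call
ref_calls = [abs(v) >= CUTOFF for v in reference_lfc]
tgt_calls = [abs(v) >= CUTOFF for v in targeted_lfc]

r = pearson(reference_lfc, targeted_lfc)
deviation = rmsd(reference_lfc, targeted_lfc)
agreement = mcc(ref_calls, tgt_calls)
```

    Pearson's r and RMSD compare the continuous fold changes, while the Matthews coefficient scores agreement of the binary calls, so the three measures can disagree when the platforms differ mainly near the cutoff.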

  9. Temporal transcriptome profiling reveals expression partitioning of homeologous genes contributing to heat and drought acclimation in wheat (Triticum aestivum L.).

    PubMed

    Liu, Zhenshan; Xin, Mingming; Qin, Jinxia; Peng, Huiru; Ni, Zhongfu; Yao, Yingyin; Sun, Qixin

    2015-06-20

    Hexaploid wheat (Triticum aestivum) is a globally important crop. Heat, drought and their combination dramatically reduce wheat yield and quality, but the molecular mechanisms underlying wheat tolerance to extreme environments, especially stress combination, are largely unknown. As an allohexaploid, wheat consists of three closely related subgenomes (A, B, and D), and was reported to show improved tolerance to stress conditions compared to tetraploid wheat. But so far very little is known about how wheat coordinates the expression of homeologous genes to cope with various environmental constraints at the whole-genome level. To explore the transcriptional response of wheat to the individual and combined stresses, we performed high-throughput transcriptome sequencing of seedlings under normal conditions and subjected to drought stress (DS), heat stress (HS) and their combination (HD) for 1 h and 6 h, and present the global gene expression reprogramming in response to these three stresses. Gene Ontology (GO) enrichment analysis of DS, HS and HD responsive genes revealed overlapping and complex functional pathways among the three responses. Moreover, 4,375 wheat transcription factors were identified on a whole-genome scale based on the scaffold information released by IWGSC, and 1,328 were responsive to stress treatments. The regulatory network analysis of HSFs and DREBs then indicated that both were involved in the regulation of the DS, HS and HD responses and pointed to cross-talk between heat and drought stress. Finally, approximately 68.4 % of homeologous genes were found to exhibit expression partitioning in response to DS, HS or HD, which was further confirmed by using quantitative RT-PCR and Nullisomic-Tetrasomic lines. A large proportion of wheat homeologs exhibited expression partitioning under normal and abiotic stresses, which possibly contributes to the wide adaptability and distribution of hexaploid wheat in response to various environmental constraints.

  10. Genome-wide identification and expression analysis of SBP-like transcription factor genes in Moso Bamboo (Phyllostachys edulis).

    PubMed

    Pan, Feng; Wang, Yue; Liu, Huanglong; Wu, Min; Chu, Wenyuan; Chen, Danmei; Xiang, Yan

    2017-06-27

    The SQUAMOSA promoter binding protein-like (SPL) proteins are plant-specific transcription factors (TFs) that function in a variety of developmental processes including growth, flower development, and signal transduction. SPL proteins are encoded by a gene family, and these genes have been characterized in two model grass species, Zea mays and Oryza sativa. The SPL gene family has not been well studied in moso bamboo (Phyllostachys edulis), a woody grass species. We identified 32 putative PeSPL genes in the P. edulis genome. Phylogenetic analysis arranged the PeSPL protein sequences in eight groups. Similarly, phylogenetic analysis of the SBP-like and SBP proteins from rice and maize clustered them into eight groups analogous to those from P. edulis. Furthermore, the deduced PeSPL proteins in each group contained very similar conserved sequence motifs. Our analyses indicate that the PeSPL genes experienced a large-scale duplication event ~15 million years ago (MYA), and that divergence between the PeSPL and OsSPL genes occurred 34 MYA. The stress-response expression profiles and tissue-specificity of the putative PeSPL gene promoter regions showed that SPL genes in moso bamboo have potential biological functions in stress resistance as well as in growth and development. We therefore examined PeSPL gene expression in response to different plant hormone and drought (polyethylene glycol-6000; PEG) treatments to mimic biotic and abiotic stresses. Expression of three (PeSPL10, -12, -17), six (PeSPL1, -10, -12, -17, -20, -31), and nine (PeSPL5, -8, -9, -14, -15, -19, -20, -31, -32) genes remained relatively stable after treating with salicylic acid (SA), gibberellic acid (GA), and PEG, respectively, while the expression patterns of other genes changed. 
    In addition, analysis of tissue-specific expression of the moso bamboo SPL genes during development showed differences in their spatiotemporal expression patterns, and many were expressed at high levels in flowers and leaves. The PeSPL genes play important roles in plant growth and development, including responses to stresses, and most of the genes are expressed in different tissues. Our study provides a comprehensive understanding of the PeSPL gene family and may enable future studies on the function and evolution of SPL genes in moso bamboo.

  11. An interactive web application for the dissemination of human systems immunology data.

    PubMed

    Speake, Cate; Presnell, Scott; Domico, Kelly; Zeitner, Brad; Bjork, Anna; Anderson, David; Mason, Michael J; Whalen, Elizabeth; Vargas, Olivia; Popov, Dimitry; Rinchai, Darawan; Jourde-Chiche, Noemie; Chiche, Laurent; Quinn, Charlie; Chaussabel, Damien

    2015-06-19

    Systems immunology approaches have proven invaluable in translational research settings. The current rate at which large-scale datasets are generated presents unique challenges and opportunities. Mining aggregates of these datasets could accelerate the pace of discovery, but new solutions are needed to integrate the heterogeneous data types with the contextual information that is necessary for interpretation. In addition, enabling tools and technologies facilitating investigators' interaction with large-scale datasets must be developed in order to promote insight and foster knowledge discovery. State-of-the-art application programming was employed to develop an interactive web application for browsing and visualizing large and complex datasets. A collection of human immune transcriptome datasets was loaded alongside contextual information about the samples. We provide a resource enabling interactive query and navigation of transcriptome datasets relevant to human immunology research. Detailed information about studies and samples is displayed dynamically; if desired, the associated data can be downloaded. Custom interactive visualizations of the data can be shared via email or social media. This application can be used to browse context-rich systems-scale data within and across systems immunology studies. This resource is publicly available online at https://gxb.benaroyaresearch.org/dm3/landing.gsp (Gene Expression Browser landing page). The source code is also openly available at https://github.com/BenaroyaResearch/gxbrowser (Gene Expression Browser source code). We have developed a data browsing and visualization application capable of navigating increasingly large and complex datasets generated in the context of immunological studies. This intuitive tool ensures that, whether taken individually or as a whole, such datasets generated at great effort and expense remain interpretable and a ready source of insight for years to come.

  12. Identification of novel diagnostic biomarkers for thyroid carcinoma.

    PubMed

    Wang, Xiliang; Zhang, Qing; Cai, Zhiming; Dai, Yifan; Mou, Lisha

    2017-12-19

    Thyroid carcinoma (THCA) is the most common endocrine malignancy worldwide. Unfortunately, only a limited number of large-scale analyses have been performed to identify biomarkers for THCA. Here, we conducted a meta-analysis using 505 THCA patients and 59 normal controls from The Cancer Genome Atlas. After identifying differentially expressed long non-coding RNAs (lncRNAs) and protein-coding genes (PCGs), we found vast differences in various lncRNA-PCG co-expressed pairs in THCA. A dysregulation network with scale-free topology was constructed. Four molecules (LA16c-380H5.2, RP11-203J24.8, MLF1 and SDC4) could potentially serve as diagnostic biomarkers of THCA with high sensitivity and specificity. We further present a diagnostic panel with expression cutoff values. Our results demonstrate the potential application of those four molecules as novel independent biomarkers for THCA diagnosis.

  13. Exploring root symbiotic programs in the model legume Medicago truncatula using EST analysis.

    PubMed

    Journet, Etienne-Pascal; van Tuinen, Diederik; Gouzy, Jérome; Crespeau, Hervé; Carreau, Véronique; Farmer, Mary-Jo; Niebel, Andreas; Schiex, Thomas; Jaillon, Olivier; Chatagnier, Odile; Godiard, Laurence; Micheli, Fabienne; Kahn, Daniel; Gianinazzi-Pearson, Vivienne; Gamas, Pascal

    2002-12-15

    We report on a large-scale expressed sequence tag (EST) sequencing and analysis program aimed at characterizing the sets of genes expressed in roots of the model legume Medicago truncatula during interactions with either of two microsymbionts, the nitrogen-fixing bacterium Sinorhizobium meliloti or the arbuscular mycorrhizal fungus Glomus intraradices. We have designed specific tools for in silico analysis of EST data, in relation to chimeric cDNA detection, EST clustering, encoded protein prediction, and detection of differential expression. Our 21,473 5'- and 3'-ESTs could be grouped into 6,359 EST clusters, corresponding to distinct virtual genes, along with 52,498 other M. truncatula ESTs available in the dbEST (NCBI) database that were recruited in the process. These clusters were manually annotated, using a specifically developed annotation interface. Analysis of EST cluster distribution in various M. truncatula cDNA libraries, supported by a refined R test to evaluate statistical significance and by 'electronic northern' representation, enabled us to identify a large number of novel genes predicted to be up- or down-regulated during either symbiotic root interaction. These in silico analyses provide a first global view of the genetic programs for root symbioses in M. truncatula. A searchable database has been built and can be accessed through a public interface.

  14. Optimal Reference Genes for Gene Expression Normalization in Trichomonas vaginalis.

    PubMed

    dos Santos, Odelta; de Vargas Rigo, Graziela; Frasson, Amanda Piccoli; Macedo, Alexandre José; Tasca, Tiana

    2015-01-01

    Trichomonas vaginalis is the etiologic agent of trichomonosis, the most common non-viral sexually transmitted disease worldwide. This infection is associated with several health consequences, including cervical and prostate cancers and HIV acquisition. Gene expression analysis has been facilitated because of available genome sequences and large-scale transcriptomes in T. vaginalis, particularly using quantitative real-time polymerase chain reaction (qRT-PCR), one of the most widely used methods for molecular studies. Reference genes for normalization are crucial to ensure the accuracy of this method. However, to the best of our knowledge, a systematic validation of reference genes has not been performed for T. vaginalis. In this study, the transcripts of nine candidate reference genes were quantified using qRT-PCR under different cultivation conditions, and the stability of these genes was compared using the geNorm and NormFinder algorithms. The most stable reference genes were α-tubulin, actin and DNATopII, and, conversely, the widely used T. vaginalis reference genes GAPDH and β-tubulin were less stable. The PFOR gene was used to validate the reliability of the use of these candidate reference genes. As expected, when the DNATopII, α-tubulin and actin genes were used as normalizing genes, the PFOR gene was upregulated in trophozoites cultivated with ferrous ammonium sulfate. By contrast, the PFOR gene was downregulated when the GAPDH gene was used as an internal control, leading to misinterpretation of the data. These results provide an important starting point for reference gene selection and gene expression analysis with qRT-PCR studies of T. vaginalis.
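
    The geNorm stability ranking mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly described definition of the geNorm measure M: for each candidate gene, the mean, over all other candidates, of the standard deviation across samples of the log2 expression ratio. The gene names and relative quantities below are hypothetical and are not data from the study; a lower M indicates a more stable candidate.

```python
import math
from statistics import stdev


def genorm_m(quantities):
    """Return {gene: M} for a dict mapping gene -> relative quantities per sample.

    M for gene j is the mean, over every other gene k, of the standard
    deviation across samples of log2(quantity_j / quantity_k).
    """
    genes = list(quantities)
    m_values = {}
    for j in genes:
        pairwise = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [
                math.log2(a / b) for a, b in zip(quantities[j], quantities[k])
            ]
            pairwise.append(stdev(log_ratios))
        m_values[j] = sum(pairwise) / len(pairwise)
    return m_values


# Hypothetical relative quantities for three candidates in four samples.
example = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],  # constant ratio to geneA
    "geneC": [1.0, 4.0, 2.0, 16.0],  # ratio to geneA varies between samples
}

if __name__ == "__main__":
    for gene, m in sorted(genorm_m(example).items(), key=lambda item: item[1]):
        print(gene, round(m, 3))
```

    In this toy input, geneA and geneB keep a constant ratio to each other and therefore obtain the same, lower M, while geneC is ranked as the least stable candidate.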

  15. Optimal Reference Genes for Gene Expression Normalization in Trichomonas vaginalis

    PubMed Central

    dos Santos, Odelta; de Vargas Rigo, Graziela; Frasson, Amanda Piccoli; Macedo, Alexandre José; Tasca, Tiana

    2015-01-01

    Trichomonas vaginalis is the etiologic agent of trichomonosis, the most common non-viral sexually transmitted disease worldwide. This infection is associated with several health consequences, including cervical and prostate cancers and HIV acquisition. Gene expression analysis has been facilitated because of available genome sequences and large-scale transcriptomes in T. vaginalis, particularly using quantitative real-time polymerase chain reaction (qRT-PCR), one of the most widely used methods for molecular studies. Reference genes for normalization are crucial to ensure the accuracy of this method. However, to the best of our knowledge, a systematic validation of reference genes has not been performed for T. vaginalis. In this study, the transcripts of nine candidate reference genes were quantified using qRT-PCR under different cultivation conditions, and the stability of these genes was compared using the geNorm and NormFinder algorithms. The most stable reference genes were α-tubulin, actin and DNATopII, and, conversely, the widely used T. vaginalis reference genes GAPDH and β-tubulin were less stable. The PFOR gene was used to validate the reliability of the use of these candidate reference genes. As expected, when the DNATopII, α-tubulin and actin genes were used as normalizing genes, the PFOR gene was upregulated in trophozoites cultivated with ferrous ammonium sulfate. By contrast, the PFOR gene was downregulated when the GAPDH gene was used as an internal control, leading to misinterpretation of the data. These results provide an important starting point for reference gene selection and gene expression analysis with qRT-PCR studies of T. vaginalis. PMID:26393928

  16. Gene expression metadata analysis reveals molecular mechanisms employed by Phanerochaete chrysosporium during lignin degradation and detoxification of plant extractives.

    PubMed

    Kameshwar, Ayyappa Kumar Sista; Qin, Wensheng

    2017-10-01

    Lignin, the most complex and abundant biopolymer on the earth's surface, derives its stability from intricate polyphenolic units and non-phenolic bonds, making it difficult to depolymerize or separate from other units of biomass. Its unusual lignin-degrading ability and the availability of an annotated genome make Phanerochaete chrysosporium ideal for studying lignin-degrading mechanisms. Decoding and understanding the molecular mechanisms underlying the process of lignin degradation will significantly aid the growing biofuel industries and lead to the production of commercially vital platform chemicals. In this study, we performed a large-scale metadata analysis to understand the common gene expression patterns of P. chrysosporium during lignin degradation. Gene expression datasets were retrieved from the NCBI GEO database and analyzed using GEO2R and Bioconductor packages. Statistically significant genes commonly expressed among different datasets were further considered to understand their involvement in lignin degradation and detoxification mechanisms. We observed three sets of enzymes commonly expressed during ligninolytic conditions, which were classified into primary ligninolytic, aromatic compound-degrading and other necessary enzymes. Similarly, we observed three sets of genes coding for detoxification and stress-responsive, phase I and phase II metabolic enzymes. The results obtained in this study indicate the coordinated action of enzymes involved in lignin depolymerization and detoxification-stress responses under ligninolytic conditions. We developed a tentative network of genes and enzymes involved in lignin degradation and detoxification mechanisms by P. chrysosporium based on the literature and the results obtained in this study. However, the ambiguity arising from the higher expression of several uncharacterized proteins necessitates further proteomic studies in P. chrysosporium.

  17. Evolution and Expression Patterns of TCP Genes in Asparagales

    PubMed Central

    Madrigal, Yesenia; Alzate, Juan F.; Pabón-Mora, Natalia

    2017-01-01

    CYCLOIDEA-like genes are involved in the symmetry gene network, limiting cell proliferation in the dorsal regions of bilateral flowers in core eudicots. CYC-like and closely related TCP genes (acronym for TEOSINTE BRANCHED1, CYCLOIDEA, and PROLIFERATING CELL FACTOR) have been poorly studied in Asparagales, the largest order of monocots that includes both bilateral flowers in Orchidaceae (ca. 25,000 spp.) and radially symmetrical flowers in Hypoxidaceae (ca. 200 spp.). With the aim of assessing TCP gene evolution in the Asparagales, we isolated TCP-like genes from publicly available databases and our own transcriptomes of Cattleya trianae (Orchidaceae) and Hypoxis decumbens (Hypoxidaceae). Our matrix contains 452 sequences representing the three major clades of TCP genes. Besides the previously identified CYC-specific core eudicot duplications, our ML phylogenetic analyses recovered an early CIN-like duplication predating all angiosperms, two CIN-like Asparagales-specific duplications and a duplication prior to the diversification of Orchidoideae and Epidendroideae. In addition, we provide evidence of at least three duplications of PCF-like genes in Asparagales. While CIN-like and PCF-like genes have multiplied in Asparagales, likely enhancing the genetic network for cell proliferation, CYC-like genes remain as single, shorter copies with low expression. Homogeneous expression of CYC-like genes in the labellum as well as the lateral petals suggests little contribution to the bilateral perianth in C. trianae. CIN-like and PCF-like gene expression suggests conserved roles in cell proliferation in leaves, sepals and petals, carpels, ovules and fruits in Asparagales by comparison with previously reported functions in core eudicots and monocots. This is the first large-scale analysis of TCP-like genes in Asparagales and will serve as a platform for in-depth functional studies in emerging model monocots. PMID:28144250

  18. At what scale should microarray data be analyzed?

    PubMed

    Huang, Shuguang; Yeo, Adeline A; Gelbert, Lawrence; Lin, Xi; Nisenbaum, Laura; Bemis, Kerry G

    2004-01-01

    The hybridization intensities derived from microarray experiments, for example Affymetrix's MAS5 signals, are very often transformed in one way or another before statistical models are fitted. The motivation for performing transformation is usually to satisfy the model assumptions such as normality and homogeneity in variance. Generally speaking, two types of strategies are often applied to microarray data depending on the analysis need: correlation analysis where all the gene intensities on the array are considered simultaneously, and gene-by-gene ANOVA where each gene is analyzed individually. We investigate the distributional properties of the Affymetrix GeneChip signal data under the two scenarios, focusing on the impact of analyzing the data at an inappropriate scale. The Box-Cox type of transformation is first investigated for the strategy of pooling genes. The commonly used log-transformation is particularly applied for comparison purposes. For the scenario where analysis is on a gene-by-gene basis, the model assumptions such as normality are explored. The impact of using a wrong scale is illustrated by log-transformation and quartic-root transformation. When all the genes on the array are considered together, the dependent relationship between the expression and its variation level can be satisfactorily removed by Box-Cox transformation. When genes are analyzed individually, the distributional properties of the intensities are shown to be gene dependent. Derivation and simulation show that some loss of power is incurred when a wrong scale is used, but due to the robustness of the t-test, the loss is acceptable when the fold-change is not very large.
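
    The transformations compared in this abstract belong to one family, which the following minimal sketch illustrates. It assumes the standard Box-Cox definition, in which the log transformation is the λ = 0 case and a quartic-root transformation corresponds to λ = 0.25 up to a shift and rescaling. The intensity values are hypothetical and only show how a log scale can remove a dependence of the variance on the expression level.

```python
import math


def box_cox(values, lam):
    """Box-Cox power transform of strictly positive values.

    lam == 0 gives the natural log; otherwise (x**lam - 1) / lam.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# Hypothetical replicate intensities: the spread grows with the mean.
low_gene = [90.0, 100.0, 110.0]
high_gene = [900.0, 1000.0, 1100.0]

if __name__ == "__main__":
    for lam in (1.0, 0.25, 0.0):
        ratio = sample_variance(box_cox(high_gene, lam)) / sample_variance(
            box_cox(low_gene, lam)
        )
        print(f"lambda={lam}: variance ratio high/low = {ratio:.3f}")
```

    On the raw scale (λ = 1) the high-intensity gene has a 100-fold larger variance; on the log scale the two variances coincide because the two hypothetical genes differ only by a constant factor.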

  19. Development of a gene cloning system in a fast-growing and moderately thermophilic Streptomyces species and heterologous expression of Streptomyces antibiotic biosynthetic gene clusters

    PubMed Central

    2011-01-01

    Background: Streptomyces species are a major source of antibiotics. They usually grow slowly at their optimal temperature, and fermentation of industrial strains on a large scale often takes a long time, consuming more energy and materials than some other bacterial industrial strains (e.g., E. coli and Bacillus). Most thermophilic Streptomyces species grow fast, but no gene cloning systems have been developed in such strains. Results: We report here the isolation of 41 fast-growing (about twice the rate of S. coelicolor), moderately thermophilic (growing at both 30°C and 50°C) Streptomyces strains, detection of one linear and three circular plasmids in them, and sequencing of a 6996-bp plasmid, pTSC1, from one of them. pTSC1-derived pCWH1 could replicate in both thermophilic and mesophilic Streptomyces strains. On the other hand, several Streptomyces replicons function in thermophilic Streptomyces species. By examining ten well-sporulating strains, we found two promising cloning hosts, 2C and 4F. A gene cloning system was established by using the two strains. The actinorhodin and anthramycin biosynthetic gene clusters from mesophilic S. coelicolor A3(2) and thermophilic S. refuineus were heterologously expressed in one of the hosts. Conclusions: We have developed a gene cloning and expression system in a fast-growing and moderately thermophilic Streptomyces species. Although just a few plasmids and one antibiotic biosynthetic gene cluster from mesophilic Streptomyces were successfully expressed in thermophilic Streptomyces species, we expect that by utilizing thermophilic Streptomyces-specific promoters, more genes and especially antibiotic gene clusters of mesophilic Streptomyces should be heterologously expressed. PMID:22032628

  20. Differential gene expression patterns between smokers and non‐smokers: cause or consequence?

    PubMed Central

    Jansen, Rick; Brooks, Andy; Willemsen, Gonneke; van Grootheest, Gerard; de Geus, Eco; Smit, Jan H.; Penninx, Brenda W.; Boomsma, Dorret I.

    2015-01-01

    The molecular mechanisms causing smoking‐induced health decline are largely unknown. To elucidate the molecular pathways involved in cause and consequences of smoking behavior, we conducted a genome‐wide gene expression study in peripheral blood samples targeting 18 238 genes. Data of 743 smokers, 1686 never smokers and 890 ex‐smokers were available from two population‐based cohorts from the Netherlands. In addition, data of 56 monozygotic twin pairs discordant for ever smoking were used. One hundred thirty‐two genes were differentially expressed between current smokers and never smokers (P < 1.2 × 10^-6, Bonferroni correction). The most significant genes were G protein‐coupled receptor 15 (P < 1 × 10^-150) and leucine‐rich repeat neuronal 3 (P < 1 × 10^-44). The smoking‐related genes were enriched for immune system, blood coagulation, natural killer cell and cancer pathways. By taking the data of ex‐smokers into account, expression of these 132 genes was classified into reversible (94 genes), slowly reversible (31 genes), irreversible (6 genes) or inconclusive (1 gene). Expression of 6 of the 132 genes (three reversible and three slowly reversible) was confirmed to be reactive to smoking as they were differentially expressed in monozygotic pairs discordant for smoking. Cis‐expression quantitative trait loci for GPR56 and RARRES3 (downregulated in smokers) were associated with increased number of cigarettes smoked per day in a large genome‐wide association meta‐analysis, suggesting a causative effect of GPR56 and RARRES3 expression on smoking behavior. In conclusion, differential gene expression patterns in smokers are extensive and cluster in several underlying disease pathways. Gene expression differences seem mainly direct consequences of smoking, and largely reversible after smoking cessation. However, we also identified DNA variants that may influence smoking behavior via the mediating gene expression. PMID:26594007

  1. Differential replication dynamics for large and small Vibrio chromosomes affect gene dosage, expression and location

    PubMed Central

    Dryselius, Rikard; Izutsu, Kaori; Honda, Takeshi; Iida, Tetsuya

    2008-01-01

    Background: Replication of bacterial chromosomes increases copy numbers of genes located near origins of replication relative to genes located near termini. Such differential gene dosage depends on replication rate, doubling time and chromosome size. Although little explored, differential gene dosage may influence both gene expression and location. For vibrios, a diverse family of fast growing gammaproteobacteria, gene dosage may be particularly important as they harbor two chromosomes of different size. Results: Here we examined replication dynamics and gene dosage effects for the separate chromosomes of three Vibrio species. We also investigated locations for specific gene types within the genome. The results showed consistently larger gene dosage differences for the large chromosome which also initiated replication long before the small. Accordingly, large chromosome gene expression levels were generally higher and showed an influence from gene dosage. This was reflected by a higher abundance of growth essential and growth contributing genes of which many locate near the origin of replication. In contrast, small chromosome gene expression levels were low and appeared independent of gene dosage. Also, species specific genes are highly abundant and an over-representation of genes involved in transcription could explain its gene dosage independent expression. Conclusion: Here we establish a link between replication dynamics and differential gene dosage on one hand and gene expression levels and the location of specific gene types on the other. For vibrios, this relationship appears connected to a polarisation of genetic content between its chromosomes, which may both contribute to and be enhanced by an improved adaptive capacity. PMID:19032792

  2. Parallel Mutual Information Based Construction of Genome-Scale Networks on the Intel® Xeon Phi™ Coprocessor.

    PubMed

    Misra, Sanchit; Pamnany, Kiran; Aluru, Srinivas

    2015-01-01

    Construction of whole-genome networks from large-scale gene expression data is an important problem in systems biology. While several techniques have been developed, most cannot handle network reconstruction at the whole-genome scale, and the few that can, require large clusters. In this paper, we present a solution on the Intel Xeon Phi coprocessor, taking advantage of its multi-level parallelism including many x86-based cores, multiple threads per core, and vector processing units. We also present a solution on the Intel® Xeon® processor. Our solution is based on TINGe, a fast parallel network reconstruction technique that uses mutual information and permutation testing for assessing statistical significance. We demonstrate the first ever inference of a plant whole genome regulatory network on a single chip by constructing a 15,575 gene network of the plant Arabidopsis thaliana from 3,137 microarray experiments in only 22 minutes. In addition, our optimization for parallelizing mutual information computation on the Intel Xeon Phi coprocessor holds out lessons that are applicable to other domains.
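
    The general idea of scoring a gene pair by mutual information and assessing it with a permutation test can be sketched as follows. This is an illustration only: a simple equal-width histogram estimator is used for brevity, and it is not claimed to match the estimator, the thresholds or the parallelization used by TINGe. All function names and the toy expression profile are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(values, bins=4):
    """Map values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant profile: everything falls into bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys, bins=4):
    """Histogram estimate of mutual information (in nats) between two profiles."""
    bx, by = discretize(xs, bins), discretize(ys, bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log((c * n) / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def permutation_p_value(xs, ys, n_perm=200, bins=4, seed=0):
    """Fraction of shuffled profiles whose MI reaches the observed MI."""
    rng = random.Random(seed)
    observed = mutual_information(xs, ys, bins)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(xs, shuffled, bins) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical expression profile across 40 experiments.
profile = [float(i) for i in range(40)]

if __name__ == "__main__":
    print(mutual_information(profile, profile))
    print(permutation_p_value(profile, profile))
```

    In a network reconstruction setting, an edge between two genes would be kept only when the permutation p-value falls below a chosen significance threshold; the choice of bins, permutation count and threshold here is arbitrary.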

  3. Effect of storage time on gene expression data acquired from unfrozen archived newborn blood spots.

    PubMed

    Ho, Nhan T; Busik, Julia V; Resau, James H; Paneth, Nigel; Khoo, Sok Kean

    2016-11-01

    Unfrozen archived newborn blood spots (NBS) have been shown to retain sufficient messenger RNA (mRNA) for gene expression profiling. However, the effect of storage time at ambient temperature for NBS samples in relation to the quality of gene expression data is relatively unknown. Here, we evaluated mRNA expression from quantitative real-time PCR (qRT-PCR) and microarray data obtained from NBS samples stored at ambient temperature to determine the effect of storage time on the quality of gene expression. These data were generated in a previous case-control study examining NBS in 53 children with cerebral palsy (CP) and 53 matched controls. NBS sample storage period ranged from 3 to 16 years at ambient temperature. We found persistently low RNA integrity numbers (RIN=2.3±0.71) and 28S/18S rRNA ratios (~0) across NBS samples for all storage periods. In both qRT-PCR and microarray data, the expression of three common housekeeping genes, beta cytoskeletal actin (ACTB), glyceraldehyde 3-phosphate dehydrogenase (GAPDH), and peptidylprolyl isomerase A (PPIA), decreased with increased storage time. Median values of each microarray probe intensity at log2 scale also decreased over time. After eight years of storage, probe intensity values were largely reduced to background intensity levels. Of 21,500 genes tested, 89% significantly decreased in signal intensity, with 13,551, 10,730, and 9,925 genes detected within 5 years, >5 to <10 years, and >10 years of storage, respectively. We also examined the expression of two gender-specific genes (X inactivation-specific transcript, XIST and lysine-specific demethylase 5D, KDM5D) and seven gene sets representing the inflammatory, hypoxic, coagulative, and thyroidal pathways hypothesized to be related to CP risk to determine the effect of storage time on the detection of these biologically relevant genes. We found the gender-specific genes and CP-related gene sets detectable in all storage periods, but they exhibited differential expression (between male vs. female or CP vs. control) only within the first six years of storage. We concluded that gene expression data quality deteriorates in unfrozen archived NBS over time and that differential gene expression profiling and analysis is recommended for those NBS samples collected and stored within six years at ambient temperature. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. DNA-Demethylase Regulated Genes Show Methylation-Independent Spatiotemporal Expression Patterns

    PubMed Central

    Schumann, Ulrike; Lee, Joanne; Kazan, Kemal; Ayliffe, Michael; Wang, Ming-Bo

    2017-01-01

    Recent research has indicated that a subset of defense-related genes is downregulated in the Arabidopsis DNA demethylase triple mutant rdd (ros1 dml2 dml3), resulting in increased susceptibility to the fungal pathogen Fusarium oxysporum. In rdd plants these downregulated genes contain hypermethylated transposable element sequences (TE) in their promoters, suggesting that this methylation represses gene expression in the mutant and that these sequences are actively demethylated in wild-type plants to maintain gene expression. In this study, the tissue-specific and pathogen-inducible expression patterns of rdd-downregulated genes were investigated and the individual role of ROS1, DML2, and DML3 demethylases in these spatiotemporal regulation patterns was determined. Large differences in defense gene expression were observed between pathogen-infected and uninfected tissues and between root and shoot tissues in both WT and rdd plants; however, only subtle changes in promoter TE methylation patterns occurred. Therefore, while TE hypermethylation caused decreased gene expression in rdd plants, it did not dramatically affect spatiotemporal gene regulation, suggesting that this latter regulation is largely methylation independent. Analysis of ros1-3, dml2-1, and dml3-1 single gene mutant lines showed that promoter TE hypermethylation and defense-related gene repression was predominantly, but not exclusively, due to loss of ROS1 activity. These data demonstrate that DNA demethylation of TE sequences, largely by ROS1, promotes defense-related gene expression but does not control spatiotemporal expression in Arabidopsis. Summary: ROS1-mediated DNA demethylation of promoter transposable elements is essential for activation of defense-related gene expression in response to fungal infection in Arabidopsis thaliana. PMID:28894455

  5. Plastid Transcriptomics and Translatomics of Tomato Fruit Development and Chloroplast-to-Chromoplast Differentiation: Chromoplast Gene Expression Largely Serves the Production of a Single Protein

    PubMed Central

    Kahlau, Sabine; Bock, Ralph

    2008-01-01

    Plastid genes are expressed at high levels in photosynthetically active chloroplasts but are generally believed to be drastically downregulated in nongreen plastids. The genome-wide changes in the expression patterns of plastid genes during the development of nongreen plastid types as well as the contributions of transcriptional versus translational regulation are largely unknown. We report here a systematic transcriptomics and translatomics analysis of the tomato (Solanum lycopersicum) plastid genome during fruit development and chloroplast-to-chromoplast conversion. At the level of RNA accumulation, most but not all plastid genes are strongly downregulated in fruits compared with leaves. By contrast, chloroplast-to-chromoplast differentiation during fruit ripening is surprisingly not accompanied by large changes in plastid RNA accumulation. However, most plastid genes are translationally downregulated during chromoplast development. Both transcriptional and translational downregulation are more pronounced for photosynthesis-related genes than for genes involved in gene expression, indicating that some low-level plastid gene expression must be sustained in chromoplasts. High-level expression during chromoplast development identifies accD, the only plastid-encoded gene involved in fatty acid biosynthesis, as the target gene for which gene expression activity in chromoplasts is maintained. In addition, we have determined the developmental patterns of plastid RNA polymerase activities, intron splicing, and RNA editing and report specific developmental changes in the splicing and editing patterns of plastid transcripts. PMID:18441214

  6. Rate of Amino Acid Substitution Is Influenced by the Degree and Conservation of Male-Biased Transcription Over 50 Myr of Drosophila Evolution

    PubMed Central

    Grath, Sonja; Parsch, John

    2012-01-01

    Sex-biased gene expression (i.e., the differential expression of genes between males and females) is common among sexually reproducing species. However, genes often differ in their sex-bias classification or degree of sex bias between species. There is also an unequal distribution of sex-biased genes (especially male-biased genes) between the X chromosome and the autosomes. We used whole-genome expression data and evolutionary rate estimates for two different Drosophilid lineages, melanogaster and obscura, spanning an evolutionary time scale of around 50 Myr to investigate the influence of sex-biased gene expression and chromosomal location on the rate of molecular evolution. In both lineages, the rate of protein evolution correlated positively with the male/female expression ratio. Genes with highly male-biased expression, genes expressed specifically in male reproductive tissues, and genes with conserved male-biased expression over long evolutionary time scales showed the fastest rates of evolution. An analysis of sex-biased gene evolution in both lineages revealed evidence for a “fast-X” effect in which the rate of evolution was greater for X-linked than for autosomal genes. This pattern was particularly pronounced for male-biased genes. Genes located on the obscura “neo-X” chromosome, which originated from a recent X-autosome fusion, showed rates of evolution that were intermediate between genes located on the ancestral X-chromosome and the autosomes. This suggests that the shift to X-linkage led to an increase in the rate of molecular evolution. PMID:22321769

  7. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development.

    PubMed

    Ozerov, Ivan V; Lezhnina, Ksenia V; Izumchenko, Evgeny; Artemov, Artem V; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N; Labat, Ivan; West, Michael D; Buzdin, Anton; Cantor, Charles R; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-11-16

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy.

  8. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development

    PubMed Central

    Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-01-01

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968

  9. Selection-driven evolution of sex-biased genes is consistent with sexual selection in Arabidopsis thaliana.

    PubMed

    Gossmann, Toni I; Schmid, Marc W; Grossniklaus, Ueli; Schmid, Karl J

    2014-03-01

    Sex-biased genes are genes with a preferential or specific expression in one sex and tend to show an accelerated rate of evolution in animals. Various hypotheses--which are not mutually exclusive--have been put forth to explain observed patterns of rapid evolution. One possible explanation is positive selection, but this has been shown only in few animal species and mostly for male-specific genes. Here, we present a large-scale study that investigates evolutionary patterns of sex-biased genes in the predominantly self-fertilizing plant Arabidopsis thaliana. Unlike most animal species, A. thaliana does not possess sex chromosomes, its flowers develop both male and female sexual organs, and it is characterized by low outcrossing rates. Using cell-specific gene expression data, we identified genes whose expression is enriched in comparison with all other tissues in the male and female gametes (sperm, egg, and central cell), as well as in synergids, pollen, and pollen tubes, which also play an important role in reproduction. Genes specifically expressed in gametes and synergids show higher rates of protein evolution compared with the genome-wide average and no evidence for positive selection. In contrast, pollen- and pollen tube-specific genes not only have lower rates of protein evolution but also exhibit a higher proportion of adaptive amino acid substitutions. We show that this is the result of increased levels of purifying and positive selection among genes with pollen- and pollen tube-specific expression. The increased proportion of adaptive substitutions cannot be explained by the fact that pollen- and pollen tube-expressed genes are enriched in segmental duplications, are on average older, or have a larger effective population size. Our observations are consistent with prezygotic sexual selection as a result of interactions during pollination and pollen tube growth such as pollen tube competition.

  10. Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

    PubMed

    Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

    2004-02-01

    To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
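
    The counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative and the figures are copied from the abstract, not from the underlying data.

```python
# Figures copied from the abstract above; names are illustrative only.
contigs = 8503
singletons = 11954
non_redundant = contigs + singletons  # abstract reports 20457

genes_hit = 1093         # predicted genes hit by at least one EST
genes_predicted = 1889   # predicted protein-encoding genes
coverage_pct = round(100 * genes_hit / genes_predicted, 1)  # abstract reports 57.9%

print(non_redundant, coverage_pct)
```

    Both derived values agree with the numbers stated in the abstract.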

  11. Identification of Immunity-Related Genes in Dialeurodes citri against Entomopathogenic Fungus Lecanicillium attenuatum by RNA-Seq Analysis.

    PubMed

    Yu, Shijiang; Ding, Lili; Luo, Ren; Li, Xiaojiao; Yang, Juan; Liu, Haoqiang; Cong, Lin; Ran, Chun

    2016-01-01

    Dialeurodes citri is a major pest in citrus producing areas, and large-scale outbreaks have occurred increasingly often in recent years. Lecanicillium attenuatum is an important entomopathogenic fungus that can parasitize and kill D. citri. We separated the fungus from corpses of D. citri larvae. However, the sound immune defense system of pests makes infection by an entomopathogenic fungus difficult. Here we used RNA sequencing technology (RNA-Seq) to build a transcriptome database for D. citri and performed digital gene expression profiling to screen genes that act in the immune defense of D. citri larvae infected with a pathogenic fungus. De novo assembly generated 84,733 unigenes with mean length of 772 nt. All unigenes were searched against GO, Nr, Swiss-Prot, COG, and KEGG databases and a total of 28,190 (33.3%) unigenes were annotated. We identified 129 immunity-related unigenes in transcriptome database that were related to pattern recognition receptors, information transduction factors and response factors. From the digital gene expression profile, we identified 441 unigenes that were differentially expressed in D. citri infected with L. attenuatum. Through calculated Log2Ratio values, we identified genes for which fold changes in expression were obvious, including cuticle protein, vitellogenin, cathepsin, prophenoloxidase, clip-domain serine protease, lysozyme, and others. Subsequent quantitative real-time polymerase chain reaction analysis verified the results. The identified genes may serve as target genes for microbial control of D. citri.
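
    The Log2Ratio screen mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the normalization, the pseudocount, or the cutoff, so the function name, the pseudocount of 1, and the example values below are all assumptions made for illustration.

```python
import math

def log2_ratio(treated, control, pseudocount=1.0):
    """Log2 fold change between two normalized expression values.

    The pseudocount (an assumption, not taken from the abstract) avoids
    division by zero when a unigene is undetected in one sample.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))

# Hypothetical normalized expression values for a single unigene.
example = log2_ratio(treated=7.0, control=1.0)  # (7 + 1) / (1 + 1) = 4, log2(4) = 2

# Annotation rate quoted in the abstract: 28,190 of 84,733 unigenes.
annotated_pct = round(100 * 28190 / 84733, 1)

print(example, annotated_pct)
```

    A positive value indicates higher expression in the infected sample and a negative value indicates lower expression; any threshold applied to these values would be a separate analysis choice.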

  12. Identification of Immunity-Related Genes in Dialeurodes citri against Entomopathogenic Fungus Lecanicillium attenuatum by RNA-Seq Analysis

    PubMed Central

    Yu, Shijiang; Ding, Lili; Luo, Ren; Li, Xiaojiao; Yang, Juan; Liu, Haoqiang; Cong, Lin; Ran, Chun

    2016-01-01

    Dialeurodes citri is a major pest in citrus producing areas, and large-scale outbreaks have occurred increasingly often in recent years. Lecanicillium attenuatum is an important entomopathogenic fungus that can parasitize and kill D. citri. We separated the fungus from corpses of D. citri larvae. However, the sound immune defense system of pests makes infection by an entomopathogenic fungus difficult. Here we used RNA sequencing technology (RNA-Seq) to build a transcriptome database for D. citri and performed digital gene expression profiling to screen genes that act in the immune defense of D. citri larvae infected with a pathogenic fungus. De novo assembly generated 84,733 unigenes with mean length of 772 nt. All unigenes were searched against GO, Nr, Swiss-Prot, COG, and KEGG databases and a total of 28,190 (33.3%) unigenes were annotated. We identified 129 immunity-related unigenes in transcriptome database that were related to pattern recognition receptors, information transduction factors and response factors. From the digital gene expression profile, we identified 441 unigenes that were differentially expressed in D. citri infected with L. attenuatum. Through calculated Log2Ratio values, we identified genes for which fold changes in expression were obvious, including cuticle protein, vitellogenin, cathepsin, prophenoloxidase, clip-domain serine protease, lysozyme, and others. Subsequent quantitative real-time polymerase chain reaction analysis verified the results. The identified genes may serve as target genes for microbial control of D. citri. PMID:27644092

  13. Phylogenomic detection and functional prediction of genes potentially important for plant meiosis.

    PubMed

    Zhang, Luoyan; Kong, Hongzhi; Ma, Hong; Yang, Ji

    2018-02-15

    Meiosis is a specialized type of cell division necessary for sexual reproduction in eukaryotes. A better understanding of the cytological procedures of meiosis has been achieved by comprehensive cytogenetic studies in plants, while the genetic mechanisms regulating meiotic progression remain incompletely understood. The increasing accumulation of complete genome sequences and large-scale gene expression datasets has provided a powerful resource for phylogenomic inference and unsupervised identification of genes involved in plant meiosis. By integrating sequence homology and expression data, 164, 131, 124 and 162 genes potentially important for meiosis were identified in the genomes of Arabidopsis thaliana, Oryza sativa, Selaginella moellendorffii and Pogonatum aloides, respectively. The predicted genes were assigned to 45 meiotic GO terms, and their functions were related to different processes occurring during meiosis in various organisms. Most of the predicted meiotic genes underwent lineage-specific duplication events during plant evolution, with about 30% of the predicted genes retaining only a single copy in higher plant genomes. The results of this study provided clues to design experiments for better functional characterization of meiotic genes in plants, promoting the phylogenomic approach to the evolutionary dynamics of the plant meiotic machineries.

  14. Transcriptomic analysis of grain amaranth (Amaranthus hypochondriacus) using 454 pyrosequencing: comparison with A. tuberculatus, expression profiling in stems and in response to biotic and abiotic stress

    PubMed Central

    2011-01-01

    Background: Amaranthus hypochondriacus, a grain amaranth, is a C4 plant noted for its ability to tolerate stressful conditions and produce highly nutritious seeds. These possess an optimal amino acid balance and constitute a rich source of health-promoting peptides. Although several recent studies, mostly involving subtractive hybridization strategies, have helped to increase the relatively low number of grain amaranth expressed sequence tags (ESTs), transcriptomic information of this species remains limited, particularly regarding tissue-specific and biotic stress-related genes. Thus, a large-scale transcriptome analysis was performed to generate stem- and (a)biotic stress-responsive gene expression profiles in grain amaranth. Results: A total of 2,700,168 raw reads were obtained from six 454 pyrosequencing runs, which were assembled into 21,207 high quality sequences (20,408 isotigs + 799 contigs). The average sequence length was 1,064 bp and 930 bp for isotigs and contigs, respectively. Only 5,113 singletons were recovered after quality control. Contigs/isotigs were further incorporated into 15,667 isogroups. All unique sequences were queried against the nr, TAIR, UniRef100, UniRef50 and Amaranthaceae EST databases for annotation. Functional GO annotation was performed with all contigs/isotigs that produced significant hits with the TAIR database. Only 8,260 sequences were found to be homologous when the transcriptomes of A. tuberculatus and A. hypochondriacus were compared, most of which were associated with basic house-keeping processes. Digital expression analysis identified 1,971 differentially expressed genes in response to at least one of four stress treatments tested. These included several multiple-stress-inducible genes that could represent potential candidates for use in the engineering of stress-resistant plants. The transcriptomic data generated from pigmented stems shared similarity with findings reported in developing stems of Arabidopsis and black cottonwood (Populus trichocarpa). Conclusions: This study represents the first large-scale transcriptomic analysis of A. hypochondriacus, considered to be a highly nutritious and stress-tolerant crop. Numerous genes were found to be induced in response to (a)biotic stress, many of which could further the understanding of the mechanisms that contribute to multiple stress-resistance in plants, a trait that has potential biotechnological applications in agriculture. PMID:21752295

  15. Large-Scale Gene-Centric Analysis Identifies Novel Variants for Coronary Artery Disease

    PubMed Central

    2011-01-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (e.g., 9p21.3: p<10^−33; LPA: p<10^−19; 1p13.3: p<10^−17) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1: p<5×10^−7). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06–1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes. PMID:21966275

  16. Transcription forms and remodels supercoiling domains unfolding large-scale chromatin structures

    PubMed Central

    Naughton, Catherine; Avlonitis, Nicolaos; Corless, Samuel; Prendergast, James G.; Mati, Ioulia K.; Eijk, Paul P.; Cockroft, Scott L.; Bradley, Mark; Ylstra, Bauke; Gilbert, Nick

    2013-01-01

    DNA supercoiling is an inherent consequence of twisting DNA and is critical for regulating gene expression and DNA replication. However, DNA supercoiling at a genomic scale in human cells is uncharacterized. To map supercoiling we used biotinylated-trimethylpsoralen as a DNA structure probe to show the genome is organized into supercoiling domains. Domains are formed and remodeled by RNA polymerase and topoisomerase activities and are flanked by GC-AT boundaries and CTCF binding sites. Under-wound domains are transcriptionally active, enriched in topoisomerase I, “open” chromatin fibers and DNaseI sites, but are depleted of topoisomerase II. Furthermore DNA supercoiling impacts on additional levels of chromatin compaction as under-wound domains are cytologically decondensed, topologically constrained, and decompacted by transcription of short RNAs. We suggest that supercoiling domains create a topological environment that facilitates gene activation providing an evolutionary purpose for clustering genes along chromosomes. PMID:23416946

  17. A maize database resource that captures tissue-specific and subcellular-localized gene expression, via fluorescent tags and confocal imaging (Maize Cell Genomics Database).

    PubMed

    Krishnakumar, Vivek; Choi, Yongwook; Beck, Erin; Wu, Qingyu; Luo, Anding; Sylvester, Anne; Jackson, David; Chan, Agnes P

    2015-01-01

    Maize is a global crop and a powerful system among grain crops for genetic and genomic studies. However, the development of novel biological tools and resources to aid in the functional identification of gene sequences is greatly needed. Towards this goal, we have developed a collection of maize marker lines for studying native gene expression in specific cell types and subcellular compartments using fluorescent proteins (FPs). To catalog FP expression, we have developed a public repository, the Maize Cell Genomics (MCG) Database, (http://maize.jcvi.org/cellgenomics), to organize a large data set of confocal images generated from the maize marker lines. To date, the collection represents major subcellular structures and also developmentally important progenitor cell populations. The resource is available to the research community, for example to study protein localization or interactions under various experimental conditions or mutant backgrounds. A subset of the marker lines can also be used to induce misexpression of target genes through a transactivation system. For future directions, the image repository can be expanded to accept new image submissions from the research community, and to perform customized large-scale computational image analysis. This community resource will provide a suite of new tools for gaining biological insights by following the dynamics of protein expression at the subcellular, cellular and tissue levels.

  18. A comprehensive analysis on preservation patterns of gene co-expression networks during Alzheimer's disease progression.

    PubMed

    Ray, Sumanta; Hossain, Sk Md Mosaddek; Khatun, Lutfunnesa; Mukhopadhyay, Anirban

    2017-12-20

    Alzheimer's disease (AD) is a chronic neurodegenerative disorder of the brain that involves large-scale transcriptomic variation. The disease does not affect every region of the brain at the same time; instead, it progresses slowly, involving somewhat sequential interaction with different regions. Analysis of the expression patterns of genes in the different brain regions affected in AD should contribute to a better understanding of AD pathogenesis and shed light on the early characterization of the disease. Here, we propose a framework to identify perturbation and preservation characteristics of gene expression patterns across six distinct regions of the brain ("EC", "HIP", "PC", "MTG", "SFG", and "VCX") affected in AD. Co-expression modules were discovered by considering two regions at a time. These were then analyzed for their preservation and perturbation characteristics. Different module preservation statistics and a rank aggregation mechanism were adopted to detect changes in expression patterns across brain regions. Gene ontology (GO) and pathway-based analyses were also carried out to determine the biological meaning of preserved and perturbed modules. In this article, we have extensively studied the preservation patterns of co-expressed modules in six distinct brain regions affected in AD. Some modules emerged as the most preserved, while others were detected as perturbed between a pair of brain regions. Further investigation of the topological properties of preserved and non-preserved modules reveals a substantial association between the "betweenness centrality" and "degree" of the involved genes. Our findings may provide a deeper understanding of the preservation characteristics of gene expression patterns in discrete brain regions affected by AD.

  19. Large scale systematic proteomic quantification from non-metastatic to metastatic colorectal cancer

    NASA Astrophysics Data System (ADS)

    Yin, Xuefei; Zhang, Yang; Guo, Shaowen; Jin, Hong; Wang, Wenhai; Yang, Pengyuan

    2015-07-01

    A large-scale, systematic proteomic quantification of formalin-fixed, paraffin-embedded (FFPE) colorectal cancer tissues from stage I to stage IIIC was performed. By the label-free method, 1017 proteins were identified, 338 of which showed quantitative changes; by the iTRAQ method, 341 of 6294 proteins were quantified with significant expression changes. According to gene ontology (GO) annotation and ingenuity pathway analysis (IPA), expression of migration-related proteins increased and expression of binding- and adhesion-related proteins decreased during colorectal cancer development. We focused on integrin alpha 5 (ITA5) of the integrin family, which was consistent with the metastasis-related pathway. The expression level of ITA5 decreased in metastatic tissues, and this result was further verified by Western blotting. Two other cell migration-related proteins, vitronectin (VTN) and actin-related protein (ARP3), were shown to be up-regulated by both mass spectrometry (MS)-based quantification and Western blotting. To date, our results represent one of the largest datasets in colorectal cancer proteomics research. Our strategy reveals a disease-driven omics pattern for metastatic colorectal cancer.

  20. Derivation of large-scale cellular regulatory networks from biological time series data.

    PubMed

    de Bivort, Benjamin L

    2010-01-01

    Pharmacological agents and other perturbants of cellular homeostasis appear to nearly universally affect the activity of many genes, proteins, and signaling pathways. While this is due in part to nonspecificity of action of the drug or cellular stress, the large-scale self-regulatory behavior of the cell may also be responsible, as this typically means that when a cell switches states, dozens or hundreds of genes will respond in concert. If many genes act collectively in the cell during state transitions, rather than every gene acting independently, models of the cell can be created that are comprehensive of the action of all genes, using existing data, provided that the functional units in the model are collections of genes. Techniques to develop these large-scale cellular-level models are provided in detail, along with methods of analyzing them, and a brief summary of major conclusions about large-scale cellular networks to date.

  1. Lineage-Specific Evolutionary Histories and Regulation of Major Starch Metabolism Genes during Banana Ripening

    PubMed Central

    Jourda, Cyril; Cardi, Céline; Gibert, Olivier; Giraldo Toro, Andrès; Ricci, Julien; Mbéguié-A-Mbéguié, Didier; Yahiaoui, Nabila

    2016-01-01

    Starch is the most widespread and abundant storage carbohydrate in plants. It is also a major feature of cultivated bananas as it accumulates to large amounts during banana fruit development before almost complete conversion to soluble sugars during ripening. Little is known about the structure of major gene families involved in banana starch metabolism and their evolution compared to other species. To identify genes involved in banana starch metabolism and investigate their evolutionary history, we analyzed six gene families playing a crucial role in plant starch biosynthesis and degradation: the ADP-glucose pyrophosphorylases (AGPases), starch synthases (SS), starch branching enzymes (SBE), debranching enzymes (DBE), α-amylases (AMY) and β-amylases (BAM). Using comparative genomics and phylogenetic approaches, these genes were classified into families and sub-families and orthology relationships with functional genes in Eudicots and in grasses were identified. In addition to known ancestral duplications shaping starch metabolism gene families, independent evolution in banana and grasses also occurred through lineage-specific whole genome duplications for specific sub-families of AGPase, SS, SBE, and BAM genes; and through gene-scale duplications for AMY genes. In particular, banana lineage duplications yielded a set of AGPase, SBE and BAM genes that were highly or specifically expressed in banana fruits. Gene expression analysis highlighted a complex transcriptional reprogramming of starch metabolism genes during ripening of banana fruits. A differential regulation of expression between banana gene duplicates was identified for SBE and BAM genes, suggesting that part of starch metabolism regulation in the fruit evolved in the banana lineage. PMID:27994606

  2. Authentic Research Experience and “Big Data” Analysis in the Classroom: Maize Response to Abiotic Stress

    PubMed Central

    Makarevitch, Irina; Frechette, Cameo; Wiatros, Natalia

    2015-01-01

    Integration of inquiry-based approaches into curriculum is transforming the way science is taught and studied in undergraduate classrooms. Incorporating quantitative reasoning and mathematical skills into authentic biology undergraduate research projects has been shown to benefit students in developing various skills necessary for future scientists and to attract students to science, technology, engineering, and mathematics disciplines. While large-scale data analysis became an essential part of modern biological research, students have few opportunities to engage in analysis of large biological data sets. RNA-seq analysis, a tool that allows precise measurement of the level of gene expression for all genes in a genome, revolutionized molecular biology and provides ample opportunities for engaging students in authentic research. We developed, implemented, and assessed a series of authentic research laboratory exercises incorporating a large data RNA-seq analysis into an introductory undergraduate classroom. Our laboratory series is focused on analyzing gene expression changes in response to abiotic stress in maize seedlings; however, it could be easily adapted to the analysis of any other biological system with available RNA-seq data. Objective and subjective assessment of student learning demonstrated gains in understanding important biological concepts and in skills related to the process of science. PMID:26163561

  3. Major recent and independent changes in levels and patterns of expression have occurred at the b gene, a regulatory locus in maize.

    PubMed

    Selinger, D A; Chandler, V L

    1999-12-21

    The b locus encodes a transcription factor that regulates the expression of genes that produce purple anthocyanin pigment. Different b alleles are expressed in distinct tissues, causing tissue-specific anthocyanin production. Understanding how phenotypic diversity is produced and maintained at the b locus should provide models for how other regulatory genes, including those that influence morphological traits and development, evolve. We have investigated how different levels and patterns of pigmentation have evolved by determining the phenotypic and evolutionary relationships between 18 alleles that represent the diversity of b alleles in Zea mays. Although most of these alleles have few phenotypic differences, five alleles have very distinct tissue-specific patterns of pigmentation. Superimposing the phenotypes on the molecular phylogeny reveals that the alleles with strong and distinctive patterns of expression are closely related to alleles with weak expression, implying that the distinctive patterns have arisen recently. We have identified apparent insertions in three of the five phenotypically distinct alleles, and the fourth has unique upstream restriction fragment length polymorphisms relative to closely related alleles. The insertion in B-Peru has been shown to be responsible for its unique expression and, in the other two alleles, the presence of the insertion correlates with the phenotype. These results suggest that major changes in gene expression are probably the result of large-scale changes in DNA sequence and/or structure most likely mediated by transposable elements.

  4. Evaluation of two outlier-detection-based methods for detecting tissue-selective genes from microarray data.

    PubMed

    Kadota, Koji; Konishi, Tomokazu; Shimizu, Kentaro

    2007-05-01

    Large-scale expression profiling using DNA microarrays enables identification of tissue-selective genes for which expression is considerably higher and/or lower in some tissues than in others. Among numerous possible methods, only two outlier-detection-based methods (an AIC-based method and Sprent's non-parametric method) can treat equally various types of selective patterns, but they produce substantially different results. We investigated the performance of these two methods for different parameter settings and for a reduced number of samples. We focused on their ability to detect selective expression patterns robustly. We applied them to public microarray data collected from 36 normal human tissue samples and analyzed the effects of both changing the parameter settings and reducing the number of samples. The AIC-based method was more robust in both cases. The findings confirm that the use of the AIC-based method in the recently proposed ROKU method for detecting tissue-selective expression patterns is correct and that Sprent's method is not suitable for ROKU.
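
    The outlier-detection idea behind tissue-selective gene calling can be pictured with a median/MAD rule, sketched below in Python. This is an illustrative assumption rather than the procedure evaluated in the paper: the function name `mad_outliers`, the cutoff of 5, and the toy expression vector are hypothetical, and the AIC-based method is not shown.

```python
import statistics


def mad_outliers(values, threshold=5.0):
    """Flag values whose distance from the median exceeds threshold * MAD.

    The threshold of 5 is an assumed default, not a value from the abstract.
    """
    med = statistics.median(values)
    # Median absolute deviation: a robust estimate of spread.
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: nothing can be called an outlier.
        return [False] * len(values)
    return [abs(v - med) / mad > threshold for v in values]


# Hypothetical log-expression of one gene across eight tissues.
expression = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 9.7, 5.1]
flags = mad_outliers(expression)
selective_tissues = [i for i, flagged in enumerate(flags) if flagged]
print(selective_tissues)
```

    A rule of this kind treats over- and under-expression symmetrically, which is why a single test can cover several types of selective pattern; its results depend on the chosen cutoff.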

  5. A compact, in vivo screen of all 6-mers reveals drivers of tissue-specific expression and guides synthetic regulatory element design.

    PubMed

    Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav

    2013-07-18

    Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
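
    As a rough illustration of why a de Bruijn construction gives a compact library, the Python sketch below builds a classical cyclic de Bruijn sequence in which every 6-mer over A, C, G, T occurs exactly once. It is a simplification under stated assumptions: it is not reverse-complement aware and does not split the sequence into oligomers, so it is not the authors' algorithm, and the function name `de_bruijn` is hypothetical.

```python
def de_bruijn(alphabet, k):
    """Return a cyclic de Bruijn sequence over `alphabet` of order k.

    Read cyclically, the result contains every k-mer exactly once.
    Built by concatenating Lyndon words (the standard recursive construction).
    """
    n = len(alphabet)
    a = [0] * (k + 1)
    sequence = []

    def db(t, p):
        if t > k:
            if k % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


k = 6
seq = de_bruijn("ACGT", k)
# Unroll the cycle so that k-mers spanning the wrap-around are visible.
linear = seq + seq[:k - 1]
kmers = {linear[i:i + k] for i in range(len(seq))}
print(len(seq), len(kmers))
```

    Because a strand and its reverse complement are tested together in a double-stranded construct, a reverse-complement aware design such as the one described above can be shorter than this plain sequence.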

  6. Effect of the difference in vehicles on gene expression in the rat liver--analysis of the control data in the Toxicogenomics Project Database.

    PubMed

    Takashima, Kayoko; Mizukawa, Yumiko; Morishita, Katsumi; Okuyama, Manabu; Kasahara, Toshihiko; Toritsuka, Naoki; Miyagishima, Toshikazu; Nagao, Taku; Urushidani, Tetsuro

    2006-05-08

    The Toxicogenomics Project is a 5-year collaborative project launched in 2002 by the Japanese government and pharmaceutical companies. Its aim is to construct a large-scale toxicology database of 150 compounds orally administered to rats. The test consists of a single administration test (3, 6, 9 and 24 h) and a repeated administration test (3, 7, 14 and 28 days), and the conventional toxicology data are being accumulated together with gene expression data in liver analyzed using the Affymetrix GeneChip. In the project, either methylcellulose or corn oil is employed as vehicle. We examined whether the vehicle itself affects the analysis of gene expression and found that corn oil alone affected food consumption and biochemical parameters mainly related to lipid metabolism, and this was accompanied by typical changes in gene expression. Most of the genes modulated by corn oil were related to cholesterol or fatty acid metabolism (e.g., CYP7A1, CYP8B1, 3-hydroxy-3-methylglutaryl-Coenzyme A reductase, squalene epoxidase, angiopoietin-like protein 4, fatty acid synthase, fatty acid binding proteins), suggesting that the response was a physiological one to the oil intake. Many of the lipid-related genes showed circadian rhythm within a day, but the expression pattern of general clock genes (e.g., period 2, arylhydrocarbon nuclear receptor translocator-like, D site albumin promoter binding protein) was unaffected by corn oil, suggesting that the effects are specific to lipid metabolism. These results should be useful to users of the database, especially when drugs with different vehicle controls are compared.

  7. Early and long-standing rheumatoid arthritis: distinct molecular signatures identified by gene-expression profiling in synovia

    PubMed Central

    Lequerré, Thierry; Bansard, Carine; Vittecoq, Olivier; Derambure, Céline; Hiron, Martine; Daveau, Maryvonne; Tron, François; Ayral, Xavier; Biga, Norman; Auquit-Auckbur, Isabelle; Chiocchia, Gilles; Le Loët, Xavier; Salier, Jean-Philippe

    2009-01-01

    Introduction Rheumatoid arthritis (RA) is a heterogeneous disease and its underlying molecular mechanisms are still poorly understood. Because previous microarray studies have only focused on long-standing (LS) RA compared to osteoarthritis, we aimed to compare the molecular profiles of early and LS RA versus control synovia. Methods Synovial biopsies were obtained by arthroscopy from 15 patients (4 early untreated RA, 4 treated LS RA and 7 controls, who had traumatic or mechanical lesions). Extracted mRNAs were used for large-scale gene-expression profiling. The different gene-expression combinations identified by comparison of profiles of early, LS RA and healthy synovia were linked to the biological processes involved in each situation. Results Three combinations of 719, 116 and 52 transcripts discriminated, respectively, early from LS RA, and early or LS RA from healthy synovia. We identified several gene clusters and distinct molecular signatures specifically expressed during early or LS RA, thereby suggesting the involvement of different pathophysiological mechanisms during the course of RA. Conclusions Early and LS RA have distinct molecular signatures with different biological processes participating at different times during the course of the disease. These results suggest that better knowledge of the main biological processes involved at a given RA stage might help to choose the most appropriate treatment. PMID:19563633

  8. Down-regulation of transmembrane carbonic anhydrases in renal cell carcinoma cell lines by wild-type von Hippel-Lindau transgenes

    PubMed Central

    Ivanov, Sergey V.; Kuzmin, Igor; Wei, Ming-Hui; Pack, Svetlana; Geil, Laura; Johnson, Bruce E.; Stanbridge, Eric J.; Lerman, Michael I.

    1998-01-01

    To discover genes involved in von Hippel-Lindau (VHL)-mediated carcinogenesis, we used renal cell carcinoma cell lines stably transfected with wild-type VHL-expressing transgenes. Large-scale RNA differential display technology applied to these cell lines identified several differentially expressed genes, including an alpha carbonic anhydrase gene, termed CA12. The deduced protein sequence was classified as a one-pass transmembrane CA possessing an apparently intact catalytic domain in the extracellular CA module. Reintroduced wild-type VHL strongly inhibited the overexpression of the CA12 gene in the parental renal cell carcinoma cell lines. Similar results were obtained with CA9, encoding another transmembrane CA with an intact catalytic domain. Although both domains of the VHL protein contribute to regulation of CA12 expression, the elongin binding domain alone could effectively regulate CA9 expression. We mapped CA12 and CA9 loci to chromosome bands 15q22 and 17q21.2 respectively, regions prone to amplification in some human cancers. Additional experiments are needed to define the role of CA IX and CA XII enzymes in the regulation of pH in the extracellular microenvironment and its potential impact on cancer cell growth. PMID:9770531

  9. Seasonal and latitudinal acclimatization of cardiac transcriptome responses to thermal stress in porcelain crabs, Petrolisthes cinctipes.

    PubMed

    Stillman, Jonathon H; Tagmount, Abderrahmane

    2009-10-01

    Central predictions of climate warming models include increased climate variability and increased severity of heat waves. Physiological acclimatization in populations across large-scale ecological gradients in habitat temperature fluctuation is an important factor to consider in detecting responses to climate change related increases in thermal fluctuation. We measured in vivo cardiac thermal maxima and used microarrays to profile transcriptome heat and cold stress responses in cardiac tissue of intertidal zone porcelain crabs across biogeographic and seasonal gradients in habitat temperature fluctuation. We observed acclimatization dependent induction of heat shock proteins, as well as unknown genes with heat shock protein-like expression profiles. Thermal acclimatization had the largest effect on heat stress responses of extensin-like, beta tubulin, and unknown genes. For these genes, crabs acclimatized to thermally variable sites had higher constitutive expression than specimens from low variability sites, but heat stress dramatically induced expression in specimens from low variability sites and repressed expression in specimens from highly variable sites. Our application of ecological transcriptomics has yielded new biomarkers that may represent sensitive indicators of acclimatization to habitat temperature fluctuation. Our study also has identified novel genes whose further description may yield novel understanding of cellular responses to thermal acclimatization or thermal stress.

  10. Piggy: a rapid, large-scale pan-genome analysis tool for intergenic regions in bacteria.

    PubMed

    Thorpe, Harry A; Bayliss, Sion C; Sheppard, Samuel K; Feil, Edward J

    2018-04-01

    The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences. To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli. For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).

  11. Single cell Hi-C reveals cell-to-cell variability in chromosome structure

    PubMed Central

    Schoenfelder, Stefan; Yaffe, Eitan; Dean, Wendy; Laue, Ernest D.; Tanay, Amos; Fraser, Peter

    2013-01-01

    Large-scale chromosome structure and spatial nuclear arrangement have been linked to control of gene expression and DNA replication and repair. Genomic techniques based on chromosome conformation capture assess contacts for millions of loci simultaneously, but do so by averaging chromosome conformations from millions of nuclei. Here we introduce single cell Hi-C, combined with genome-wide statistical analysis and structural modeling of single copy X chromosomes, to show that individual chromosomes maintain domain organisation at the megabase scale, but show variable cell-to-cell chromosome territory structures at larger scales. Despite this structural stochasticity, localisation of active gene domains to boundaries of territories is a hallmark of chromosomal conformation. Single cell Hi-C data bridge current gaps between genomics and microscopy studies of chromosomes, demonstrating how modular organisation underlies dynamic chromosome structure, and how this structure is probabilistically linked with genome activity patterns. PMID:24067610

  12. Genome-wide methylation and gene expression changes in newborn rats following maternal protein restriction and reversal by folic acid.

    PubMed

    Altobelli, Gioia; Bogdarina, Irina G; Stupka, Elia; Clark, Adrian J L; Langley-Evans, Simon

    2013-01-01

    A large body of evidence from human and animal studies demonstrates that the maternal diet during pregnancy can programme physiological and metabolic functions in the developing fetus, effectively determining susceptibility to later disease. The mechanistic basis of such programming is unclear but may involve resetting of epigenetic marks and fetal gene expression. The aim of this study was to evaluate genome-wide DNA methylation and gene expression in the livers of newborn rats exposed to maternal protein restriction. On day one postnatally, there were 618 differentially expressed genes and 1183 differentially methylated regions (FDR 5%). The functional analysis of differentially expressed genes indicated a significant effect on DNA repair/cycle/maintenance functions and of lipid, amino acid metabolism and circadian functions. Enrichment for known biological functions was found to be associated with differentially methylated regions. Moreover, these epigenetically altered regions overlapped genetic loci associated with metabolic and cardiovascular diseases. Both expression changes and DNA methylation changes were largely reversed by supplementing the protein restricted diet with folic acid. Although the epigenetic and gene expression signatures appeared to underpin largely different biological processes, the gene expression profile of DNA methyl transferases was altered, providing a potential link between the two molecular signatures. The data showed that maternal protein restriction is associated with widespread differential gene expression and DNA methylation across the genome, and that folic acid is able to reset both molecular signatures.

  13. The Human EST Ontology Explorer: a tissue-oriented visualization system for ontologies distribution in human EST collections.

    PubMed

    Merelli, Ivan; Caprera, Andrea; Stella, Alessandra; Del Corvo, Marcello; Milanesi, Luciano; Lazzari, Barbara

    2009-10-15

    The NCBI dbEST currently contains more than eight million human Expressed Sequence Tags (ESTs). This wide collection represents an important source of information for gene expression studies, provided it can be inspected according to biologically relevant criteria. EST data can be browsed using different dedicated web resources, which allow investigation of library-specific gene expression levels and comparisons among libraries, highlighting significant differences in gene expression. Nonetheless, no tool is available to examine distributions of quantitative EST collections in Gene Ontology (GO) categories, nor to retrieve information concerning library-dependent EST involvement in metabolic pathways. In this work we present the Human EST Ontology Explorer (HEOE) http://www.itb.cnr.it/ptp/human_est_explorer, a web facility for comparison of expression levels among libraries from several healthy and diseased tissues. The HEOE provides library-dependent statistics on the distribution of sequences in the GO Directed Acyclic Graph (DAG) that can be browsed at each GO hierarchical level. The tool is based on large-scale BLAST annotation of EST sequences. Due to the huge number of input sequences, this BLAST analysis was performed with the aid of grid computing technology, which is particularly suitable for data-parallel tasks. Relying on the achieved annotation, library-specific distributions of ESTs in the GO Graph were inferred. A pathway-based search interface was also implemented, for a quick evaluation of the representation of libraries in metabolic pathways. EST processing steps were integrated in a semi-automatic procedure that relies on Perl scripts and stores results in a MySQL database. A PHP-based web interface offers the possibility to simultaneously visualize, retrieve and compare data from the different libraries. Statistically significant differences in GO categories among user selected libraries can also be computed.
    The HEOE provides an alternative and complementary way to inspect EST expression levels with respect to approaches currently offered by other resources. Furthermore, BLAST computation on the whole human EST dataset was a suitable test of grid scalability in the context of large-scale bioinformatics analysis. The HEOE currently comprises sequence analysis from 70 non-normalized libraries, representing a comprehensive overview on healthy and unhealthy tissues. As the analysis procedure can be easily applied to other libraries, the number of represented tissues is intended to increase.

  14. Synthetic Gene Network with Positive Feedback Loop Amplifies Cellulase Gene Expression in Neurospora crassa.

    PubMed

    Matsu-Ura, Toru; Dovzhenok, Andrey A; Coradetti, Samuel T; Subramanian, Krithika R; Meyer, Daniel R; Kwon, Jaesang J; Kim, Caleb; Salomonis, Nathan; Glass, N Louise; Lim, Sookkyung; Hong, Christian I

    2018-05-18

    Second-generation or lignocellulosic biofuels are a tangible source of renewable energy, which is critical to combat climate change by reducing the carbon footprint. Filamentous fungi secrete cellulose-degrading enzymes called cellulases, which are used for production of lignocellulosic biofuels. However, inefficient production of cellulases is a major obstacle for industrial-scale production of second-generation biofuels. We used computational simulations to design and implement synthetic positive feedback loops to increase gene expression of a key transcription factor, CLR-2, that activates a large number of cellulases in a filamentous fungus, Neurospora crassa. Overexpression of CLR-2 reveals previously unappreciated roles of CLR-2 in lignocellulosic gene network, which enabled simultaneous induction of approximately 50% of 78 lignocellulosic degradation-related genes in our engineered Neurospora strains. This engineering results in dramatically increased cellulase activity due to cooperative orchestration of multiple enzymes involved in the cellulose degradation pathway. Our work provides a proof of principle in utilizing mathematical modeling and synthetic biology to improve the efficiency of cellulase synthesis for second-generation biofuel production.

  15. Using scale and feather traits for module construction provides a functional approach to chicken epidermal development.

    PubMed

    Bao, Weier; Greenwold, Matthew J; Sawyer, Roger H

    2017-11-01

    Gene co-expression network analysis has been a research method widely used in systematically exploring gene function and interaction. Using the Weighted Gene Co-expression Network Analysis (WGCNA) approach to construct a gene co-expression network using data from a customized 44K microarray transcriptome of chicken epidermal embryogenesis, we have identified two distinct modules that are highly correlated with scale or feather development traits. Signaling pathways related to feather development were enriched in the traditional KEGG pathway analysis and functional terms relating specifically to embryonic epidermal development were also enriched in the Gene Ontology analysis. Significant enrichment annotations were discovered from customized enrichment tools such as Modular Single-Set Enrichment Test (MSET) and Medical Subject Headings (MeSH). Hub genes in both trait-correlated modules showed strong specific functional enrichment toward epidermal development. Also, regulatory elements, such as transcription factors and miRNAs, were targeted in the significant enrichment result. This work highlights the advantage of this methodology for functional prediction of genes not previously associated with scale- and feather trait-related modules.

  16. Multiplex titration RT-PCR: rapid determination of gene expression patterns for a large number of genes

    NASA Technical Reports Server (NTRS)

    Nebenfuhr, A.; Lomax, T. L.

    1998-01-01

    We have developed an improved method for determination of gene expression levels with RT-PCR. The procedure is rapid and does not require extensive optimization or densitometric analysis. Since the detection of individual transcripts is PCR-based, small amounts of tissue samples are sufficient for the analysis of expression patterns in large gene families. Using this method, we were able to rapidly screen nine members of the Aux/IAA family of auxin-responsive genes and identify those genes which vary in message abundance in a tissue- and light-specific manner. While not offering the accuracy of conventional semi-quantitative or competitive RT-PCR, our method allows quick screening of large numbers of genes in a wide range of RNA samples with just a thermal cycler and standard gel analysis equipment.

  17. A proposed metric for assessing the measurement quality of individual microarrays

    PubMed Central

    Kim, Kyoungmi; Page, Grier P; Beasley, T Mark; Barnes, Stephen; Scheirer, Katherine E; Allison, David B

    2006-01-01

    Background High-density microarray technology is increasingly applied to study gene expression levels on a large scale. Microarray experiments rely on several critical steps that may introduce error and uncertainty in analyses. These steps include mRNA sample extraction, amplification and labeling, hybridization, and scanning. In some cases this may be manifested as systematic spatial variation on the surface of microarray in which expression measurements within an individual array may vary as a function of geographic position on the array surface. Results We hypothesized that an index of the degree of spatiality of gene expression measurements associated with their physical geographic locations on an array could indicate the summary of the physical reliability of the microarray. We introduced a novel way to formulate this index using a statistical analysis tool. Our approach regressed gene expression intensity measurements on a polynomial response surface of the microarray's Cartesian coordinates. We demonstrated this method using a fixed model and presented results from real and simulated datasets. Conclusion We demonstrated the potential of such a quantitative metric for assessing the reliability of individual arrays. Moreover, we showed that this procedure can be incorporated into laboratory practice as a means to set quality control specifications and as a tool to determine whether an array has sufficient quality to be retained in terms of spatial correlation of gene expression measurements. PMID:16430768
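
    The regression step described above can be sketched as follows in Python with NumPy. The sketch assumes a second-order polynomial surface and uses R² as the spatial index; the polynomial degree, the function name `spatial_r2`, and the simulated 20 x 20 arrays are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def spatial_r2(x, y, intensity):
    """R^2 of a quadratic response surface fitted to intensity over (x, y).

    A high value means spot position explains much of the intensity
    variation, hinting at a spatial artifact on the array.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(intensity, dtype=float)
    # Design matrix for a second-order polynomial in the spot coordinates.
    design = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    residual = z - design @ coef
    total = z - z.mean()
    return 1.0 - (residual @ residual) / (total @ total)


# Simulated arrays: one with pure noise, one with a left-to-right gradient.
rng = np.random.default_rng(0)
rows, cols = np.meshgrid(np.arange(20), np.arange(20), indexing="ij")
x, y = cols.ravel(), rows.ravel()
noise = rng.normal(0.0, 1.0, size=x.size)
clean_array = 8.0 + noise
gradient_array = 8.0 + 0.3 * x + noise

r2_clean = spatial_r2(x, y, clean_array)
r2_gradient = spatial_r2(x, y, gradient_array)
print(round(r2_clean, 3), round(r2_gradient, 3))
```

    In a quality-control setting, an array would be flagged when its index exceeds a laboratory-chosen specification; the appropriate cutoff is not implied by this sketch.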

  18. Patterns Cancer Prevention Through Induction of Phase 2 Enzymes

    DTIC Science & Technology

    2003-04-01

    2) enzymes. During our Phase I Award, we identified sulforaphane as the most potent inducer of carcinogen defenses in the prostate cell. We have...characterized global effects of sulforaphane in prostate cancer cell lines using cDNA microarray technology that allows large-scale determination of...changes in gene expression. These findings argue strongly for a preventive intervention trial involving with sulforaphane. During our Phase 2 Award, we used

  19. Combined strategies for improving expression of Citrobacter amalonaticus phytase in Pichia pastoris.

    PubMed

    Li, Cheng; Lin, Ying; Zheng, Xueyun; Pang, Nuo; Liao, Xihao; Liu, Xiaoxiao; Huang, Yuanyuan; Liang, Shuli

    2015-09-26

    Phytase is used as an animal feed additive that degrades phytic acid and reduces feeding costs and pollution caused by fecal excretion of phosphorus. Some phytases have been expressed in Pichia pastoris, among which the phytase from Citrobacter amalonaticus CGMCC 1696 had high specific activity (3548 U/mg). Improving the phytase expression level will facilitate its industrial applications. To improve phytase expression, we modified P AOX1 and the α-factor signal peptide, increased the gene copy number, and overexpressed HAC1 (i) to enhance folding and secretion of the protein in the endoplasmic reticulum. Genetic stability was assessed and fermentation was performed in a 10-L scaled-up fed-batch fermenter to prepare for industrial production. The phytase gene from C. amalonaticus CGMCC 1696 was cloned under the control of the AOX1 promoter (P AOX1) and expressed in P. pastoris. The phytase activity achieved was 414 U/mL. Modifications of P AOX1 and the α-factor signal peptide increased the phytase yield by 35 and 12%, respectively. Next, on increasing the copy number of the Phy gene to six, the phytase yield was 141% higher than in the strain containing only a single gene copy. Furthermore, on overexpression of HAC1 (i) (i indicating induced), a gene encoding Hac1p that regulates the unfolded protein response, the phytase yield achieved was 0.75 g/L with an activity of 2119 U/mL, 412% higher than for the original strain. The plasmids in this high-phytase expression strain were stable during incubation at 30 °C in Yeast Extract Peptone Dextrose (YPD) Medium. In a 10-L scaled-up fed-batch fermenter, the phytase yield achieved was 9.58 g/L with an activity of 35,032 U/mL. The production of a secreted protein will reach its limit at a specific gene copy number where further increases in transcription and translation due to the higher abundance of gene copies will not enhance the secretion process any further.
    Enhancement of protein folding in the ER can alleviate bottlenecks in the folding and secretion pathways during the overexpression of heterologous proteins in P. pastoris. Using modification of P AOX1 and the α-factor signal peptide, increasing the gene copy number, and overexpressing HAC1 (i) to enhance folding and secretion of the protein in the endoplasmic reticulum, we have successfully increased the phytase yield 412% relative to the original strain. In a 10-L fed-batch fermenter, the phytase yield achieved was 9.58 g/L with an activity of 35,032 U/mL. Large-scale production of phytase can be applied towards different biocatalytic and feed additive applications.

  20. Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis.

    PubMed

    Harvey, Benjamin Simeon; Ji, Soo-Yeon

    2017-01-01

    As microarray data available to scientists continues to increase in size and complexity, it has become overwhelmingly important to find multiple ways to bring forth oncological inference to the bioinformatics community through the analysis of large-scale cancer genomic (LSCG) DNA and mRNA microarray data that is useful to scientists. Though there have been many attempts to elucidate the issue of bringing forth biological interpretation by means of wavelet preprocessing and classification, there has not been a research effort that focuses on a cloud-scale distributed parallel (CSDP) separable 1-D wavelet decomposition technique for denoising through differential expression thresholding and classification of LSCG microarray data. This research presents a novel methodology that utilizes a CSDP separable 1-D method for wavelet-based transformation in order to initialize a threshold which will retain significantly expressed genes through the denoising process for robust classification of cancer patients. Additionally, the overall study was implemented and encompassed within CSDP environment. The utilization of cloud computing and wavelet-based thresholding for denoising was used for the classification of samples within the Global Cancer Map, Cancer Cell Line Encyclopedia, and The Cancer Genome Atlas. The results proved that separable 1-D parallel distributed wavelet denoising in the cloud and differential expression thresholding increased the computational performance and enabled the generation of higher quality LSCG microarray datasets, which led to more accurate classification results.
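
    To make the idea of wavelet-based thresholding concrete, the Python sketch below applies a single-level Haar transform to a 1-D expression profile, soft-thresholds the detail coefficients, and inverts the transform. The choice of the Haar wavelet, soft thresholding, one decomposition level, and the function name `haar_denoise` are all assumptions for illustration; the cloud-scale distributed implementation described above is not reproduced.

```python
import numpy as np


def haar_denoise(signal, threshold):
    """Single-level Haar wavelet denoising with soft thresholding."""
    s = np.asarray(signal, dtype=float)
    if s.size % 2:
        raise ValueError("signal length must be even")
    root2 = np.sqrt(2.0)
    # Forward transform: pairwise averages (approx) and differences (detail).
    approx = (s[0::2] + s[1::2]) / root2
    detail = (s[0::2] - s[1::2]) / root2
    # Soft thresholding: shrink detail coefficients toward zero.
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    # Inverse transform.
    out = np.empty_like(s)
    out[0::2] = (approx + detail) / root2
    out[1::2] = (approx - detail) / root2
    return out


profile = np.array([1.0, 3.0, 5.0, 5.0])
unchanged = haar_denoise(profile, threshold=0.0)
smoothed = haar_denoise(profile, threshold=10.0)
print(unchanged, smoothed)
```

    With a zero threshold the signal is reconstructed exactly, while a very large threshold removes all detail and leaves pairwise means; a practical threshold would sit between these extremes and would have to be chosen from the data.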

  1. Differential gene expression in Schistosoma japonicum schistosomula from Wistar rats and BALB/c mice

    PubMed Central

    2011-01-01

    Background More than 46 species of mammals can be naturally infected with Schistosoma japonicum in the mainland of China. Mice are permissive and may act as the definitive host of the life cycle. In contrast, rats are less susceptible to S. japonicum infection, and are considered to provide an unsuitable micro-environment for parasite growth and development. Since little is known of what effects this micro-environment has on the parasite itself, we have in the present study utilised a S. japonicum oligonucleotide microarray to compare the gene expression differences of 10-day-old schistosomula maintained in Wistar rats with those maintained in BALB/c mice. Results In total 3,468 schistosome genes were found to be differentially expressed, of which the majority (3,335) were down-regulated (≤ 2 fold) and 133 were up-regulated (≥ 2 fold) in schistosomula from Wistar rats compared with those from BALB/c mice. Gene ontology (GO) analysis revealed that of the differentially expressed genes with already established functions or close homology to well characterized genes in other organisms, many are related to important biological functions or molecular processes. Among the genes that were down-regulated in schistosomula from Wistar rats, some were associated with metabolism, signal transduction and development. Of these genes related to metabolic processes, areas including translation, protein and amino acid phosphorylation, proteolysis, oxidoreductase activities, catalytic activities and hydrolase activities, were represented. KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis of differentially expressed genes indicated that of the 328 genes that had a specific KEGG pathway annotation, 324 were down-regulated and were mainly associated with metabolism, growth, redox pathway, oxidative phosphorylation, the cell cycle, ubiquitin-mediated proteolysis, protein export and the MAPK (mitogen-activated protein kinases) signaling pathway.
    Conclusions This work presents the first large scale gene expression study identifying the differences between schistosomula maintained in mice and those maintained in rats, and specifically highlights differential expression that may impact on the survival and development of the parasite within the definitive host. The research presented here provides valuable information for the better understanding of schistosome development and host-parasite interactions. PMID:21819550

  2. MEXPRESS: visualizing expression, DNA methylation and clinical TCGA data.

    PubMed

    Koch, Alexander; De Meyer, Tim; Jeschke, Jana; Van Criekinge, Wim

    2015-08-26

    In recent years, increasing amounts of genomic and clinical cancer data have become publicly available through large-scale collaborative projects such as The Cancer Genome Atlas (TCGA). However, as long as these datasets are difficult to access and interpret, they are essentially useless for a major part of the research community and their scientific potential will not be fully realized. To address these issues we developed MEXPRESS, a straightforward and easy-to-use web tool for the integration and visualization of the expression, DNA methylation and clinical TCGA data on a single-gene level (http://mexpress.be). In comparison to existing tools, MEXPRESS allows researchers to quickly visualize and interpret the different TCGA datasets and their relationships for a single gene, as demonstrated for GSTP1 in prostate adenocarcinoma. We also used MEXPRESS to reveal the differences in the DNA methylation status of the PAM50 marker gene MLPH between the breast cancer subtypes and how these differences were linked to the expression of MLPH. We have created a user-friendly tool for the visualization and interpretation of TCGA data, offering clinical researchers a simple way to evaluate the TCGA data for their genes or candidate biomarkers of interest.

  3. High density DNA microarrays: algorithms and biomedical applications.

    PubMed

    Liu, Wei-Min

    2004-08-01

    DNA microarrays are devices capable of detecting the identity and abundance of numerous DNA or RNA segments in samples. They are used for analyzing gene expression, identifying genetic markers and detecting mutations on a genomic scale. The fundamental chemical mechanism of DNA microarrays is the hybridization between probes and targets due to the hydrogen bonds of nucleotide base pairing. Since cross-hybridization is inevitable, and probes or targets may form undesirable secondary or tertiary structures, the microarray data contain noise and depend on experimental conditions. It is crucial to apply proper statistical algorithms to obtain useful signals from noisy data. After obtaining the signals of a large number of probes, we need to derive biomedical information such as the existence of a transcript in a cell, the difference in expression levels of a gene across multiple samples, and the type of a genetic marker. Furthermore, after the expression levels of thousands of genes or the genotypes of thousands of single nucleotide polymorphisms are determined, it is usually important to find a small number of genes or markers that are related to a disease, individual reactions to drugs, or other phenotypes. All these applications need careful data analyses and reliable algorithms.

  4. Gene expression-based dosimetry by dose and time in mice following acute radiation exposure.

    PubMed

    Tucker, James D; Divine, George W; Grever, William E; Thomas, Robert A; Joiner, Michael C; Smolinski, Joseph M; Auner, Gregory W

    2013-01-01

    Rapid and reliable methods for performing biological dosimetry are of paramount importance in the event of a large-scale nuclear event. Traditional dosimetry approaches lack the requisite rapid assessment capability, ease of use, portability and low cost, which are factors needed for triaging a large number of victims. Here we describe the results of experiments in which mice were acutely exposed to (60)Co gamma rays at doses of 0 (control) to 10 Gy. Blood was obtained from irradiated mice 0.5, 1, 2, 3, 5, and 7 days after exposure. mRNA expression levels of 106 selected genes were obtained by reverse-transcription real time PCR. Stepwise regression of dose received against individual gene transcript expression levels provided optimal dosimetry at each time point. The results indicate that only 4-7 different gene transcripts are needed to explain ≥ 0.69 of the variance (R(2)), and that receiver-operator characteristics, a measure of sensitivity and specificity, of ≥ 0.93 for these statistical models were achieved at each time point. These models provide an excellent description of the relationship between the actual and predicted doses up to 6 Gy. At doses of 8 and 10 Gy there appears to be saturation of the radiation-response signals with a corresponding diminution of accuracy. These results suggest that similar analyses in humans may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations.
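
    The variance-explained figure (R(2)) quoted in this abstract can be illustrated with a minimal sketch. The study used stepwise regression over several transcripts; the example below is deliberately simpler and fits dose against a single transcript by ordinary least squares, then reports R(2). All expression and dose values are hypothetical and are not taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(y, y_pred):
    """Fraction of the variance in y explained by the predictions."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical relative expression of one transcript and the dose received (Gy).
expression = [1.0, 1.8, 3.1, 4.2, 4.9]
dose_gy = [0.0, 2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_line(expression, dose_gy)
predicted = [slope * e + intercept for e in expression]
r2 = r_squared(dose_gy, predicted)
print(f"dose = {slope:.3f} * expression + {intercept:.3f}, R^2 = {r2:.4f}")
```

    A stepwise procedure would repeat this kind of fit with several predictors, adding or dropping transcripts according to how much each one improves the model.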

  5. Gene Expression: Sizing it all up

    USDA-ARS's Scientific Manuscript database

    Genomic architecture appears to be a largely unexplored component of gene expression. Although surely not the end of the story, we are learning that when it comes to gene expression, size is important. We have been surprised to find that certain patterns of expression, tissue-specific versus constit...

  6. An elm EST database for identifying leaf beetle egg-induced defense genes

    PubMed Central

    2012-01-01

    Background Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Results Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes.
Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism. Conclusion Here we present a dataset for a large-scale study of the mechanisms of plant defense against insect eggs in a co-evolved, natural ecological plant–insect system. The EST database analysis provided here is a first step in elucidating the transcriptional responses of elm to elm leaf beetle infestation, and adds further to our knowledge on insect egg-induced transcriptomic changes in plants. The sequences identified in our comparative analysis give many hints about novel defense mechanisms directed towards eggs. PMID:22702658

  7. An elm EST database for identifying leaf beetle egg-induced defense genes.

    PubMed

    Büchel, Kerstin; McDowell, Eric; Nelson, Will; Descour, Anne; Gershenzon, Jonathan; Hilker, Monika; Soderlund, Carol; Gang, David R; Fenning, Trevor; Meiners, Torsten

    2012-06-15

    Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes.
Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism. Here we present a dataset for a large-scale study of the mechanisms of plant defense against insect eggs in a co-evolved, natural ecological plant-insect system. The EST database analysis provided here is a first step in elucidating the transcriptional responses of elm to elm leaf beetle infestation, and adds further to our knowledge on insect egg-induced transcriptomic changes in plants. The sequences identified in our comparative analysis give many hints about novel defense mechanisms directed towards eggs.

  8. Transcriptome Analysis of Three Sheep Intestinal Regions reveals Key Pathways and Hub Regulatory Genes of Large Intestinal Lipid Metabolism.

    PubMed

    Chao, Tianle; Wang, Guizhi; Ji, Zhibin; Liu, Zhaohua; Hou, Lei; Wang, Jin; Wang, Jianmin

    2017-07-13

    The large intestine, also known as the hindgut, is an important part of the animal digestive system. Recent studies on digestive system development in ruminants have focused on the rumen and the small intestine, but the molecular mechanisms underlying sheep large intestine metabolism remain poorly understood. To identify genes related to intestinal metabolism and to reveal molecular regulation mechanisms, we sequenced and compared the transcriptomes of mucosal epithelial tissues among the cecum, proximal colon and duodenum. A total of 4,221 transcripts from 3,254 genes were identified as differentially expressed transcripts. Between the large intestine and duodenum, differentially expressed transcripts were found to be significantly enriched in 6 metabolism-related pathways, among which PPAR signaling was identified as a key pathway. Three genes, CPT1A, LPL and PCK1, were identified as higher expression hub genes in the large intestine. Between the cecum and colon, differentially expressed transcripts were significantly enriched in 5 lipid metabolism related pathways, and CEPT1 and MBOAT1 were identified as hub genes. This study provides important information regarding the molecular mechanisms of intestinal metabolism in sheep and may provide a basis for further study.

  9. Prader-Willi syndrome: intellectual abilities and behavioural features by genetic subtype.

    PubMed

    Milner, Katja M; Craig, Ellen E; Thompson, Russell J; Veltman, Marijcke W M; Thomas, N Simon; Roberts, Sian; Bellamy, Margaret; Curran, Sarah R; Sporikou, Caroline M J; Bolton, Patrick F

    2005-10-01

    Studies of chromosome 15 abnormality have implicated over-expression of paternally imprinted genes in the 15q11-13 region in the aetiology of autism. To test this hypothesis we compared individuals with Prader-Willi syndrome (PWS) due to uniparental disomy (UPD--where paternally imprinted genes are over-expressed) to individuals with the 15q11-13 deletion form of the syndrome (where paternally imprinted genes are not over-expressed). We also tested reports that PWS cases due to the larger type I (TI) form of deletion differ from cases with the smaller type II (TII) deletion. Ninety-six individuals with PWS were recruited from genetic centres and the PWS association. Forty-nine individuals were confirmed as having maternal UPD of chromosome 15 and were age and sex matched to 47 individuals with a deletion involving 15q11-13 (32 had the shorter (TII) deletion, and 14 had the longer (TI) deletion). Behavioural assessments were carried out blind to genetic status, using the Autism Diagnostic Observation Schedule (ADOS), the Autism Diagnostic Interview (ADI), the Autism Screening Questionnaire (ASQ), the Children's Yale-Brown Obsessive-Compulsive Scale (CY-BOCS), the Vineland Adaptive Behaviour Scales (VABS), and measurements of intellectual ability, including the Wechsler and Mullen Scales and Raven's Matrices. UPD cases exhibited significantly more autistic-like impairments in reciprocal social interaction on questionnaire, interview and standardised observational measures. Comparison of TI and TII deletion cases revealed few differences, but ability levels tended to be lower in the TI deletion cases. Findings from a large study comparing deletion and UPD forms of Prader-Willi syndrome were consistent with other evidence in indicating that paternally imprinted genes in the 15q11-13 region constitute a genetic risk factor for aspects of autistic symptomatology. These genes may therefore play a role in the aetiology of autism.
By contrast with another report, there was no clear-cut relationship between the size of the deletion and the form of cognitive and behavioural phenotype.

  10. Identification of novel diagnostic biomarkers for thyroid carcinoma

    PubMed Central

    Wang, Xiliang; Zhang, Qing; Cai, Zhiming; Dai, Yifan; Mou, Lisha

    2017-01-01

    Thyroid carcinoma (THCA) is the most common endocrine malignancy worldwide. Unfortunately, only a limited number of large-scale analyses have been performed to identify biomarkers for THCA. Here, we conducted a meta-analysis using 505 THCA patients and 59 normal controls from The Cancer Genome Atlas. After identifying differentially expressed long non-coding RNAs (lncRNA) and protein coding genes (PCG), we found vast differences in various lncRNA-PCG co-expressed pairs in THCA. A dysregulation network with scale-free topology was constructed. Four molecules (LA16c-380H5.2, RP11-203J24.8, MLF1 and SDC4) could potentially serve as diagnostic biomarkers of THCA with high sensitivity and specificity. We further present a diagnostic panel with expression cutoff values. Our results demonstrate the potential application of those four molecules as novel independent biomarkers for THCA diagnosis. PMID:29340074

  11. Finding genes discriminating smokers from non-smokers by applying a growing self-organizing clustering method to large airway epithelium cell microarray data.

    PubMed

    Shahdoust, Maryam; Hajizadeh, Ebrahim; Mozdarani, Hossein; Chehrei, Ali

    2013-01-01

    Cigarette smoking is the major risk factor for development of lung cancer. Identification of the effects of tobacco on airway gene expression may provide insight into the causes. This research aimed to compare gene expression of large airway epithelium cells in normal smokers (n=13) and non-smokers (n=9) in order to find genes which discriminate the two groups and to assess the effects of cigarette smoking on large airway epithelium cells. Genes discriminating smokers from non-smokers were identified by applying a neural network clustering method, growing self-organizing maps (GSOM), to microarray data according to class discrimination scores. An index was computed based on the difference between the mean expression of each gene in the two groups. This clustering approach made it possible to compare thousands of genes simultaneously. The applied approach compared the mean expression of 7,129 genes in smokers and non-smokers simultaneously and classified the genes of large airway epithelium cells that were differentially expressed in smokers compared with non-smokers. Seven genes were identified with the largest expression differences between smokers and non-smokers: NQO1, H19, ALDH3A1, AKR1C1, ABHD2, GPX2 and ADH7. Most (NQO1, ALDH3A1, AKR1C1, H19 and GPX2) are known to be clinically notable in lung cancer studies. Furthermore, statistical discriminant analysis showed that these genes could classify samples as smokers or non-smokers with 100% accuracy. In the resulting GSOM map, other nodes with high average discrimination scores included genes with alterations strongly related to lung cancer, such as AKR1C3, CYP1B1, UCHL1 and AKR1B10. This clustering, by comparing the expression of thousands of genes at the same time, revealed alterations in normal smokers. Most of the identified genes were strongly relevant to lung cancer in the existing literature. The genes may be utilized to identify smokers with increased risk for lung cancer.
A large sample study is now recommended to determine relations between the genes ABHD2 and ADH7 and smoking.
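
    The idea of ranking genes by a class discrimination score can be illustrated with a minimal sketch. The abstract does not give the formula for its index, so the example assumes a signal-to-noise style score (absolute difference of group means divided by the sum of group standard deviations) as a stand-in; it does not implement the GSOM clustering itself. The gene names and values are hypothetical.

```python
from statistics import mean, stdev


def discrimination_score(group_a, group_b):
    """Assumed score: |mean difference| / (sd_a + sd_b).

    Note: this raises ZeroDivisionError if both groups are constant.
    """
    return abs(mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


def rank_genes(expr_smokers, expr_nonsmokers):
    """Return gene names sorted from most to least discriminating."""
    scores = {
        gene: discrimination_score(expr_smokers[gene], expr_nonsmokers[gene])
        for gene in expr_smokers
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values for three made-up genes.
smokers = {
    "gene_x": [8.0, 9.0, 10.0],
    "gene_y": [5.0, 6.0, 7.0],
    "gene_z": [4.0, 6.0, 8.0],
}
nonsmokers = {
    "gene_x": [2.0, 3.0, 4.0],
    "gene_y": [5.0, 6.0, 7.0],
    "gene_z": [3.0, 5.0, 7.0],
}

ranked = rank_genes(smokers, nonsmokers)
print(ranked)
```

    With thousands of genes, a score of this kind is typically used to shortlist candidates before a clustering or classification step.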

  12. Course 10: Three Lectures on Biological Networks

    NASA Astrophysics Data System (ADS)

    Magnasco, M. O.

    1 Enzymatic networks. Proofreading knots: How DNA topoisomerases disentangle DNA 1.1 Length scales and energy scales 1.2 DNA topology 1.3 Topoisomerases 1.4 Knots and supercoils 1.5 Topological equilibrium 1.6 Can topoisomerases recognize topology? 1.7 Proposal: Kinetic proofreading 1.8 How to do it twice 1.9 The care and proofreading of knots 1.10 Suppression of supercoils 1.11 Problems and outlook 1.12 Disquisition 2 Gene expression networks. Methods for analysis of DNA chip experiments 2.1 The regulation of gene expression 2.2 Gene expression arrays 2.3 Analysis of array data 2.4 Some simplifying assumptions 2.5 Probeset analysis 2.6 Discussion 3 Neural and gene expression networks: Song-induced gene expression in the canary brain 3.1 The study of songbirds 3.2 Canary song 3.3 ZENK 3.4 The blush 3.5 Histological analysis 3.6 Natural vs. artificial 3.7 The Blush II: gAP 3.8 Meditation

  13. Transcriptomics of environmental acclimatization and survival in wild adult Pacific sockeye salmon (Oncorhynchus nerka) during spawning migration.

    PubMed

    Evans, Tyler G; Hammill, Edd; Kaukinen, Karia; Schulze, Angela D; Patterson, David A; English, Karl K; Curtis, Janelle M R; Miller, Kristina M

    2011-11-01

    Environmental shifts accompanying salmon spawning migrations from ocean feeding grounds to natal freshwater streams can be severe, with the underlying stress often cited as a cause of increased mortality. Here, a salmonid microarray was used to characterize changes in gene expression occurring between ocean and river habitats in gill and liver tissues of wild migrating sockeye salmon (Oncorhynchus nerka Walbaum) returning to spawn in the Fraser River, British Columbia, Canada. Expression profiles indicate that the transcriptome of migrating salmon is strongly affected by shifting abiotic and biotic conditions encountered along migration routes. Conspicuous shifts in gene expression associated with changing salinity, temperature, pathogen exposure and dissolved oxygen indicate that these environmental variables most strongly impact physiology during spawning migrations. Notably, transcriptional changes related to osmoregulation were largely preparatory and occurred well before salmon encountered freshwater. In the river environment, differential expression of genes linked with elevated temperatures indicated that thermal regimes within the Fraser River are approaching tolerance limits for adult salmon. To empirically correlate gene expression with survival, biopsy sampling of gill tissue and transcriptomic profiling were combined with telemetry. Many genes correlated with environmental variables were differentially expressed between premature mortalities and successful migrants. Parametric survival analyses demonstrated a broad-scale transcriptional regulator, cofactor required for Sp1 transcriptional activation (CRSP), to be significantly predictive of survival. As the environmental characteristics of salmon habitats continue to change, establishing how current environmental conditions influence salmon physiology under natural conditions is critical to conserving this ecologically and economically important fish species. © 2011 Blackwell Publishing Ltd.

  14. Particle Radiation signals the Expression of Genes in stress-associated Pathways

    NASA Astrophysics Data System (ADS)

    Blakely, E.; Chang, P.; Bjornstad, K.; Dosanjh, M.; Cherbonnel, C.; Rosen, C.

    The explosive development of microarray screening methods has propelled genome research in a variety of biological systems, allowing investigators to examine large-scale alterations in gene expression for research in toxicology, pathology and therapy. The radiation environment in space is complex and encompasses a variety of highly energetic and charged particles. Estimation of biological responses after exposure to these types of radiation is important for NASA in their plans for long-term manned space missions. Instead of using the 10,000-gene arrays that are in the marketplace, we have chosen to examine particle radiation-induced changes in gene expression using a focused DNA microarray system to study the expression of about 100 genes specifically associated with both the upstream and downstream aspects of the TP53 stress-responsive pathway. Genes that are regulated by TP53 include functional clusters that are implicated in cell cycle arrest, apoptosis and DNA repair. A cultured human lens epithelial cell model (Blakely et al., IOVS 41, 3808, 2000) was used for these studies. Additional human normal and radiosensitive fibroblast cell lines have also been examined. Lens cells were grown on matrix-coated substrate and exposed to 55 MeV/u protons at the 88-inch cyclotron at LBNL or 1 GeV/u iron ions at the NASA Space Radiation Laboratory. The other cell lines were grown on conventional tissue culture plasticware. RNA and proteins were harvested at different times after irradiation. RNA was isolated from sham-treated or select irradiated populations.

  15. Heterologous expression of the immunomodulatory protein gene from Ganoderma sinense in the basidiomycete Coprinopsis cinerea.

    PubMed

    Han, F; Liu, Y; Guo, L Q; Zeng, X L; Liu, Z M; Lin, J F

    2010-11-01

    FIP-gsi, a fungal immunomodulatory protein found in Ganoderma sinense, has antitumour, anti-allergy and immunomodulatory activities and is regulated by the fip-gsi gene. In this study, we aimed to express the fip-gsi gene from G. sinense in Coprinopsis cinerea to increase the yield of FIP-gsi. A fungal expression vector, pBfip-gsi, containing the gpd promoter from Agaricus bisporus and the fip-gsi gene from G. sinense was constructed and transformed into C. cinerea. PCR and Southern blotting analysis verified the successful integration of the exogenous gene fip-gsi into the genome of C. cinerea. RT-PCR and Northern blotting analysis confirmed that the fip-gsi gene was transcribed in C. cinerea. The yield of the FIP-gsi protein reached 314 mg kg(-1) fresh mycelia. The molecular weight of FIP-gsi was 13 kDa, and FIP-gsi was capable of hemagglutinating mouse red blood cells, but no such activity was observed towards human red blood cells in vitro. The fip-gsi gene from G. sinense has been successfully translated in C. cinerea, and the yield of bioactive FIP-gsi protein was high. This is the first report using C. cinerea for the heterologous expression of FIP-gsi protein and it might supply a basis for large-scale production of the protein. © 2010 The Authors. Journal of Applied Microbiology © 2010 The Society for Applied Microbiology.

  16. Prokaryotic expression of CP gene of Fritillary virus Y infecting Thunberg fritillary and antiserum preparation.

    PubMed

    Wei, Chuan-Bao; Wei, Yang-Yang; Yang, Yu; Liu, Shi-Liang; Hu, Hao-Yu; He, Yue

    2011-10-01

    To prepare antiserum against the Fritillary virus Y (FVY) coat protein (CP) for detecting FVY and to study its serological relationships with other viruses. Specific primers were designed according to the GenBank sequence (accession: AM039800) to amplify the CP gene of FVY infecting Thunberg fritillary. Sequence relationships with other potyviruses were assessed by BLAST. The CP gene was inserted into pSBET and expressed in Escherichia coli BL21 (DE3) pLysE strain. The target protein was purified first by 12% SDS-PAGE and subsequently by 5%-20% gradient SDS-PAGE. The antiserum against the CP was raised in mouse and its specificity was confirmed by Western blot analysis. The reactivity of the antiserum to FVY CP was tested by Western blot against the over-expressed coat proteins of 17 potyviruses. Its ability to bind native FVY particles was confirmed by ELISA. The FVY CP gene shared 81.2% nucleotide identity with the TrVY (Tricyrtis virus Y, AY864850) CP gene, 68.1% with the SMV-P (Soybean mosaic virus Pinellia strain, AJ507388.2) CP gene and 67.2% with the ZYMV (Zucchini yellow mosaic virus Luan isolate) CP gene. The prepared antiserum was specific to FVY CP, but also reacted moderately with the expressed CP of SMV-P and weakly with that of ZYMV. The antibody could bind native FVY particles, and the antiserum is suitable for large-scale FVY detection by ELISA.

  17. Genome multiplication as adaptation to tissue survival: evidence from gene expression in mammalian heart and liver.

    PubMed

    Anatskaya, Olga V; Vinogradov, Alexander E

    2007-01-01

    To elucidate the functional significance of genome multiplication in somatic tissues, we performed a large-scale analysis of ploidy-associated changes in expression of non-tissue-specific (i.e., broadly expressed) genes in the heart and liver of human and mouse (6585 homologous genes were analyzed). These species have inverse patterns of polyploidization in cardiomyocytes and hepatocytes. The between-species comparison of two pairs of homologous tissues with crisscross contrast in ploidy levels allows the removal of the effects of species and tissue specificity on the profile of gene activity. The different tests performed from the standpoint of modular biology revealed a consistent picture of ploidy-associated alteration in a wide range of functional gene groups. The major effects consisted of hypoxia-inducible factor-triggered changes in main cellular processes and signaling pathways, activation of defense against DNA lesions, acceleration of protein turnover and transcription, and the impairment of apoptosis, the immune response, and cytoskeleton maintenance. We also found a severe decline in aerobic respiration and stimulation of sugar and fatty acid metabolism. These metabolic rearrangements create a special type of metabolism that can be considered intermediate between aerobic and anaerobic. The metabolic and physiological changes revealed (reflected in the alteration of gene expression) help explain the unique ability of polyploid tissues to combine proliferation and differentiation, which are separated in diploid tissues. We argue that genome multiplication promotes cell survival and tissue regeneration under stressful conditions.

  18. Mammalian Synthetic Biology: Time for Big MACs.

    PubMed

    Martella, Andrea; Pollard, Steven M; Dai, Junbiao; Cai, Yizhi

    2016-10-21

    The enabling technologies of synthetic biology are opening up new opportunities for engineering and enhancement of mammalian cells. This will stimulate diverse applications in many life science sectors such as regenerative medicine, development of biosensing cell lines, therapeutic protein production, and generation of new synthetic genetic regulatory circuits. Harnessing the full potential of these new engineering-based approaches requires the design and assembly of large DNA constructs (potentially up to chromosome scale) and the effective delivery of these large DNA payloads to the host cell. Random integration of large transgenes, encoding therapeutic proteins or genetic circuits, into host chromosomes has several drawbacks such as risks of insertional mutagenesis, lack of control over transgene copy-number and position-specific effects; these can compromise the intended functioning of genetic circuits. The development of a system orthogonal to the endogenous genome is therefore beneficial. Mammalian artificial chromosomes (MACs) are functional, add-on chromosomal elements, which behave as normal chromosomes, being replicated and partitioned to daughter cells at each cell division. They are deployed as useful gene expression vectors as they remain independent from the host genome. MACs are maintained as a single copy and can accommodate multiple gene expression cassettes of, in theory, unlimited DNA size (MACs up to 10 megabases have been constructed). MACs therefore enable control over ectopic gene expression and represent an excellent platform to rapidly prototype and characterize novel synthetic gene circuits without recourse to engineering the host genome. This review describes the obstacles synthetic biologists face when working with mammalian systems and how the development of improved MACs can overcome these, particularly given the spectacular advances in DNA synthesis and assembly that are fuelling this research area.

  19. Identification of transcriptome involved in atrazine detoxification and degradation in alfalfa (Medicago sativa) exposed to realistic environmental contamination.

    PubMed

    Zhang, Jing Jing; Lu, Yi Chen; Zhang, Shu Hao; Lu, Feng Fan; Yang, Hong

    2016-08-01

    Plants are constantly exposed to a variety of toxic compounds (or xenobiotics) such as pesticides (or herbicides). The herbicide atrazine (ATZ) has become an environmental contaminant due to its intensive use during crop production. Plants have evolved strategies to cope with the adverse impact of ATZ. However, the mechanism for ATZ degradation and detoxification in plants is largely unknown. Here we employed a global RNA-sequencing (RNA-Seq) strategy to dissect transcriptome variation in alfalfa (Medicago sativa) exposed to ATZ. Four libraries were constructed including Root-ATZ (root control, ATZ-free), Shoot-ATZ, Root+ATZ (root treated with ATZ) and Shoot+ATZ. Hierarchical clustering was performed to display the expression patterns for all differentially expressed genes (DEGs) under ATZ exposure. Transcripts involved in ATZ detoxification, stress responses (e.g. oxidation and reduction, conjugation and hydrolytic reactions), and regulation of cysteine biosynthesis were identified. Several genes encoding glycosyltransferases, glutathione S-transferases or ABC transporters were notably up-regulated. Also, many other genes involved in oxidation-reduction, conjugation, and hydrolysis for herbicide degradation were differentially expressed. These results suggest that ATZ in alfalfa can be detoxified or degraded through different pathways. The expression patterns of some DEGs identified by high-throughput sequencing were confirmed by qRT-PCR. Our results not only highlight the transcriptional complexity in alfalfa exposed to ATZ but also represent a major improvement for analyzing transcriptional changes on a large scale. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Merkel cell polyomavirus recruits MYCL to the EP400 complex to promote oncogenesis.

    PubMed

    Cheng, Jingwei; Park, Donglim Esther; Berrios, Christian; White, Elizabeth A; Arora, Reety; Yoon, Rosa; Branigan, Timothy; Xiao, Tengfei; Westerling, Thomas; Federation, Alexander; Zeid, Rhamy; Strober, Benjamin; Swanson, Selene K; Florens, Laurence; Bradner, James E; Brown, Myles; Howley, Peter M; Padi, Megha; Washburn, Michael P; DeCaprio, James A

    2017-10-01

    Merkel cell carcinoma (MCC) frequently contains integrated copies of Merkel cell polyomavirus DNA that express a truncated form of Large T antigen (LT) and an intact Small T antigen (ST). While LT binds RB and inactivates its tumor suppressor function, it is less clear how ST contributes to MCC tumorigenesis. Here we show that ST binds specifically to the MYC homolog MYCL (L-MYC) and recruits it to the 15-component EP400 histone acetyltransferase and chromatin remodeling complex. We performed a large-scale immunoprecipitation for ST and identified co-precipitating proteins by mass spectrometry. In addition to protein phosphatase 2A (PP2A) subunits, we identified MYCL and its heterodimeric partner MAX plus the EP400 complex. Immunoprecipitation for MAX and EP400 complex components confirmed their association with ST. By integrating chromatin immunoprecipitation with sequencing (ChIP-seq) and RNA-seq, we determined that the ST-MYCL-EP400 complex binds together to specific gene promoters and activates their expression. MYCL and EP400 were required for maintenance of cell viability and cooperated with ST to promote gene expression in MCC cell lines. A genome-wide CRISPR-Cas9 screen confirmed the requirement for MYCL and EP400 in MCPyV-positive MCC cell lines. We demonstrate that ST can activate gene expression in an EP400- and MYCL-dependent manner and that this activity contributes to cellular transformation and generation of induced pluripotent stem cells.

  1. Large-scale purification and characterization of recombinant human stem cell factor in Escherichia coli.

    PubMed

    Chen, Liang-Hua; Cai, Feng; Zhang, Dan-Ju; Zhang, Li; Zhu, Peng; Gao, Shun

    2017-07-01

    The pharmacological importance of recombinant human stem cell factor (rhSCF) has increased the demand to establish effective and large-scale production and purification processes. A good source of bioactive recombinant protein with capability of being scaled-up without losing activity has always been a challenge. The objectives of the study were the rapid and efficient pilot-scale expression and purification of rhSCF. The gene encoding stem cell factor (SCF) was cloned into pBV220 and transformed into Escherichia coli. The recombinant SCF was expressed and isolated using a procedure consisting of isolation of inclusion bodies (IBs), denaturation, and refolding followed by chromatographic steps toward purification. The yield of rhSCF reached 835.6 g/20 L, and the expression levels of rhSCF were about 33.9% of the total E. coli protein content. rhSCF was purified by isolation of IBs, denaturation, and refolding, followed by SP-Sepharose chromatography, Source 30 reversed-phase chromatography, and Q-Sepharose chromatography. This procedure was developed to isolate 5.5 g of rhSCF (99.5% purity) with specific activity at 0.96 × 10^6 IU/mg, endotoxin levels of pyrogen at 1.0 EU/mg, and bacterial DNA at 10 ng/mg. Pilot-scale fermentations and purifications were set up for the production of rhSCF that can be upscaled for industry. © 2016 International Union of Biochemistry and Molecular Biology, Inc.

  2. Global Analysis of Transcriptome Responses and Gene Expression Profiles to Cold Stress of Jatropha curcas L.

    PubMed Central

    Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming

    2013-01-01

    Background: Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. Results: In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. Conclusions: This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress.
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas. PMID:24349370

  3. Global analysis of transcriptome responses and gene expression profiles to cold stress of Jatropha curcas L.

    PubMed

    Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming

    2013-01-01

    Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas.

  4. PLEXdb: Gene expression resources for plants and plant pathogens

    USDA-ARS's Scientific Manuscript database

    PLEXdb (Plant Expression Database), in partnership with community databases, supports comparisons of gene expression across multiple plant and pathogen species, promoting individuals and/or consortia to upload genome-scale data sets to contrast them to previously archived data. These analyses facili...

  5. Spermatogenesis in mammals: proteomic insights.

    PubMed

    Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles

    2012-08-01

    Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheath in the mid piece, elongating of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology has been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state-of-the-art of large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult.
Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.

  6. Developmental Transcriptome of Aplysia californica

    PubMed Central

    Heyland, Andreas; Vue, Zer; Voolstra, Christian R.; Medina, Mónica; Moroz, Leonid L.

    2014-01-01

    Genome-wide transcriptional changes in development provide important insight into mechanisms underlying growth, differentiation, and patterning. However, such large-scale developmental studies have been limited to a few representatives of Ecdysozoans and Chordates. Here, we characterize transcriptomes of embryonic, larval, and metamorphic development in the marine mollusc Aplysia californica and reveal novel molecular components associated with life history transitions. Specifically, we identify more than 20 signal peptides, putative hormones, and transcription factors in association with early development and metamorphic stages, many of which seem to be evolutionarily conserved elements of signal transduction pathways. We also characterize genes related to biomineralization, a critical process of molluscan development. In summary, our experiment provides the first large-scale survey of gene expression in mollusc development, and complements previous studies on the regulatory mechanisms underlying body plan patterning and the formation of larval and juvenile structures. This study serves as a resource for further functional annotation of transcripts and genes in Aplysia specifically and molluscs in general. A comparison of the Aplysia developmental transcriptome with similar studies in the zebrafish Danio rerio, the fruit fly Drosophila melanogaster, the nematode Caenorhabditis elegans, and other studies on molluscs suggests an overall highly divergent pattern of gene regulatory mechanisms that are likely a consequence of the different developmental modes of these organisms. PMID:21328528

  7. Soybean kinome: functional classification and gene expression patterns

    PubMed Central

    Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek

    2015-01-01

    The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662
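    The proportions quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the figures given there (2166 PK genes, 4.67%, 1418 RLK genes); the variable names are illustrative, and the implied genome-wide total is approximate because 4.67% is itself a rounded value.

```python
# Figures quoted in the abstract above.
pk_genes = 2166        # putative protein kinase genes (the kinome)
pk_fraction = 0.0467   # kinome as a fraction of all protein-coding genes
rlk_genes = 1418       # genes in the receptor-like kinase (RLK) group

# Implied number of protein-coding genes (approximate, since the
# percentage in the abstract is rounded to two decimals).
implied_total_genes = pk_genes / pk_fraction

# Share of the kinome made up by the RLK group.
rlk_share = rlk_genes / pk_genes

print(f"implied protein-coding genes: about {implied_total_genes:,.0f}")
print(f"RLK share of the kinome: {rlk_share:.1%}")
```

    Under these figures the RLK group accounts for roughly two thirds of the kinome, which is what the abstract means by "remarkably large".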

  8. Comparative Study of Regulatory Circuits in Two Sea Urchin Species Reveals Tight Control of Timing and High Conservation of Expression Dynamics

    PubMed Central

    Gildor, Tsvia; Ben-Tabou de-Leon, Smadar

    2015-01-01

    Accurate temporal control of gene expression is essential for normal development and must be robust to natural genetic and environmental variation. Studying gene expression variation within and between related species can delineate the level of expression variability that development can tolerate. Here we exploit the comprehensive model of sea urchin gene regulatory networks and generate high-density expression profiles of key regulatory genes of the Mediterranean sea urchin, Paracentrotus lividus (Pl). The high resolution of our studies reveals highly reproducible gene initiation times that have lower variation than those of maximal mRNA levels between different individuals of the same species. This observation supports a threshold behavior of gene activation that is less sensitive to input concentrations. We then compare Mediterranean sea urchin gene expression profiles to those of its Pacific Ocean relative, Strongylocentrotus purpuratus (Sp). These species shared a common ancestor about 40 million years ago and show highly similar embryonic morphologies. Our comparative analyses of five regulatory circuits operating in different embryonic territories reveal a high conservation of the temporal order of gene activation but also some cases of divergence. A linear ratio of 1.3-fold between gene initiation times in Pl and Sp is partially explained by scaling of the developmental rates with temperature. Scaling the developmental rates according to the estimated Sp-Pl ratio and normalizing the expression levels reveals a striking conservation of relative dynamics of gene expression between the species. Overall, our findings demonstrate the ability of biological developmental systems to tightly control the timing of gene activation and relative dynamics and overcome expression noise induced by genetic variation and growth conditions. PMID:26230518
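    The two normalization steps mentioned in the abstract above (scaling developmental time by a constant ratio and normalizing expression levels) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the time course is invented, the function names are hypothetical, normalizing to the profile maximum is one possible choice, and the assumption that the time axis of the faster-developing species is stretched by 1.3 is made only for the example, since the abstract does not state the direction of the ratio.

```python
def rescale_times(times_h, ratio):
    """Multiply each time point by a constant developmental-rate ratio."""
    return [t * ratio for t in times_h]


def normalize_levels(levels):
    """Scale expression levels so that the maximum of the profile equals 1."""
    peak = max(levels)
    if peak <= 0:
        raise ValueError("profile needs at least one positive value")
    return [v / peak for v in levels]


# Hypothetical time course for one gene (hours, arbitrary mRNA units).
fast_times_h = [0, 4, 8, 12, 16]
fast_levels = [0, 50, 400, 800, 600]

SPECIES_RATIO = 1.3  # linear ratio of initiation times quoted in the abstract

scaled_times_h = rescale_times(fast_times_h, SPECIES_RATIO)
relative_levels = normalize_levels(fast_levels)

for t, v in zip(scaled_times_h, relative_levels):
    print(f"{t:5.1f} h  {v:.3f}")
```

    After both steps, profiles from the two species share a common time axis and a common 0 to 1 scale, so their relative dynamics can be compared directly.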

  9. A RNA-Seq Analysis of the Rat Supraoptic Nucleus Transcriptome: Effects of Salt Loading on Gene Expression

    PubMed Central

    Salinas, Yasmmyn D.; Shi, YiJun; Greenwood, Michael; Hoe, See Ziau; Murphy, David; Gainer, Harold

    2015-01-01

    Magnocellular neurons (MCNs) in the hypothalamo-neurohypophysial system (HNS) are highly specialized to release large amounts of arginine vasopressin (Avp) or oxytocin (Oxt) into the blood stream and play critical roles in the regulation of body fluid homeostasis. The MCNs are osmosensory neurons and are excited by exposure to hypertonic solutions and inhibited by hypotonic solutions. The MCNs respond to systemic hypertonic and hypotonic stimulation with large changes in the expression of their Avp and Oxt genes, and microarray studies have shown that these osmotic perturbations also cause large changes in global gene expression in the HNS. In this paper, we examine gene expression in the rat supraoptic nucleus (SON) under normosmotic and chronic salt-loading (SL) conditions for the first time using “new-generation” RNA sequencing (RNA-Seq) methods. We reliably detect 9,709 genes as present in the SON by RNA-Seq, and 552 of these genes were changed in expression as a result of chronic SL. These genes reflect diverse functions, and 42 of these are involved in either transcriptional or translational processes. In addition, we compare the SON transcriptomes resolved by RNA-Seq methods with the SON transcriptomes determined by Affymetrix microarray methods in rats under the same osmotic conditions, and find that there are 6,466 genes present in the SON that are represented in both data sets, although 1,040 of the expressed genes were found only in the microarray data, and 2,762 of the expressed genes are selectively found in the RNA-Seq data and not the microarray data. These data provide the research community with a comprehensive view of the transcriptome in the SON under normosmotic conditions and the changes in specific gene expression evoked by salt loading. PMID:25897513
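    A platform comparison of the kind described above reduces to set operations on the two lists of detected genes. The sketch below is an illustration only: the function name and the toy gene identifiers are hypothetical, and the final calculation uses just the three overlap counts quoted in the abstract (6,466 shared, 1,040 microarray-only, 2,762 RNA-Seq-only). The abstract does not state how those counts relate to the 9,709 genes reported as detected by RNA-Seq, so no attempt is made to reconcile them here.

```python
def overlap_summary(rnaseq_genes, microarray_genes):
    """Count shared and platform-specific genes and compute the Jaccard index."""
    rnaseq_genes = set(rnaseq_genes)
    microarray_genes = set(microarray_genes)
    shared = rnaseq_genes & microarray_genes
    union = rnaseq_genes | microarray_genes
    return {
        "shared": len(shared),
        "rnaseq_only": len(rnaseq_genes - microarray_genes),
        "microarray_only": len(microarray_genes - rnaseq_genes),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy gene lists with made-up identifiers, for illustration only.
toy = overlap_summary({"g1", "g2", "g3"}, {"g1", "g2", "g4", "g5"})
print(toy)

# The same index computed directly from the counts quoted in the abstract.
shared, microarray_only, rnaseq_only = 6466, 1040, 2762
jaccard_reported = shared / (shared + microarray_only + rnaseq_only)
print(f"Jaccard index from reported counts: {jaccard_reported:.3f}")
```

    The Jaccard index (shared genes divided by the union) is one convenient single-number summary of agreement between platforms; other summaries, such as the fraction of each platform's genes confirmed by the other, follow from the same counts.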

  10. SZGR 2.0: a one-stop shop of schizophrenia candidate genes

    PubMed Central

    Jia, Peilin; Han, Guangchun; Zhao, Junfei; Lu, Pinyi; Zhao, Zhongming

    2017-01-01

    SZGR 2.0 is a comprehensive resource of candidate variants and genes for schizophrenia, covering genetic, epigenetic, transcriptomic, translational and many other types of evidence. By systematic review and curation of multiple lines of evidence, we included almost all variants and genes that have ever been reported to be associated with schizophrenia. In particular, we collected ∼4200 common variants reported in genome-wide association studies, ∼1000 de novo mutations discovered by large-scale sequencing of family samples, 215 genes spanning rare and replication copy number variations, 99 genes overlapping with linkage regions, 240 differentially expressed genes, 4651 differentially methylated genes and 49 genes as antipsychotic drug targets. To facilitate interpretation, we included various functional annotation data, especially brain eQTL, methylation QTL, brain expression featured in deep categorization of brain areas and developmental stages and brain-specific promoter and enhancer annotations. Furthermore, we conducted cross-study, cross-data type and integrative analyses of the multidimensional data deposited in SZGR 2.0, and made the data and results available through a user-friendly interface. In summary, SZGR 2.0 provides a one-stop shop of schizophrenia variants and genes and their function and regulation, providing an important resource in the schizophrenia and other mental disease community. SZGR 2.0 is available at https://bioinfo.uth.edu/SZGR/. PMID:27733502

  11. Directed random walks and constraint programming reveal active pathways in hepatocyte growth factor signaling.

    PubMed

    Kittas, Aristotelis; Delobelle, Aurélien; Schmitt, Sabrina; Breuhahn, Kai; Guziolowski, Carito; Grabe, Niels

    2016-01-01

    An effective means to analyze mRNA expression data is to take advantage of established knowledge from pathway databases, using methods such as pathway-enrichment analyses. However, pathway databases are not case-specific and expression data could be used to infer gene-regulation patterns in the context of specific pathways. In addition, canonical pathways may not always describe the signaling mechanisms properly, because interactions can frequently occur between genes in different pathways. Relatively few methods have been proposed to date for generating and analyzing such networks, preserving the causality between gene interactions and reasoning over the qualitative logic of regulatory effects. We present an algorithm (MCWalk) integrated with a logic programming approach, to discover subgraphs in large-scale signaling networks by random walks in a fully automated pipeline. As an exemplary application, we uncover the signal transduction mechanisms in a gene interaction network describing hepatocyte growth factor-stimulated cell migration and proliferation from gene-expression measured with microarray and RT-qPCR using in-house perturbation experiments in a keratinocyte-fibroblast co-culture. The resulting subgraphs illustrate possible associations of hepatocyte growth factor receptor c-Met nodes, differentially expressed genes and cellular states. Using perturbation experiments and Answer Set programming, we are able to select those which are more consistent with the experimental data. We discover key regulator nodes by measuring the frequency with which they are traversed when connecting signaling between receptors and significantly regulated genes and predict their expression-shift consistently with the measured data. The Java implementation of MCWalk is publicly available under the MIT license at: https://bitbucket.org/akittas/biosubg. © 2015 FEBS.
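    The idea of ranking regulator nodes by how often random walks traverse them can be sketched in a few lines. This is not the MCWalk implementation (which is written in Java and combined with Answer Set programming); it is a generic random-walk sketch under simple assumptions: an unweighted directed graph, uniform choice among successors, and a toy network whose node names are invented for the example.

```python
import random
from collections import Counter


def traversal_frequencies(graph, source, targets, n_walks=1000, max_steps=20, seed=0):
    """Fraction of successful walks (source -> any target) that pass through each node."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = Counter()
    successful = 0
    for _walk in range(n_walks):
        node = source
        path = [node]
        for _step in range(max_steps):
            if node in targets:
                break
            successors = graph.get(node, [])
            if not successors:  # dead end: abandon this walk
                break
            node = rng.choice(successors)
            path.append(node)
        if node in targets:
            successful += 1
            counts.update(set(path))  # count each node once per walk
    if successful == 0:
        return {}
    return {name: hits / successful for name, hits in counts.items()}


# Hypothetical signaling graph: receptor -> intermediates -> regulated gene.
toy_graph = {
    "receptor": ["A", "B"],
    "A": ["hub"],
    "B": ["hub", "dead_end"],
    "hub": ["target_gene"],
    "dead_end": [],
}

freqs = traversal_frequencies(toy_graph, "receptor", {"target_gene"})
for name, freq in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {freq:.2f}")
```

    In this toy network every successful walk passes through "hub", so it receives the highest possible frequency, while "A" and "B" are traversed only on a subset of walks. Nodes that sit on most receptor-to-gene routes are the candidates for key regulators in this kind of analysis.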

  12. De Novo Assembly and Annotation of the Transcriptome of the Agricultural Weed Ipomoea purpurea Uncovers Gene Expression Changes Associated with Herbicide Resistance

    PubMed Central

    Leslie, Trent; Baucom, Regina S.

    2014-01-01

    Human-mediated selection can lead to rapid evolution in very short time scales, and the evolution of herbicide resistance in agricultural weeds is an excellent example of this phenomenon. The common morning glory, Ipomoea purpurea, is resistant to the herbicide glyphosate, but genetic investigations of this trait have been hampered by the lack of genomic resources for this species. Here, we present the annotated transcriptome of the common morning glory, Ipomoea purpurea, along with an examination of whole genome expression profiling to assess potential gene expression differences between three artificially selected herbicide resistant lines and three susceptible lines. The assembled Ipomoea transcriptome reported in this work contains 65,459 assembled transcripts, ~28,000 of which were functionally annotated by assignment to Gene Ontology categories. Our RNA-seq survey using this reference transcriptome identified 19 differentially expressed genes associated with resistance—one of which, a cytochrome P450, belongs to a large plant family of genes involved in xenobiotic detoxification. The differentially expressed genes also broadly implicated receptor-like kinases, which were down-regulated in the resistant lines, and other growth and defense genes, which were up-regulated in resistant lines. Interestingly, the target of glyphosate—EPSP synthase—was not overexpressed in the resistant Ipomoea lines as in other glyphosate resistant weeds. Overall, this work identifies potential candidate resistance loci for future investigations and dramatically increases genomic resources for this species. The assembled transcriptome presented herein will also provide a valuable resource to the Ipomoea community, as well as to those interested in utilizing the close relationship between the Convolvulaceae and the Solanaceae for phylogenetic and comparative genomics examinations. PMID:25155274

  13. De novo assembly and annotation of the transcriptome of the agricultural weed Ipomoea purpurea uncovers gene expression changes associated with herbicide resistance.

    PubMed

    Leslie, Trent; Baucom, Regina S

    2014-08-25

    Human-mediated selection can lead to rapid evolution in very short time scales, and the evolution of herbicide resistance in agricultural weeds is an excellent example of this phenomenon. The common morning glory, Ipomoea purpurea, is resistant to the herbicide glyphosate, but genetic investigations of this trait have been hampered by the lack of genomic resources for this species. Here, we present the annotated transcriptome of the common morning glory, Ipomoea purpurea, along with an examination of whole genome expression profiling to assess potential gene expression differences between three artificially selected herbicide resistant lines and three susceptible lines. The assembled Ipomoea transcriptome reported in this work contains 65,459 assembled transcripts, ~28,000 of which were functionally annotated by assignment to Gene Ontology categories. Our RNA-seq survey using this reference transcriptome identified 19 differentially expressed genes associated with resistance, one of which, a cytochrome P450, belongs to a large plant family of genes involved in xenobiotic detoxification. The differentially expressed genes also broadly implicated receptor-like kinases, which were down-regulated in the resistant lines, and other growth and defense genes, which were up-regulated in resistant lines. Interestingly, the target of glyphosate, EPSP synthase, was not overexpressed in the resistant Ipomoea lines as in other glyphosate resistant weeds. Overall, this work identifies potential candidate resistance loci for future investigations and dramatically increases genomic resources for this species. The assembled transcriptome presented herein will also provide a valuable resource to the Ipomoea community, as well as to those interested in utilizing the close relationship between the Convolvulaceae and the Solanaceae for phylogenetic and comparative genomics examinations. Copyright © 2014 Leslie and Baucom.

  14. Annotation of gene function in citrus using gene expression information and co-expression networks

    PubMed Central

    2014-01-01

    Background: The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results: We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit.
    Conclusions: Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provide opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for the gene co-expression analysis in citrus. PMID:25023870
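    The "guilt-by-association" principle and the guide-gene approach described above can be illustrated with a small correlation network. The sketch below is a generic example, not the NICCE pipeline: it assumes Pearson correlation as the co-expression measure and an arbitrary cutoff of 0.9, and the gene names and expression values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def coexpression_edges(expression, threshold=0.9):
    """Link every gene pair whose absolute correlation reaches the threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


def neighbors(edges, guide_gene):
    """Guide-gene query: genes directly connected to the gene of interest."""
    return sorted({g for pair in edges if guide_gene in pair for g in pair if g != guide_gene})


# Hypothetical expression values across five conditions.
expression = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10.5],
    "geneC": [5, 3, 4, 1, 2],
    "geneD": [1, 1, 2, 1, 1],
}

edges = coexpression_edges(expression)
print(edges)
print(neighbors(edges, "geneA"))
```

    Here only geneA and geneB are linked, so a known function of geneA would be the working hypothesis for geneB. Real networks are built from hundreds of arrays, and the choice of correlation measure and cutoff strongly affects which edges appear.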

  15. Comparison of the Transcriptomes of Ginger (Zingiber officinale Rosc.) and Mango Ginger (Curcuma amada Roxb.) in Response to the Bacterial Wilt Infection

    PubMed Central

    Prasath, Duraisamy; Karthika, Raveendran; Habeeba, Naduva Thadath; Suraby, Erinjery Jose; Rosana, Ottakandathil Babu; Shaji, Avaroth; Eapen, Santhosh Joseph; Deshpande, Uday; Anandaraj, Muthuswamy

    2014-01-01

    Bacterial wilt in ginger (Zingiber officinale Rosc.) caused by Ralstonia solanacearum is one of the most important production constraints in tropical, sub-tropical and warm temperate regions of the world. Lack of resistant genotype adds constraints to the crop management. However, mango ginger (Curcuma amada Roxb.), which is resistant to R. solanacearum, is a potential donor, if the exact mechanism of resistance is understood. To identify genes involved in resistance to R. solanacearum, we have sequenced the transcriptome from wilt-sensitive ginger and wilt-resistant mango ginger using Illumina sequencing technology. A total of 26387032 and 22268804 paired-end reads were obtained after quality filtering for C. amada and Z. officinale, respectively. A total of 36359 and 32312 assembled transcript sequences were obtained from both the species. The functions of the unigenes cover a diverse set of molecular functions and biological processes, among which we identified a large number of genes associated with resistance to stresses and response to biotic stimuli. Large scale expression profiling showed that many of the disease resistance related genes were expressed more in C. amada. Comparative analysis also identified genes belonging to different pathways of plant defense against biotic stresses that are differentially expressed in either ginger or mango ginger. The identification of many defense related genes differentially expressed provides many insights to the resistance mechanism to R. solanacearum and for studying potential pathways involved in responses to pathogen. Also, several candidate genes that may underlie the difference in resistance to R. solanacearum between ginger and mango ginger were identified. Finally, we have developed a web resource, ginger transcriptome database, which provides public access to the data.
Our study is among the first to demonstrate the use of Illumina short read sequencing for de novo transcriptome assembly and comparison in non-model species of Zingiberaceae. PMID:24940878

  16. Comparison of the transcriptomes of ginger (Zingiber officinale Rosc.) and mango ginger (Curcuma amada Roxb.) in response to the bacterial wilt infection.

    PubMed

    Prasath, Duraisamy; Karthika, Raveendran; Habeeba, Naduva Thadath; Suraby, Erinjery Jose; Rosana, Ottakandathil Babu; Shaji, Avaroth; Eapen, Santhosh Joseph; Deshpande, Uday; Anandaraj, Muthuswamy

    2014-01-01

    Bacterial wilt in ginger (Zingiber officinale Rosc.) caused by Ralstonia solanacearum is one of the most important production constraints in tropical, sub-tropical and warm temperate regions of the world. Lack of resistant genotype adds constraints to the crop management. However, mango ginger (Curcuma amada Roxb.), which is resistant to R. solanacearum, is a potential donor, if the exact mechanism of resistance is understood. To identify genes involved in resistance to R. solanacearum, we have sequenced the transcriptome from wilt-sensitive ginger and wilt-resistant mango ginger using Illumina sequencing technology. A total of 26387032 and 22268804 paired-end reads were obtained after quality filtering for C. amada and Z. officinale, respectively. A total of 36359 and 32312 assembled transcript sequences were obtained from both the species. The functions of the unigenes cover a diverse set of molecular functions and biological processes, among which we identified a large number of genes associated with resistance to stresses and response to biotic stimuli. Large scale expression profiling showed that many of the disease resistance related genes were expressed more in C. amada. Comparative analysis also identified genes belonging to different pathways of plant defense against biotic stresses that are differentially expressed in either ginger or mango ginger. The identification of many defense related genes differentially expressed provides many insights to the resistance mechanism to R. solanacearum and for studying potential pathways involved in responses to pathogen. Also, several candidate genes that may underlie the difference in resistance to R. solanacearum between ginger and mango ginger were identified. Finally, we have developed a web resource, ginger transcriptome database, which provides public access to the data.
Our study is among the first to demonstrate the use of Illumina short read sequencing for de novo transcriptome assembly and comparison in non-model species of Zingiberaceae.

  17. Zebrafish Whole-Adult-Organism Chemogenomics for Large-Scale Predictive and Discovery Chemical Biology

    PubMed Central

    Lam, Siew Hong; Mathavan, Sinnakarupan; Tong, Yan; Li, Haixia; Karuturi, R. Krishna Murthy; Wu, Yilian; Vega, Vinsensius B.; Liu, Edison T.; Gong, Zhiyuan

    2008-01-01

    The ability to perform large-scale, expression-based chemogenomics on whole adult organisms, as in invertebrate models (worm and fly), is highly desirable for a vertebrate model, but its feasibility and potential have not been demonstrated. We performed expression-based chemogenomics on the whole adult organism of a vertebrate model, the zebrafish, and demonstrated its potential for large-scale predictive and discovery chemical biology. Focusing on two classes of compounds with wide implications for human health, polycyclic (halogenated) aromatic hydrocarbons [P(H)AHs] and estrogenic compounds (ECs), we generated robust prediction models that can discriminate compounds of the same class from those of different classes in two large independent experiments. The robust expression signatures led to the identification of biomarkers for potent aryl hydrocarbon receptor (AHR) and estrogen receptor (ER) agonists, respectively, and were validated in multiple targeted tissues. Knowledge-based data mining of human homologs of zebrafish genes revealed highly conserved chemical-induced biological responses/effects, health risks, and novel biological insights associated with AHR and ER that could be extrapolated to humans. Thus, our study presents an effective, high-throughput strategy for capturing molecular snapshots of chemical-induced biological states of a whole adult vertebrate that provides information on biomarkers of effects, deregulated signaling pathways, possibly affected biological functions, perturbed physiological systems, and increased health risks. These findings place zebrafish in a strategic position to bridge the wide gap between cell-based and rodent models in chemogenomics research and applications, especially in preclinical drug discovery and toxicology. PMID:18618001

  18. Differential gene expression in human abdominal aortic aneurysm and aortic occlusive disease

    PubMed Central

    Moran, Corey S.; Schreurs, Charlotte; Lindeman, Jan H. N.; Walker, Philip J.; Nataatmadja, Maria; West, Malcolm; Holdt, Lesca M.; Hinterseher, Irene; Pilarsky, Christian; Golledge, Jonathan

    2015-01-01

    Abdominal aortic aneurysm (AAA) and aortic occlusive disease (AOD) represent common causes of morbidity and mortality in elderly populations, and were previously believed to have common aetiologies. The aim of this study was to assess gene expression in human AAA and AOD. We performed microarrays using aortic specimens obtained from 20 patients with small AAAs (≤ 55 mm), 29 patients with large AAAs (> 55 mm), 9 AOD patients, and 10 control aortic specimens obtained from organ donors. Some differentially expressed genes were validated by quantitative PCR (qRT-PCR)/immunohistochemistry. We identified 840 and 1,014 differentially expressed genes in small and large AAAs, respectively. Immune-related pathways including cytokine-cytokine receptor interaction and T-cell-receptor signalling were upregulated in both small and large AAAs. Examples of validated genes included CTLA4 (2.01-fold upregulated in small AAA, P = 0.002), NKTR (2.37- and 2.66-fold upregulated in small and large AAA with P = 0.041 and P = 0.015, respectively), and CD8A (2.57-fold upregulated in large AAA, P = 0.004). A total of 1,765 differentially expressed genes were identified in AOD. Pathways upregulated in AOD included metabolic and oxidative phosphorylation categories. The UCP2 gene was downregulated in AOD (3.73-fold downregulated, validated P = 0.017). In conclusion, the AAA and AOD transcriptomes were very different, suggesting that AAA and AOD have distinct pathogenic mechanisms. PMID:25944698

  19. Physiologically Shrinking the Solution Space of a Saccharomyces cerevisiae Genome-Scale Model Suggests the Role of the Metabolic Network in Shaping Gene Expression Noise.

    PubMed

    Chi, Baofang; Tao, Shiheng; Liu, Yanlin

    2015-01-01

    Sampling the solution space of genome-scale models is generally conducted to determine the feasible region for metabolic flux distribution. Because the region for actual metabolic states occupies only a small fraction of the entire space, it is necessary to shrink the solution space to improve the predictive power of a model. A common strategy is to constrain models by integrating extra datasets such as high-throughput datasets and 13C-labeled flux datasets. However, studies refining these approaches by performing a meta-analysis of massive experimental metabolic flux measurements, which are closely linked to cellular phenotypes, are limited. In the present study, experimentally identified metabolic flux data from 96 published reports were systematically reviewed. Several strong associations among metabolic flux phenotypes were observed. These phenotype-phenotype associations at the flux level were quantified and integrated into a Saccharomyces cerevisiae genome-scale model as extra physiological constraints. By sampling the shrunken solution space of the model, the metabolic flux fluctuation level, which is an intrinsic trait of metabolic reactions determined by the network, was estimated and used to explore its relationship to gene expression noise. Although no correlation was observed across all enzyme-coding genes, a relationship between metabolic flux fluctuation and expression noise of genes associated with enzyme-dosage-sensitive reactions was detected, suggesting that the metabolic network plays a role in shaping gene expression noise. This correlation was mainly attributed to the genes corresponding to non-essential reactions, rather than essential ones. This was due, at least partially, to regulation underlying the flux phenotype-phenotype associations. Altogether, this study proposes a new approach to shrinking the solution space of a genome-scale model, sampling of which provides new insights into gene expression noise.
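
    The idea of shrinking a solution space with an extra physiological constraint can be illustrated on a toy network. The sketch below is not the sampling procedure or the model used in the study: the three-reaction network, the flux bounds, the association v3 <= 0.5 * v2, and the function names are all hypothetical, and simple rejection sampling stands in for whatever sampler the authors used. It only shows that adding a flux-flux association reduces both the accepted region and the spread of a sampled flux.

```python
import random
import statistics


def sample_fluxes(n_draws, extra_constraint=None, seed=0):
    """Rejection-sample steady-state fluxes of a hypothetical toy network.

    Toy network (an assumption for illustration only): an uptake flux v1
    is split into two branches v2 and v3, so steady state requires
    v1 = v2 + v3, and every flux is bounded to the interval [0, 10].
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        v2 = rng.uniform(0.0, 10.0)
        v3 = rng.uniform(0.0, 10.0)
        v1 = v2 + v3  # mass balance fixes v1 once v2 and v3 are drawn
        if v1 > 10.0:
            continue  # violates the upper bound on the uptake flux
        if extra_constraint is not None and not extra_constraint(v1, v2, v3):
            continue  # violates the added physiological constraint
        accepted.append((v1, v2, v3))
    return accepted


def flux_spread(samples, index):
    """Population standard deviation of one flux across the samples."""
    return statistics.pstdev(s[index] for s in samples)


if __name__ == "__main__":
    base = sample_fluxes(20000)
    # Hypothetical flux-flux association used as an extra constraint.
    shrunk = sample_fluxes(20000, lambda v1, v2, v3: v3 <= 0.5 * v2)
    print("accepted without constraint:", len(base))
    print("accepted with constraint:   ", len(shrunk))
    print("spread of v3 without / with:",
          flux_spread(base, 2), flux_spread(shrunk, 2))
```

    Because both runs use the same seed, the constrained sample is a subset of the unconstrained one, which makes the shrinkage easy to see.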

  20. Comparative transcriptome analysis of lufenuron-resistant and susceptible strains of Spodoptera frugiperda (Lepidoptera: Noctuidae).

    PubMed

    do Nascimento, Antonio Rogério Bezerra; Fresia, Pablo; Cônsoli, Fernando Luis; Omoto, Celso

    2015-11-21

    The evolution of insecticide resistance in Spodoptera frugiperda (Lepidoptera: Noctuidae) has resulted in large economic losses and disturbances to the environment and agroecosystems. Resistance to lufenuron, a chitin biosynthesis inhibitor insecticide, was recently documented in Brazilian populations of S. frugiperda. Thus, we utilized large-scale cDNA sequencing (RNA-Seq analysis) to compare the pattern of gene expression between lufenuron-resistant (LUF-R) and susceptible (LUF-S) S. frugiperda larvae in an attempt to identify the molecular basis of the resistance mechanism(s) of S. frugiperda to this insecticide. A transcriptome was assembled using approximately 19.6 million 100 bp-long single-end reads, which generated 18,506 transcripts with an N50 of 996 bp. A search against the NCBI non-redundant database generated 51.1% (9,457) functionally annotated transcripts. A large portion of the alignments were homologous to insects, with the majority (45%) being similar to sequences of Bombyx mori (Lepidoptera: Bombycidae). Moreover, 10% of the alignments were similar to sequences of various species of Spodoptera (Lepidoptera: Noctuidae), with 3% of them being similar to sequences of S. frugiperda. A comparative analysis of gene expression between LUF-R and LUF-S S. frugiperda larvae identified 940 differentially expressed transcripts (p ≤ 0.05, t-test; fold change ≥ 4). Six of them were associated with cuticle metabolism. Of those, four were overexpressed in LUF-R larvae. The machinery involved in the detoxification process was represented by 35 differentially expressed transcripts: 24 of them belonging to P450 monooxygenases, four to glutathione-S-transferases, six to carboxylases and one to sulfotransferases. RNA-Seq analysis was validated for a number of selected candidate transcripts by using quantitative real-time PCR (qPCR). The gene expression profile of LUF-R larvae of S. frugiperda differs from that of LUF-S larvae. In general, gene expression is much higher in resistant larvae than in susceptible ones, particularly for genes involved in pathways for xenobiotic detoxification, mainly represented by P450 monooxygenase transcripts. Our data indicate that enzymes involved in the detoxification process, mostly P450s, constitute one of the resistance mechanisms employed by LUF-R S. frugiperda larvae against lufenuron.
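
    Two quantities in this abstract, the assembly N50 and the differential-expression cutoff (p ≤ 0.05, fold change ≥ 4), can be made concrete with a minimal sketch. The function names are hypothetical and this is not the authors' pipeline. In particular, treating a fold change of 1/4 or less as equivalent to a fold change of 4 or more is an assumption; the abstract does not say how down-regulated transcripts were handled.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def is_differential(p_value, fold_change, p_max=0.05, fc_min=4.0):
    """Apply a p-value and fold-change cutoff to one transcript.

    fold_change is an expression ratio (for example resistant over
    susceptible) and must be positive. Assumption: a ratio of 1/fc_min
    or lower counts the same as a ratio of fc_min or higher.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    magnitude = max(fold_change, 1.0 / fold_change)
    return p_value <= p_max and magnitude >= fc_min


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # made-up lengths, not study data
    print("toy N50:", n50(toy_lengths))
    print("p=0.01, FC=4.0 passes:", is_differential(0.01, 4.0))
    print("p=0.20, FC=8.0 passes:", is_differential(0.20, 8.0))
```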

  1. Differential gene expression in small and large rainbow trout derived from two seasonal spawning groups

    PubMed Central

    2014-01-01

    Background: Growth in fishes is regulated via many environmental and physiological factors and is shaped by the genetic background of each individual. Previous microarray studies of salmonid growth have examined fish experiencing either muscle wastage or accelerated growth patterns following refeeding, or the influence of growth hormone and transgenesis. This study determines the gene expression profiles of genetically unmanipulated large and small fish from a domesticated salmonid strain reared on a typical feeding regime. Gene expression profiles of white muscle and liver from rainbow trout (Oncorhynchus mykiss) from two seasonal spawning groups (September and December lots) within a single strain were examined when the fish were 15 months of age to assess the influence of season (late fall vs. onset of spring) and body size (large vs. small). Results: Although IGFBP1 gene expression was up-regulated in the livers of small fish in both seasonal lots, few expression differences were detected in the liver overall. Faster-growing Dec. fish showed a greater number of differences in white muscle expression compared to Sept. fish. Significant differences in the GO Generic Level 3 categories ‘response to external stimulus’, ‘establishment of localization’, and ‘response to stress’ were detected in white muscle tissue between large and small fish. Larger fish showed up-regulation of cytoskeletal component genes, while many genes related to myofibril components of muscle tissue were up-regulated in small fish. Most of the genes up-regulated in large fish within the ‘response to stress’ category are involved in immunity, while in small fish most of these gene functions are related to apoptosis. Conclusions: A higher proportion of genes in white muscle compared to liver showed similar patterns of up- or down-regulation within the same size class across seasons, supporting their utility as biomarkers for growth in rainbow trout. Differences between large and small Sept. fish in the ‘response to stress’ and ‘response to external stimulus’ categories for white muscle tissue suggest that smaller fish are less able to handle stress than large fish. Sampling season had a significant impact on the expression of genes related to the growth process in rainbow trout. PMID:24450799

  2. Cloning and characterization of a 9-lipoxygenase gene induced by pathogen attack from Nicotiana benthamiana for biotechnological application

    PubMed Central

    2011-01-01

    Background: Plant lipoxygenases (LOXs) have been proposed to form biologically active compounds both during normal developmental stages such as germination or growth and during responses to environmental stress such as wounding or pathogen attack. In our previous study, we found that enzyme activity of endogenous 9-LOX in Nicotiana benthamiana was highly induced by agroinfiltration using a tobacco mosaic virus (TMV) based vector system. Results: A LOX gene that is expressed after treatment with the viral vectors was isolated from Nicotiana benthamiana. As the encoded LOX has a high amino acid identity to other 9-LOX proteins, the gene was named Nb-9-LOX. It was heterologously expressed in yeast cells and its enzymatic activity was characterized. The yeast cells expressed large quantities of stable 9-LOX (0.9 U ml⁻¹ cell cultures), which can oxygenate linoleic acid, resulting in high yields (18 μmol ml⁻¹ cell cultures) of hydroperoxy fatty acid. The product specificity of Nb-9-LOX was examined by incubation of linoleic acid and Nb-9-LOX in combination with a 13-hydroperoxide lyase from watermelon (Cl-13-HPL) or a 9/13-hydroperoxide lyase from melon (Cm-9/13-HPL) and by LC-MS analysis. The result showed that Nb-9-LOX possesses both 9- and 13-LOX specificity, with high predominance for the 9-LOX function. The combination of recombinant Nb-9-LOX and recombinant Cm-9/13-HPL produced large amounts of C9-aldehydes (3.3 μmol mg⁻¹ crude protein). The yield of C9-aldehydes from linoleic acid was 64%. Conclusion: The yeast-expressed Nb-9-LOX can be used to produce C9-aldehydes on a large scale in combination with an HPL gene with 9-HPL function, or to effectively produce 9-hydroxy-10(E),12(Z)-octadecadienoic acid in a biocatalytic process in combination with cysteine as a mild reducing agent. PMID:21450085

  3. Gene Expression in Parp1 Deficient Mice Exposed to a Median Lethal Dose of Gamma Rays.

    PubMed

    Kumar, M A Suresh; Laiakis, Evagelia C; Ghandhi, Shanaz A; Morton, Shad R; Fornace, Albert J; Amundson, Sally A

    2018-05-10

    There is current interest in the development of biodosimetric methods for rapidly assessing radiation exposure in the wake of a large-scale radiological event. This work was initially focused on determining the exposure dose to an individual using biological indicators. Gene expression signatures show promise for biodosimetric application, but little is known about how these signatures might translate for the assessment of radiological injury in radiosensitive individuals, who comprise a significant proportion of the general population, and who would likely require treatment after exposure to lower doses. Using Parp1-/- mice as a model radiation-sensitive genotype, we have investigated the effect of this DNA repair deficiency on the gene expression response to radiation. Although Parp1 is known to play general roles in regulating transcription, the pattern of gene expression changes observed in Parp1-/- mice 24 h postirradiation to an LD50/30 was remarkably similar to that in wild-type mice after exposure to an LD50/30. Similar levels of activation of both the p53 and NFκB radiation response pathways were indicated in both strains. In contrast, exposure of wild-type mice to a sublethal dose that was equal to the Parp1-/- LD50/30 resulted in a lower-magnitude gene expression response. Thus, Parp1-/- mice displayed a heightened gene expression response to radiation, which was more similar to the wild-type response to an equitoxic dose than to an equal absorbed dose. Gene expression classifiers trained on the wild-type data correctly identified all wild-type samples as unexposed, exposed to a sublethal dose or exposed to an LD50/30. All unexposed samples from Parp1-/- mice were also correctly classified with the same gene set, and 80% of irradiated Parp1-/- samples were identified as exposed to an LD50/30. The results of this study suggest that, at least for some pathways that may influence radiosensitivity in humans, specific gene expression signatures have the potential to accurately detect the extent of radiological injury, rather than serving only as a surrogate of physical radiation dose.

  4. A biomarker-based screen of a gene expression compendium reveals regulation of Nrf2 by CAR and STAT5b

    EPA Science Inventory

    Computational approaches were developed to identify factors that regulate Nrf2 in a large gene expression compendium of microarray profiles including >2000 comparisons which queried the effects of chemicals, genes, diets, and infectious agents on gene expression in the mouse l...

  5. Diffusion and scaling during early embryonic pattern formation.

    PubMed

    Gregor, Thomas; Bialek, William; de Ruyter van Steveninck, Rob R; Tank, David W; Wieschaus, Eric F

    2005-12-20

    Development of spatial patterns in multicellular organisms depends on gradients in the concentration of signaling molecules that control gene expression. In the Drosophila embryo, Bicoid (Bcd) morphogen controls cell fate along 70% of the anteroposterior axis but is translated from mRNA localized at the anterior pole. Gradients of Bcd and other morphogens are thought to arise through diffusion, but this basic assumption has never been rigorously tested in living embryos. Furthermore, because diffusion sets a relationship between length and time scales, it is hard to see how patterns of gene expression established by diffusion would scale proportionately as egg size changes during evolution. Here, we show that the motion of inert molecules through the embryo is well described by the diffusion equation on the relevant length and time scales, and that effective diffusion constants are essentially the same in closely related dipteran species with embryos of very different size. Nonetheless, patterns of gene expression in these different species scale with egg length. We show that this scaling can be traced back to scaling of the Bcd gradient itself. Our results, together with constraints imposed by the time scales of development, suggest that the mechanism for scaling is a species-specific adaptation of the Bcd lifetime.
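
    The link between length and time scales mentioned in this abstract can be sketched with the textbook synthesis-diffusion-degradation picture, in which a steady-state gradient decays exponentially with length constant λ = sqrt(D·τ). That model is an assumption brought in here for illustration; the abstract itself only states that diffusion ties length to time and that the Bcd lifetime appears to adapt between species. The function names and all numbers are hypothetical placeholders, not measurements from the study.

```python
import math


def length_constant(diffusion_um2_per_s, lifetime_s):
    """Length constant (um) of an exponential steady-state gradient.

    Assumes the simple model lambda = sqrt(D * tau), where D is the
    diffusion constant and tau the lifetime of the molecule.
    """
    return math.sqrt(diffusion_um2_per_s * lifetime_s)


def lifetime_for_scaling(diffusion_um2_per_s, egg_length_um, fraction):
    """Lifetime (s) needed so that lambda equals fraction * egg length."""
    target_lambda = fraction * egg_length_um
    return target_lambda ** 2 / diffusion_um2_per_s


if __name__ == "__main__":
    d = 1.0  # um^2/s, placeholder value
    for egg_length in (500.0, 1000.0):  # um, placeholder egg lengths
        tau = lifetime_for_scaling(d, egg_length, fraction=0.2)
        print(f"egg {egg_length:.0f} um -> lifetime {tau:.0f} s, "
              f"lambda {length_constant(d, tau):.0f} um")
```

    Under these assumptions, keeping D fixed while the gradient scales with egg length requires the lifetime to grow with the square of the egg length, which is one way to read the proposed species-specific adaptation of the Bcd lifetime.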

  6. Potential large scale production of meningococcal vaccines by stable overexpression of fHbp in the rice seeds.

    PubMed

    Ma, Jian; Wang, Yunpeng; Xu, Nuo; Jin, Libo; Liu, Jia; Xing, Shaochen; Li, Xiaokun

    2018-06-25

    Factor H binding protein (fHbp) is the most promising vaccine candidate against serogroup B of Neisseria meningitidis, which is a major cause of morbidity and mortality in children. In order to facilitate large-scale production of a commercial vaccine, we previously used transgenic Arabidopsis thaliana, but plant-derived fHbp is still far from a commercial vaccine owing to low biomass production. Herein, we present an alternative route for the production of recombinant fHbp from the seeds of transgenic rice. The OsrfHbp gene encoding a recombinant fHbp fusion protein was introduced into the genome of rice via Agrobacterium-mediated transformation. Stable integration and transcription of the foreign OsrfHbp gene were confirmed by Southern blotting and RT-PCR analysis, respectively. Further, the expression of fHbp protein was measured by immunoblotting analysis and quantified by ELISA. The results indicated that fHbp was successfully expressed and that the highest yield of fHbp was 0.52 ± 0.03% of TSP in the transgenic rice seeds. The purified fHbp protein showed good antigenicity and immunogenicity in the animal model. The results of this experiment offer a novel approach for large-scale production of the plant-derived commercial vaccine fHbp.

  7. Transcriptomic Analysis Reveals Mechanisms of Sterile and Fertile Flower Differentiation and Development in Viburnum macrocephalum f. keteleeri

    PubMed Central

    Lu, Zhaogeng; Xu, Jing; Li, Weixing; Zhang, Li; Cui, Jiawen; He, Qingsong; Wang, Li; Jin, Biao

    2017-01-01

    Sterile and fertile flowers are an important evolutionary developmental (evo-devo) phenotype in angiosperm flowers, playing important roles in pollinator attraction and sexual reproductive success. However, the gene regulatory mechanisms underlying fertile and sterile flower differentiation and development remain largely unknown. Viburnum macrocephalum f. keteleeri, which possesses fertile and sterile flowers in a single inflorescence, is a useful candidate species for investigating the regulatory networks in differentiation and development. We developed a de novo-assembled flower reference transcriptome. Using RNA sequencing (RNA-seq), we compared the expression patterns of fertile and sterile flowers isolated from the same inflorescence over its rapid developmental stages. The flower reference transcriptome consisted of 105,683 non-redundant transcripts, of which 5,675 transcripts showed significant differential expression between fertile and sterile flowers. Combined with morphological and cytological changes between fertile and sterile flowers, we identified expression changes of many genes potentially involved in reproductive processes, phytohormone signaling, and cell proliferation and expansion using RNA-seq and qRT-PCR. In particular, many transcription factors (TFs), including MADS-box family members and ABCDE-class genes, were identified, and expression changes in TFs involved in multiple functions were analyzed and highlighted to determine their roles in regulating fertile and sterile flower differentiation and development. Our large-scale transcriptional analysis of fertile and sterile flowers revealed the dynamics of transcriptional networks and potentially key components in regulating differentiation and development of fertile and sterile flowers in Viburnum macrocephalum f. keteleeri. 
    Our data provide a useful resource for Viburnum transcriptional research and offer insights into gene regulation of differentiation of diverse evo-devo processes in flowers. PMID:28298915

  8. Sex-Biased Gene Expression and Sexual Conflict throughout Development

    PubMed Central

    Ingleby, Fiona C.; Flis, Ilona; Morrow, Edward H.

    2015-01-01

    Sex-biased gene expression is likely to account for most sexually dimorphic traits because males and females share much of their genome. When fitness optima differ between sexes for a shared trait, sexual dimorphism can allow each sex to express their optimum trait phenotype, and in this way, the evolution of sex-biased gene expression is one mechanism that could help to resolve intralocus sexual conflict. Genome-wide patterns of sex-biased gene expression have been identified in a number of studies, which we review here. However, very little is known about how sex-biased gene expression relates to sex-specific fitness and about how sex-biased gene expression and conflict vary throughout development or across different genotypes, populations, and environments. We discuss the importance of these neglected areas of research and use data from a small-scale experiment on sex-specific expression of genes throughout development to highlight potentially interesting avenues for future research. PMID:25376837

  9. The rapid evolution of X-linked male-biased gene expression and the large-X effect in Drosophila yakuba, D. santomea, and their hybrids.

    PubMed

    Llopart, Ana

    2012-12-01

    The X chromosome has a large effect on hybrid dysfunction, particularly on hybrid male sterility. Although the evidence for this so-called large-X effect is clear, its molecular causes are not yet fully understood. One possibility is that, under certain conditions, evolution proceeds faster in X-linked than in autosomal loci (i.e., faster-X effect) due to both natural selection and their hemizygosity in males, an effect that is expected to be greatest in genes with male-biased expression. Here, I study genome-wide variation in transcript abundance between Drosophila yakuba and D. santomea, within these species and in their hybrid males to evaluate both the faster-X and large-X effects at the level of expression. I find that in X-linked male-biased genes (MBGs) expression evolves faster than in their autosomal counterparts, an effect that is accompanied by a unique reduction in expression polymorphism. This suggests that Darwinian selection is driving expression differences between species, likely enhanced by the hemizygosity of the X chromosome in males. Despite the recent split of the two sister species under study, abundant changes in both cis- and trans-regulatory elements underlie expression divergence in the majority of the genes analyzed, with significant differences in allelic ratios of transcript abundance between the two reciprocal F(1) hybrid males. Cis-trans coevolution at molecular level, evolved shortly after populations become isolated, may therefore contribute to explain the breakdown of the regulation of gene expression in hybrid males. Additionally, the X chromosome plays a large role in this hybrid male misexpression, which affects not only MBG but also, to a lesser degree, nonsex-biased genes. Interestingly, hybrid male misexpression is concentrated mostly in autosomal genes, likely facilitated by the rapid evolution of sex-linked trans-acting factors. 
    I suggest that the faster evolution of X-linked MBGs, at both protein and expression levels, contributes to explain the large effect of the X chromosome on hybrid male sterility, likely mediating widespread autosomal misexpression through the preferential recognition of cis-regulatory elements by conspecific trans-acting factors (i.e., cis-trans conspecific recognition).

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kolker, Eugene

    Our project focused primarily on analysis of different types of data produced by global high-throughput technologies, data integration of gene annotation, and gene and protein expression information, as well as on getting a better functional annotation of Shewanella genes. Specifically, four of our numerous major activities and achievements include the development of: statistical models for identification and expression proteomics, superior to currently available approaches (including our own earlier ones); approaches to improve gene annotations on the whole-organism scale; standards for annotation, transcriptomics and proteomics approaches; and generalized approaches for data integration of gene annotation, gene and protein expression information.

  11. Expression of the Arabidopsis thaliana BBX32 gene in soybean increases grain yield.

    PubMed

    Preuss, Sasha B; Meister, Robert; Xu, Qingzhang; Urwin, Carl P; Tripodi, Federico A; Screen, Steven E; Anil, Veena S; Zhu, Shuquan; Morrell, James A; Liu, Grace; Ratcliffe, Oliver J; Reuber, T Lynne; Khanna, Rajnish; Goldman, Barry S; Bell, Erin; Ziegler, Todd E; McClerren, Amanda L; Ruff, Thomas G; Petracek, Marie E

    2012-01-01

    Crop yield is a highly complex quantitative trait. Historically, successful breeding for improved grain yield has led to crop plants with improved source capacity, altered plant architecture, and increased resistance to abiotic and biotic stresses. To date, transgenic approaches towards improving crop grain yield have primarily focused on protecting plants from herbicides, insects, or disease. In contrast, we have focused on identifying genes that, when expressed in soybean, improve the intrinsic ability of the plant to yield more. Through the large-scale screening of candidate genes in transgenic soybean, we identified an Arabidopsis thaliana B-box domain gene (AtBBX32) that significantly increases soybean grain yield year after year in multiple transgenic events in multi-location field trials. In order to understand the underlying physiological changes that are associated with increased yield in transgenic soybean, we examined phenotypic differences in two AtBBX32-expressing lines and found increases in plant height and node, flower, pod, and seed number. We propose that these phenotypic changes are likely the result of changes in the timing of reproductive development in transgenic soybean that lead to an increased duration of the pod and seed development period. Consistent with the role of BBX32 in A. thaliana in regulating light signaling, we show that the constitutive expression of AtBBX32 in soybean alters the abundance of a subset of gene transcripts in the early morning hours. In particular, AtBBX32 alters transcript levels of the soybean clock genes GmTOC1 and LHY-CCA1-like2 (GmLCL2). We propose that through the expression of AtBBX32 and modulation of the abundance of circadian clock genes during the transition from dark to light, the timing of critical phases of reproductive development is altered. These findings demonstrate a specific role for AtBBX32 in modulating soybean development, and demonstrate the validity of expressing single genes in crops to deliver increased agricultural productivity.

  12. Elucidation of the effect of brain cortex tetrapeptide Cortagen on gene expression in mouse heart by microarray.

    PubMed

    Anisimov, Sergey V; Khavinson, Vladimir Kh; Anisimov, Vladimir N

    2004-01-01

    Aging is associated with significant alterations in gene expression in numerous organs and tissues. Anti-aging therapy with peptide bioregulators holds much promise for the correction of age-associated changes, making screening for their molecular targets in tissues an important question of modern gerontology. The synthetic tetrapeptide Cortagen (Ala-Glu-Asp-Pro) was obtained by directed synthesis based on amino acid analysis of the natural brain cortex peptide preparation Cortexin. In humans, Cortagen demonstrated a pronounced therapeutic effect upon the structural and functional posttraumatic recovery of peripheral nerve tissue. Importantly, other effects were also observed in cardiovascular and cerebrovascular parameters. Based on these latter observations, we hypothesized that an acute course of Cortagen treatment, large-scale transcriptome analysis, and identification of transcripts with altered expression in heart would facilitate our understanding of the mechanisms responsible for this peptide's biological effects. We therefore used cDNA microarrays to analyze the expression of 15,247 transcripts in the hearts of 6-month-old female CBA mice receiving injections of Cortagen for 5 consecutive days. Comparative analysis of cDNA microarray hybridisation with heart samples from the control and experimental groups revealed 234 clones (1.53% of the total number of clones) with significant changes of expression that matched 110 known genes belonging to various functional categories. Maximum up- and down-regulation was +5.42 and -2.86, respectively. Intercomparison of changes in cardiac expression profile induced by synthetic peptides (Cortagen, Vilon, Epitalon) and the pineal peptide hormone melatonin revealed both common and specific effects of Cortagen upon gene expression in heart.

  13. Extraordinary diversity of visual opsin genes in dragonflies

    PubMed Central

    Futahashi, Ryo; Kawahara-Miki, Ryouka; Kinoshita, Michiyo; Yoshitake, Kazutoshi; Yajima, Shunsuke; Arikawa, Kentaro; Fukatsu, Takema

    2015-01-01

    Dragonflies are colorful and large-eyed animals strongly dependent on color vision. Here we report an extraordinarily large number of opsin genes in dragonflies and their characteristic spatiotemporal expression patterns. Exhaustive transcriptomic and genomic surveys of three dragonflies of the family Libellulidae consistently identified 20 opsin genes, consisting of 4 nonvisual opsin genes and 16 visual opsin genes of 1 UV, 5 short-wavelength (SW), and 10 long-wavelength (LW) type. A comprehensive transcriptomic survey of other dragonflies representing an additional 10 families also identified as many as 15–33 opsin genes. Molecular phylogenetic analysis revealed dynamic multiplications and losses of the opsin genes in the course of evolution. In contrast to the many SW and LW genes expressed in adults, only one SW gene and several LW genes were expressed in larvae, reflecting less visual dependence and LW-skewed light conditions for their lifestyle under water. In this context, notably, the sand-burrowing or pit-dwelling species tended to lack SW gene expression in larvae. In adult visual organs: (i) many SW genes and a few LW genes were expressed in the dorsal region of compound eyes, presumably for processing SW-skewed light from the sky; (ii) a few SW genes and many LW genes were expressed in the ventral region of compound eyes, probably for perceiving terrestrial objects; and (iii) expression of a specific LW gene was associated with ocelli. Our findings suggest that the stage- and region-specific expression of the diverse opsin genes underlies the behavior, ecology, and adaptation of dragonflies. PMID:25713365

  14. Developmental and Environmental Regulation of Aquaporin Gene Expression across Populus Species: Divergence or Redundancy?

    PubMed Central

    Cohen, David; Bogeat-Triboulot, Marie-Béatrice; Vialet-Chabrand, Silvère; Merret, Rémy; Courty, Pierre-Emmanuel; Moretti, Sébastien; Bizet, François; Guilliot, Agnès; Hummel, Irène

    2013-01-01

    Aquaporins (AQPs) are membrane channels belonging to the major intrinsic proteins family and are known for their ability to facilitate water movement. While in Populus trichocarpa AQP proteins form a large family encompassing fifty-five genes, most of the experimental work has focused on a few genes or subfamilies. The current work was undertaken to develop a comprehensive picture of the whole AQP gene family in Populus species by delineating gene expression domain and distinguishing responsiveness to developmental and environmental cues. Since duplication events amplified the poplar AQP family, we addressed the question of expression redundancy between gene duplicates. To this end, we carried out a meta-analysis of all publicly available Affymetrix experiments. Our in-silico strategy controlled for previously identified biases in cross-species transcriptomics, a necessary step for any comparative transcriptomics based on multispecies design chips. Three poplar AQPs were not supported by any expression data, even in a large collection of situations (abiotic and biotic constraints, temporal oscillations and mutants). The expression of 11 AQPs was never or poorly regulated, whatever the wideness of their expression domain and their expression level. Our work highlighted that PtTIP1;4 was the most responsive gene of the AQP family. A high functional divergence between gene duplicates was detected across species and in response to tested cues, except for the root-expressed PtTIP2;3/PtTIP2;4 pair exhibiting 80% convergent responses. Our meta-analysis assessed key features of aquaporin expression which had remained hidden in single experiments, such as expression wideness, response specificity, and genotype and environment interactions. By consolidating expression profiles using independent experimental series, we showed that the large expansion of the AQP family in poplar was accompanied by a strong divergence of gene expression, even if some cases of functional redundancy could be suspected. PMID:23393587

  15. Developmental and environmental regulation of Aquaporin gene expression across Populus species: divergence or redundancy?

    PubMed

    Cohen, David; Bogeat-Triboulot, Marie-Béatrice; Vialet-Chabrand, Silvère; Merret, Rémy; Courty, Pierre-Emmanuel; Moretti, Sébastien; Bizet, François; Guilliot, Agnès; Hummel, Irène

    2013-01-01

    Aquaporins (AQPs) are membrane channels belonging to the major intrinsic proteins family and are known for their ability to facilitate water movement. While in Populus trichocarpa AQP proteins form a large family encompassing fifty-five genes, most of the experimental work has focused on a few genes or subfamilies. The current work was undertaken to develop a comprehensive picture of the whole AQP gene family in Populus species by delineating gene expression domain and distinguishing responsiveness to developmental and environmental cues. Since duplication events amplified the poplar AQP family, we addressed the question of expression redundancy between gene duplicates. To this end, we carried out a meta-analysis of all publicly available Affymetrix experiments. Our in-silico strategy controlled for previously identified biases in cross-species transcriptomics, a necessary step for any comparative transcriptomics based on multispecies design chips. Three poplar AQPs were not supported by any expression data, even in a large collection of situations (abiotic and biotic constraints, temporal oscillations and mutants). The expression of 11 AQPs was never or poorly regulated, whatever the wideness of their expression domain and their expression level. Our work highlighted that PtTIP1;4 was the most responsive gene of the AQP family. A high functional divergence between gene duplicates was detected across species and in response to tested cues, except for the root-expressed PtTIP2;3/PtTIP2;4 pair exhibiting 80% convergent responses. Our meta-analysis assessed key features of aquaporin expression which had remained hidden in single experiments, such as expression wideness, response specificity, and genotype and environment interactions. By consolidating expression profiles using independent experimental series, we showed that the large expansion of the AQP family in poplar was accompanied by a strong divergence of gene expression, even if some cases of functional redundancy could be suspected.

  16. Comparison of growth-related traits and gene expression profiles between the offspring of neomale (XX) and normal male (XY) rainbow trout.

    PubMed

    Kocmarek, Andrea L; Ferguson, Moira M; Danzmann, Roy G

    2015-04-01

    All-female lines of fish are created by crossing sex-reversed (XX genotype) males with normal females. All-female lines avoid the deleterious phenotypic effects that are typical of precocious maturation in males. To determine whether all-female and mixed-sex populations of rainbow trout (Oncorhynchus mykiss) differ in performance, we compared the growth and gene expression profiles in progeny groups produced by crossing an XX male and an XY male to the same five females. Body weight and length were measured in the resulting all-female (XX) and mixed-sex (XX/XY) offspring groups. Microarray experiments with liver and white muscle were used to determine if the gene expression profiles of large and small XX offspring differ from those in large and small XX/XY offspring. We detected no significant differences in body length and weight between offspring groups, but XX offspring were significantly less variable in the value of these traits. A large number of upregulated genes were shared between the large XX and large XX/XY offspring; the small XX and small XX/XY offspring also shared similar expression profiles. No GO category differences were seen in the liver or between the large XX and large XX/XY offspring in the muscle. The greatest differences between the small XX and small XX/XY offspring were in the genes assigned to the "small molecule metabolic process" and "cellular metabolic process" GO level 3 categories. Similarly, genes within these categories as well as the category "macromolecule metabolic process" were more highly expressed in small compared to large XX fish.

  17. Comparative methods for the analysis of gene-expression evolution: an example using yeast functional genomic data.

    PubMed

    Oakley, Todd H; Gu, Zhenglong; Abouheif, Ehab; Patel, Nipam H; Li, Wen-Hsiung

    2005-01-01

    Understanding the evolution of gene function is a primary challenge of modern evolutionary biology. Despite an expanding database from genomic and developmental studies, we are lacking quantitative methods for analyzing the evolution of some important measures of gene function, such as gene-expression patterns. Here, we introduce phylogenetic comparative methods to compare different models of gene-expression evolution in a maximum-likelihood framework. We find that expression of duplicated genes has evolved according to a nonphylogenetic model, where closely related genes are no more likely than more distantly related genes to share common expression patterns. These results are consistent with previous studies that found rapid evolution of gene expression during the history of yeast. The comparative methods presented here are general enough to test a wide range of evolutionary hypotheses using genomic-scale data from any organism.

  18. Genomewide transcriptional reprogramming in the seagrass Cymodocea nodosa under experimental ocean acidification.

    PubMed

    Ruocco, Miriam; Musacchia, Francesco; Olivé, Irene; Costa, Monya M; Barrote, Isabel; Santos, Rui; Sanges, Remo; Procaccini, Gabriele; Silva, João

    2017-08-01

    Here, we report the first use of massive-scale RNA-sequencing to explore seagrass response to CO2-driven ocean acidification (OA). Large-scale gene expression changes in the seagrass Cymodocea nodosa occurred at CO2 levels projected by the end of the century. The C. nodosa transcriptome was obtained using Illumina RNA-Seq technology and de novo assembly, and differential gene expression was explored in plants exposed to short-term high CO2/low pH conditions. At high pCO2, there was significantly increased expression of transcripts associated with photosynthesis, including light reaction functions and CO2 fixation, and also with respiratory pathways, specifically for enzymes involved in glycolysis, in the tricarboxylic acid cycle and in the energy metabolism of the mitochondrial electron transport. The upregulation of respiratory metabolism is probably supported by the increased availability of photosynthates and increased energy demand for biosynthesis and stress-related processes under elevated CO2 and low pH. The upregulation of several chaperones resembling heat stress-induced changes in gene expression highlighted the positive role these proteins play in tolerance to intracellular acid stress in seagrasses. OA further modifies C. nodosa secondary metabolism, inducing the transcription of enzymes related to biosynthesis of carbon-based secondary compounds, in particular the synthesis of polyphenols and isoprenoid compounds that have a variety of biological functions including plant defence. By demonstrating which physiological processes are most sensitive to OA, this research provides a major advance in the understanding of seagrass metabolism in the context of altered seawater chemistry from global climate change. © 2017 John Wiley & Sons Ltd.

  19. Differential Expression Patterns in Chemosensory and Non-Chemosensory Tissues of Putative Chemosensory Genes Identified by Transcriptome Analysis of Insect Pest the Purple Stem Borer Sesamia inferens (Walker)

    PubMed Central

    Zhang, Ya-Nan; Jin, Jun-Yan; Jin, Rong; Xia, Yi-Han; Zhou, Jing-Jiang; Deng, Jian-Yu; Dong, Shuang-Lin

    2013-01-01

    Background: A large number of insect chemosensory genes from different gene subfamilies have been identified and annotated, but their functional diversity and complexity are largely unknown. A systemic examination of expression patterns in chemosensory organs could provide important information. Methodology/Principal Findings: We identified 92 putative chemosensory genes by analysing the transcriptome of the antennae and female sex pheromone gland of the purple stem borer Sesamia inferens, of which 87 are novel in this species, including 24 transcripts encoding odorant binding proteins (OBPs), 24 for chemosensory proteins (CSPs), 2 for sensory neuron membrane proteins (SNMPs), 39 for odorant receptors (ORs) and 3 for ionotropic receptors (IRs). The transcriptome analyses were validated and quantified with a detailed global expression profiling by Reverse Transcription-PCR for all 92 transcripts and by Quantitative Real Time RT-PCR for 16 selected ones. Among the chemosensory gene subfamilies, CSP transcripts are most widely and evenly expressed in different tissues and stages, OBP transcripts showed a clear antenna bias, and most OR transcripts are only detected in adult antennae. Our results also revealed that some OR transcripts, such as the transcripts of SNMP2 and 2 IRs, were expressed in non-chemosensory tissues, and some CSP transcripts showed antenna-biased expression. Furthermore, no chemosensory transcript is specific to the female sex pheromone gland and very few are found in the heads. Conclusion: Our study revealed that there are a large number of chemosensory genes expressed in S. inferens, and some of them displayed an unusual expression profile in non-chemosensory tissues. The identification of a large set of putative chemosensory genes of each subfamily from a single insect species, together with their different expression profiles, provides further information for understanding the functions of these chemosensory genes in S. inferens as well as other insects. PMID:23894529

  20. Differential expression patterns in chemosensory and non-chemosensory tissues of putative chemosensory genes identified by transcriptome analysis of insect pest the purple stem borer Sesamia inferens (Walker).

    PubMed

    Zhang, Ya-Nan; Jin, Jun-Yan; Jin, Rong; Xia, Yi-Han; Zhou, Jing-Jiang; Deng, Jian-Yu; Dong, Shuang-Lin

    2013-01-01

    A large number of insect chemosensory genes from different gene subfamilies have been identified and annotated, but their functional diversity and complexity are largely unknown. A systemic examination of expression patterns in chemosensory organs could provide important information. We identified 92 putative chemosensory genes by analysing the transcriptome of the antennae and female sex pheromone gland of the purple stem borer Sesamia inferens, of which 87 are novel in this species, including 24 transcripts encoding odorant binding proteins (OBPs), 24 for chemosensory proteins (CSPs), 2 for sensory neuron membrane proteins (SNMPs), 39 for odorant receptors (ORs) and 3 for ionotropic receptors (IRs). The transcriptome analyses were validated and quantified with a detailed global expression profiling by Reverse Transcription-PCR for all 92 transcripts and by Quantitative Real Time RT-PCR for 16 selected ones. Among the chemosensory gene subfamilies, CSP transcripts are most widely and evenly expressed in different tissues and stages, OBP transcripts showed a clear antenna bias, and most OR transcripts are only detected in adult antennae. Our results also revealed that some OR transcripts, such as the transcripts of SNMP2 and 2 IRs, were expressed in non-chemosensory tissues, and some CSP transcripts showed antenna-biased expression. Furthermore, no chemosensory transcript is specific to the female sex pheromone gland and very few are found in the heads. Our study revealed that there are a large number of chemosensory genes expressed in S. inferens, and some of them displayed an unusual expression profile in non-chemosensory tissues. The identification of a large set of putative chemosensory genes of each subfamily from a single insect species, together with their different expression profiles, provides further information for understanding the functions of these chemosensory genes in S. inferens as well as other insects.

  1. Fungal and host transcriptome analysis of pH-regulated genes during colonization of apple fruits by Penicillium expansum.

    PubMed

    Barad, Shiri; Sela, Noa; Kumar, Dilip; Kumar-Dubey, Amit; Glam-Matana, Nofar; Sherman, Amir; Prusky, Dov

    2016-05-04

    Penicillium expansum is a destructive phytopathogen that causes decay in deciduous fruits during postharvest handling and storage. During colonization the fungus secretes D-gluconic acid (GLA), which modulates environmental pH and regulates mycotoxin accumulation in colonized tissue. Until now no transcriptomic analysis has addressed the specific contribution of the pathogen's pH regulation to the P. expansum colonization process. For this purpose total RNA from the leading edge of P. expansum-colonized apple tissue of cv. 'Golden Delicious' and from fungal cultures grown under pH 4 or 7 were sequenced and their gene expression patterns were compared. We present a large-scale analysis of the transcriptome data of P. expansum and apple response to fungal colonization. The fungal analysis revealed nine different clusters of gene expression patterns that were divided among three major groups in which the colonized tissue showed, respectively: (i) differing transcript expression patterns between mycelial growth at pH 4 and pH 7; (ii) similar transcript expression patterns of mycelial growth at pH 4; and (iii) similar transcript expression patterns of mycelial growth at pH 7. Each group was functionally characterized in order to decipher genes that are important for pH regulation and also for colonization of apple fruits by Penicillium. Furthermore, comparison of gene expression of healthy apple tissue with that of colonized tissue showed that differentially expressed genes revealed up-regulation of the jasmonic acid and mevalonate pathways, and also down-regulation of the glycogen and starch biosynthesis pathways. Overall, we identified important genes and functionalities of P. expansum that were controlled by the environmental pH. Differential expression patterns of genes belonging to the same gene family suggest that genes were selectively activated according to their optimal environmental conditions (pH, in vitro or in vivo) to enable the fungus to cope with varying conditions and to make optimal use of available enzymes. Comparison between the activation of the colonized host's gene responses by alkalizing Colletotrichum gloeosporioides and acidifying P. expansum pathogens indicated similar gene response patterns, but stronger responses to P. expansum, suggesting the importance of acidification by P. expansum as a factor in its increased aggressiveness.

  2. APPLICATION OF DNA MICROARRAYS TO REPRODUCTIVE TOXICOLOGY AND THE DEVELOPMENT OF A TESTIS ARRAY

    EPA Science Inventory

    With the advent of sequence information for entire mammalian genomes, it is now possible to analyze gene expression and gene polymorphisms on a genomic scale. The primary tool for analysis of gene expression is the DNA microarray. We have used commercially available cDNA micro...

  3. BIOMONITORING THE TOXICOGENOMIC RESPONSE TO ENDOCRINE DISRUPTING CHEMICALS IN HUMANS, LABORATORY SPECIES AND WILDLIFE

    EPA Science Inventory

    With the advent of sequence information for entire eukaryotic genomes, it is now possible to analyze gene expression on a genomic scale. The primary tool for genomic analysis of gene expression is the gene microarray. We have used commercially available and custom cDNA microarray...

  4. NetMiner-an ensemble pipeline for building genome-wide and high-quality gene co-expression network using massive-scale RNA-seq samples.

    PubMed

    Yu, Hua; Jiao, Bingke; Lu, Lu; Wang, Pengfei; Chen, Shuangcheng; Liang, Chengzhi; Liu, Wei

    2018-01-01

    Accurately reconstructing gene co-expression networks is of great importance for uncovering the genetic architecture underlying complex and various phenotypes. The recent availability of high-throughput RNA-seq sequencing has made genome-wide detection and quantification of novel, rare and low-abundance transcripts practical. However, its potential merits in reconstructing gene co-expression networks have still not been well explored. Using massive-scale RNA-seq samples, we have designed an ensemble pipeline, called NetMiner, for building a genome-scale and high-quality Gene Co-expression Network (GCN) by integrating three frequently used inference algorithms. We constructed an RNA-seq-based GCN in one species of monocot rice. The quality of the network obtained by our method was verified and evaluated with curated gene functional association data sets, and it clearly outperformed each single method. In addition, the powerful capability of the network for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our proposed method to predict the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes and circular RNA (circRNA) genes. Our results provided a valuable and highly reliable data source to select key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at https://github.com/czllab/NetMiner.
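
    The idea of integrating several inference algorithms into one network can be illustrated with a small sketch. The abstract does not say which three algorithms NetMiner uses or how their outputs are combined, so everything below is an assumption made for illustration: each method is taken to return a score per gene pair (higher meaning stronger evidence), and the scores are merged by averaging ranks, a generic consensus scheme. The names rank_edges and average_rank_ensemble, the gene labels, and the scores are hypothetical.

```python
# Hypothetical sketch: consensus ranking of co-expression edges across methods.
# Rank averaging is an assumed combination rule, not NetMiner's documented one.

def rank_edges(scores):
    """Map each edge to its rank (1 = strongest); ties are broken by edge name."""
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: position + 1 for position, edge in enumerate(ordered)}


def average_rank_ensemble(score_tables):
    """Average the per-method ranks of every edge scored by all methods."""
    shared = set(score_tables[0])
    for table in score_tables[1:]:
        shared &= set(table)
    ranks = [rank_edges({e: t[e] for e in shared}) for t in score_tables]
    mean_rank = {e: sum(r[e] for r in ranks) / len(ranks) for e in shared}
    # Best consensus edge (lowest mean rank) comes first.
    return sorted(mean_rank.items(), key=lambda item: (item[1], item[0]))


# Made-up scores from three hypothetical methods on different scales.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.7, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}
method_c = {("g1", "g2"): 12.0, ("g1", "g3"): 3.0, ("g2", "g3"): 5.0}

consensus = average_rank_ensemble([method_a, method_b, method_c])
for edge, mean_rank in consensus:
    print(edge, round(mean_rank, 2))
```

    Working on ranks rather than raw scores lets methods with different score scales contribute equally, which is one common reason to prefer this kind of aggregation.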

  5. Variable DAXX gene methylation is a common feature of placental trophoblast differentiation, preeclampsia, and response to hypoxia.

    PubMed

    Novakovic, Boris; Evain-Brion, Danièle; Murthi, Padma; Fournier, Thiery; Saffery, Richard

    2017-06-01

    Placental functioning relies on the appropriate differentiation of progenitor villous cytotrophoblasts (CTBs) into extravillous cytotrophoblasts (EVCTs), including invasive EVCTs, and the multinucleated syncytiotrophoblast (ST) layer. This is accompanied by a general move away from a proliferative, immature phenotype. Genome-scale expression studies have provided valuable insight into genes that are associated with the shift to both an invasive EVCT and ST phenotype, whereas genome-scale DNA methylation analysis has shown that differentiation to ST involves widespread methylation shifts, which are counteracted by low oxygen. In the current study, we sought to identify DNA methylation variation that is associated with transition from CTB to ST in vitro and from a noninvasive to invasive EVCT phenotype after culture on Matrigel. Of the several hundred differentially methylated regions that were identified in each comparison, the majority showed a loss of methylation with differentiation. This included a large differentially methylated region (DMR) in the gene body of death domain-associated protein 6 (DAXX), which lost methylation during both CTB syncytialization to ST and EVCT differentiation to invasive EVCT. Comparison to publicly available methylation array data identified the same DMR as among the most consistently differentially methylated genes in placental samples from preeclampsia pregnancies. Of interest, in vitro culture of CTB or ST in low oxygen increases methylation in the same region, which correlates with delayed differentiation. Analysis of combined epigenomics signatures confirmed DAXX DMR as a likely regulatory element, and direct gene expression analysis identified a positive association between methylation at this site and DAXX expression levels. The widespread dynamic nature of DAXX methylation in association with trophoblast differentiation and placenta-associated pathologies is consistent with an important role for this gene in proper placental development and function. © FASEB.

  6. Gene selection and cancer type classification of diffuse large-B-cell lymphoma using a bivariate mixture model for two-species data.

    PubMed

    Su, Yuhua; Nielsen, Dahlia; Zhu, Lei; Richards, Kristy; Suter, Steven; Breen, Matthew; Motsinger-Reif, Alison; Osborne, Jason

    2013-01-05

    A bivariate mixture model utilizing information across two species was proposed to solve the fundamental problem of identifying differentially expressed genes in microarray experiments. The model utility was illustrated using a dog and human lymphoma data set prepared by a group of scientists in the College of Veterinary Medicine at North Carolina State University. A small number of genes were identified as being differentially expressed in both species and the human genes in this cluster serve as a good predictor for classifying diffuse large-B-cell lymphoma (DLBCL) patients into two subgroups, the germinal center B-cell-like diffuse large B-cell lymphoma and the activated B-cell-like diffuse large B-cell lymphoma. The number of human genes that were observed to be significantly differentially expressed (21) from the two-species analysis was very small compared to the number of human genes (190) identified with only one-species analysis (human data). The genes may be clinically relevant/important, as this small set achieved low misclassification rates of DLBCL subtypes. Additionally, the two subgroups defined by this cluster of human genes had significantly different survival functions, indicating that the stratification based on gene-expression profiling using the proposed mixture model provided improved insight into the clinical differences between the two cancer subtypes.

  7. Digital gene expression profiling of flax (Linum usitatissimum L.) stem peel identifies genes enriched in fiber-bearing phloem tissue.

    PubMed

    Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu

    2017-08-30

    To better understand the molecular mechanisms and gene expression characteristics associated with development of bast fiber cells within flax stem phloem, the gene expression profiles of flax stem peels and leaves were screened using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) have protein-coding annotation, were identified as phloem-enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression pattern of all nine genes determined by qRT-PCR agreed well with that obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to the metabolic process, catalytic activity and binding categories were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem-enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Nonviral vectors for cancer gene therapy: prospects for integrating vectors and combination therapies.

    PubMed

    Ohlfest, John R; Freese, Andrew B; Largaespada, David A

    2005-12-01

    Gene therapy has the potential to improve the clinical outcome of many cancers by transferring therapeutic genes into tumor cells or normal host tissue. Gene transfer into tumor cells or tumor-associated stroma is being employed to induce tumor cell death, stimulate anti-tumor immune response, inhibit angiogenesis, and control tumor cell growth. Viral vectors have been used to achieve this proof of principle in animal models and, in select cases, in human clinical trials. Nevertheless, there has been considerable interest in developing nonviral vectors for cancer gene therapy. Nonviral vectors are simpler, more amenable to large-scale manufacture, and potentially safer for clinical use. Nonviral vectors were once limited by low gene transfer efficiency and transient or steadily declining gene expression. However, recent improvements in plasmid-based vectors and delivery methods are showing promise in circumventing these obstacles. This article reviews the current status of nonviral cancer gene therapy, with an emphasis on combination strategies, long-term gene transfer using transposons and bacteriophage integrases, and future directions.

  9. CMIP: a software package capable of reconstructing genome-wide regulatory networks using gene expression data.

    PubMed

    Zheng, Guangyong; Xu, Yaochen; Zhang, Xiujun; Liu, Zhi-Ping; Wang, Zhuo; Chen, Luonan; Zhu, Xin-Guang

    2016-12-23

    A gene regulatory network (GRN) represents interactions of genes inside a cell or tissue, in which vertexes and edges stand for genes and their regulatory interactions, respectively. Reconstruction of gene regulatory networks, in particular genome-scale networks, is essential for comparative exploration of different species and mechanistic investigation of biological processes. Currently, most network inference methods are computationally intensive; they are usually effective for small-scale tasks (e.g., networks with a few hundred genes) but struggle to construct GRNs at genome scale. Here, we present a software package for gene regulatory network reconstruction at a genomic level, in which gene interaction is measured by the conditional mutual information measurement using a parallel computing framework (so the package is named CMIP). The package is a greatly improved implementation of our previous PCA-CMI algorithm. In CMIP, we provide not only an automatic threshold determination method but also an effective parallel computing framework for network inference. Performance tests on benchmark datasets show that the accuracy of CMIP is comparable to most current network inference methods. Moreover, runtime tests on synthetic datasets demonstrate that CMIP can handle large datasets, especially genome-wide datasets, within an acceptable time period. In addition, successful application on a real genomic dataset confirms the practical applicability of the package. This new software package provides a powerful tool for genomic network reconstruction to the biological community. The software can be accessed at http://www.picb.ac.cn/CMIP/.
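
    The conditional mutual information measurement mentioned above can be sketched for the simplest case. This is not CMIP's code: it assumes expression values are jointly Gaussian, conditions on a single third gene, and uses the standard identity that Gaussian (conditional) mutual information equals -0.5 * ln(1 - r^2), where r is the (partial) correlation. The function names pearson and gaussian_cmi and the simulated data are hypothetical; the package itself may handle larger conditioning sets, threshold selection, and parallel execution differently.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_cmi(x, y, z=None):
    """I(X;Y) if z is None, else I(X;Y|Z), in nats, under a Gaussian assumption.

    Assumes no pair of variables is perfectly correlated (|r| < 1).
    """
    r = pearson(x, y)
    if z is not None:
        r_xz, r_yz = pearson(x, z), pearson(y, z)
        # First-order partial correlation of x and y given z.
        r = (r - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return -0.5 * math.log(1.0 - r * r)


# Simulated example: gene z drives both x and y, which do not interact directly.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [v + rng.gauss(0, 1) for v in z]
y = [v + rng.gauss(0, 1) for v in z]

print("I(x;y)   =", gaussian_cmi(x, y))     # clearly above zero
print("I(x;y|z) =", gaussian_cmi(x, y, z))  # close to zero
```

    In this toy case the unconditional measure suggests an x-y link, while conditioning on z removes it. Pruning such indirect edges is the usual motivation for conditional measures in network inference.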

  10. Large scale aggregate microarray analysis reveals three distinct molecular subclasses of human preeclampsia.

    PubMed

    Leavey, Katherine; Bainbridge, Shannon A; Cox, Brian J

    2015-01-01

    Preeclampsia (PE) is a life-threatening hypertensive pathology of pregnancy affecting 3-5% of all pregnancies. To date, PE has no cure, early detection markers, or effective treatments short of the removal of what is thought to be the causative organ, the placenta, which may necessitate a preterm delivery. Additionally, numerous small placental microarray studies attempting to identify "PE-specific" genes have yielded inconsistent results. We therefore hypothesize that preeclampsia is a multifactorial disease encompassing several pathology subclasses, and that large cohort placental gene expression analysis will reveal these groups. To address our hypothesis, we utilized known bioinformatic methods to aggregate 7 microarray data sets across multiple platforms in order to generate a large data set of 173 patient samples, including 77 with preeclampsia. Unsupervised clustering of these patient samples revealed three distinct molecular subclasses of PE. This included a "canonical" PE subclass demonstrating elevated expression of known PE markers and genes associated with poor oxygenation and increased secretion, as well as two other subclasses potentially representing a poor maternal response to pregnancy and an immunological presentation of preeclampsia. Our analysis sheds new light on the heterogeneity of PE patients, and offers up additional avenues for future investigation. Hopefully, our subclassification of preeclampsia based on molecular diversity will finally lead to the development of robust diagnostics and patient-based treatments for this disorder.

  11. Using the Saccharomyces Genome Database (SGD) for analysis of genomic information

    PubMed Central

    Skrzypek, Marek S.; Hirschman, Jodi

    2011-01-01

    Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739

  12. Satellite DNA-based artificial chromosomes for use in gene therapy.

    PubMed

    Hadlaczky, G

    2001-04-01

    Satellite DNA-based artificial chromosomes (SATACs) can be made by induced de novo chromosome formation in cells of different mammalian species. These artificially generated accessory chromosomes are composed of predictable DNA sequences and they contain defined genetic information. Prototype human SATACs have been successfully constructed in different cell types from 'neutral' endogenous DNA sequences from the short arm of the human chromosome 15. SATACs have already passed a number of hurdles crucial to their further development as gene therapy vectors, including: large-scale purification; transfer of purified artificial chromosomes into different cells and embryos; generation of transgenic animals and germline transmission with purified SATACs; and the tissue-specific expression of a therapeutic gene from an artificial chromosome in the milk of transgenic animals.

  13. Expression of organophosphorus-degradation gene (opd) in aggregating and non-aggregating filamentous nitrogen-fixing cyanobacteria

    NASA Astrophysics Data System (ADS)

    Li, Qiong; Tang, Qing; Xu, Xudong; Gao, Hong

    2010-11-01

    Genetic engineering in filamentous N2-fixing cyanobacteria usually involves Anabaena sp. PCC 7120 and several other non-aggregating species. Mass culture and harvest of such species are more energy consuming relative to aggregating species. To establish a gene transfer system for aggregating species, we tested many species of Anabaena and Nostoc, and identified Nostoc muscorum FACHB244 as a species that can be genetically manipulated using the conjugative gene transfer system. To promote biodegradation of organophosphorus pollutants in aquatic environments, we introduced a plasmid containing the organophosphorus-degradation gene (opd) into Anabaena sp. PCC 7120 and Nostoc muscorum FACHB244 by conjugation. The opd gene was driven by a strong promoter, PpsbA. From both species, we obtained transgenic strains having organophosphorus-degradation activities. At 25°C, the whole-cell activities of the transgenic Anabaena and Nostoc strains were 0.163±0.001 and 0.289±0.042 unit/μg Chl a, respectively. However, most colonies resulting from the gene transfer showed no activity. PCR and DNA sequencing revealed deletions or rearrangements in the plasmid in some of the colonies. Expression of the green fluorescent protein gene from the same promoter in Anabaena sp. PCC 7120 showed similar results. These results suggest that there is the potential to promote the degradation of organophosphorus pollutants with transgenic cyanobacteria and that selection of high-expression transgenic colonies is important for genetic engineering of Anabaena and Nostoc species. For the first time, we established a gene transfer and expression system in an aggregating filamentous N2-fixing cyanobacterium. The genetic manipulation system of Nostoc muscorum FACHB244 could be utilized in the elimination of pollutants and large-scale production of valuable proteins or metabolites.

  14. The ANGULATA7 gene encodes a DnaJ-like zinc finger-domain protein involved in chloroplast function and leaf development in Arabidopsis.

    PubMed

    Muñoz-Nortes, Tamara; Pérez-Pérez, José Manuel; Ponce, María Rosa; Candela, Héctor; Micol, José Luis

    2017-03-01

    The characterization of mutants with altered leaf shape and pigmentation has previously allowed the identification of nuclear genes that encode plastid-localized proteins that perform essential functions in leaf growth and development. A large-scale screen previously allowed us to isolate ethyl methanesulfonate-induced mutants with small rosettes and pale green leaves with prominent marginal teeth, which were assigned to a phenotypic class that we dubbed Angulata. The molecular characterization of the 12 genes assigned to this phenotypic class should help us to advance our understanding of the still poorly understood relationship between chloroplast biogenesis and leaf morphogenesis. In this article, we report the phenotypic and molecular characterization of the angulata7-1 (anu7-1) mutant of Arabidopsis thaliana, which we found to be a hypomorphic allele of the EMB2737 gene, which was previously known only for its embryonic-lethal mutations. ANU7 encodes a plant-specific protein that contains a domain similar to the central cysteine-rich domain of DnaJ proteins. The observed genetic interaction of anu7-1 with a loss-of-function allele of GENOMES UNCOUPLED1 suggests that the anu7-1 mutation triggers a retrograde signal that leads to changes in the expression of many genes that normally function in the chloroplasts. Many such genes are expressed at higher levels in anu7-1 rosettes, with a significant overrepresentation of those required for the expression of plastid genome genes. Like in other mutants with altered expression of plastid-encoded genes, we found that anu7-1 exhibits defects in the arrangement of thylakoidal membranes, which appear locally unappressed. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  15. Ibogaine signals addiction genes and methamphetamine alteration of long-term potentiation.

    PubMed

    Onaivi, Emmanuel S; Ali, Syed F; Chirwa, Sanika S; Zwiller, Jean; Thiriet, Nathalie; Akinshola, B Emmanuel; Ishiguro, Hiroki

    2002-06-01

    The mapping of the human genetic code will enable us to identify potential gene products involved in human addictions and diseases that have hereditary components. Thus, large-scale, parallel gene-expression studies, made possible by advances in microarray technologies, have shown insights into the connection between specific genes, or sets of genes, and human diseases. The compulsive use of addictive substances despite adverse consequences continues to affect society, and the science underlying these addictions in general is intensively studied. Pharmacological treatment of drug and alcohol addiction has largely been disappointing, and new therapeutic targets and hypotheses are needed. As the usefulness of the pharmacotherapy of addiction has been limited, an emerging potential, yet controversial, therapeutic agent is the natural alkaloid ibogaine. We have continued to investigate programs of gene expression and the putative signaling molecules used by psychostimulants such as amphetamine in in vivo and in vitro models. Our work and that of others reveal that complex but defined signal transduction pathways are associated with psychostimulant administration and that there is broad-spectrum regulation of these signals by ibogaine. We report that the actions of methamphetamine were similar to those of cocaine, including the propensity to alter long-term potentiation (LTP) in the hippocampus of the rat brain. This action suggests that there may be a "threshold" beyond which the excessive brain stimulation that probably occurs with compulsive psychostimulant use results in the occlusion of LTP. The influence of ibogaine on immediate early genes (IEGs) and other candidate genes possibly regulated by psychostimulants and other abused substances requires further evaluation in compulsive use, reward, relapse, tolerance, craving and withdrawal reactions. It is therefore tempting to suggest that ibogaine signals addiction gene products.

  16. From genes to genomes: a new paradigm for studying fungal pathogenesis in Magnaporthe oryzae.

    PubMed

    Xu, Jin-Rong; Zhao, Xinhua; Dean, Ralph A

    2007-01-01

    Magnaporthe oryzae is the most destructive fungal pathogen of rice worldwide and because of its amenability to classical and molecular genetic manipulation, availability of a genome sequence, and other resources it has emerged as a leading model system to study host-pathogen interactions. This chapter reviews recent progress toward elucidation of the molecular basis of infection-related morphogenesis, host penetration, invasive growth, and host-pathogen interactions. Related information on genome analysis and genomic studies of plant infection processes is summarized under specific topics where appropriate. Particular emphasis is placed on the role of MAP kinase and cAMP signal transduction pathways and unique features in the genome such as repetitive sequences and expanded gene families. Emerging developments in functional genome analysis through large-scale insertional mutagenesis and gene expression profiling are detailed. The chapter concludes with new prospects in the area of systems biology, such as protein expression profiling, and highlighting remaining crucial information needed to fully appreciate host-pathogen interactions.

  17. Gene Identification of Pheromone Gland Genes Involved in Type II Sex Pheromone Biosynthesis and Transportation in Female Tea Pest Ectropis grisescens

    PubMed Central

    Li, Zhao-Qun; Ma, Long; Yin, Qian; Cai, Xiao-Ming; Luo, Zong-Xiu; Bian, Lei; Xin, Zhao-Jun; He, Peng; Chen, Zong-Mao

    2018-01-01

    Moths can biosynthesize sex pheromones in the female sex pheromone glands (PGs) and can distinguish species-specific sex pheromones using their antennae. However, the biosynthesis and transportation mechanism for Type II sex pheromone components has rarely been documented in moths. In this study, we constructed a massive PG transcriptome database (14.72 Gb) from a moth species, Ectropis grisescens, which uses type II sex pheromones and is a major tea pest in China. We further identified putative sex pheromone biosynthesis and transportation-related unigenes: 111 cytochrome P450 monooxygenases (CYPs), 25 odorant-binding proteins (OBPs), and 20 chemosensory proteins (CSPs). Tissue expression and phylogenetic tree analyses showed that one CYP (EgriCYP341-fragment3), one OBP (EgriOBP4), and one CSP (EgriCSP10) gene displayed an enriched expression in the PGs, and that EgriOBP2, 3, and 25 are clustered in the moth pheromone-binding protein clade. We considered these our candidate genes. Our results yielded large-scale PG sequence information for further functional studies. PMID:29317471

  18. Immunological metagene signatures derived from immunogenic cancer cell death associate with improved survival of patients with lung, breast or ovarian malignancies: A large-scale meta-analysis

    PubMed Central

    Garg, Abhishek D.; De Ruysscher, Dirk; Agostinis, Patrizia

    2016-01-01

    The emerging role of the cancer cell-immune cell interface in shaping tumorigenesis/anticancer immunotherapy has increased the need to identify prognostic biomarkers. Hence, our primary aim was to identify the immunogenic cell death (ICD)-derived metagene signatures in breast, lung and ovarian cancer that associate with improved patient survival. To this end, we analyzed the prognostic impact of differential gene-expression of 33 pre-clinically-validated ICD-parameters through a large-scale meta-analysis involving 3,983 patients (‘discovery’ dataset) across lung (1,432), breast (1,115) and ovarian (1,436) malignancies. The main results were also substantiated in ‘validation’ datasets consisting of 818 patients of the same cancer types (i.e. 285 breast/274 lung/259 ovarian). The ICD-associated parameters exhibited a highly-clustered and largely cancer type-specific prognostic impact. Interestingly, we delineated ICD-derived consensus-metagene signatures that exhibited a positive prognostic impact that was either cancer type-independent or specific. Importantly, most of these ICD-derived consensus-metagenes (acted as attractor-metagenes and thereby) ‘attracted’ highly co-expressing sets of genes or convergent-metagenes. These convergent-metagenes also exhibited positive prognostic impact in respective cancer types. Remarkably, we found that the cancer type-independent consensus-metagene acted as an ‘attractor’ for cancer-specific convergent-metagenes. This reaffirms that the immunological prognostic landscape of cancer tends to segregate between cancer-independent and cancer-type specific gene signatures. Moreover, this prognostic landscape was largely dominated by the classical T cell activity/infiltration/function-related biomarkers. Interestingly, each cancer type tended to associate with biomarkers representing a specific T cell activity or function rather than pan-T cell biomarkers. Thus, our analysis confirms that ICD can serve as a platform for discovery of novel prognostic metagenes. PMID:27057433

  19. Identification of host transcriptional networks showing concentration-dependent regulation by HPV16 E6 and E7 proteins in basal cervical squamous epithelial cells

    PubMed Central

    Smith, Stephen P.; Scarpini, Cinzia G.; Groves, Ian J.; Odle, Richard I.; Coleman, Nicholas

    2016-01-01

    Development of cervical squamous cell carcinoma requires increased expression of the major high-risk human-papillomavirus (HPV) oncogenes E6 and E7 in basal cervical epithelial cells. We used a systems biology approach to identify host transcriptional networks in such cells and study the concentration-dependent changes produced by HPV16-E6 and -E7 oncoproteins. We investigated sample sets derived from the W12 model of cervical neoplastic progression, for which high quality phenotype/genotype data were available. We defined a gene co-expression matrix containing a small number of highly-connected hub nodes that controlled large numbers of downstream genes (regulons), indicating the scale-free nature of host gene co-expression in W12. We identified a small number of ‘master regulators’ for which downstream effector genes were significantly associated with protein levels of HPV16 E6 (n = 7) or HPV16 E7 (n = 5). We validated our data by depleting E6/E7 in relevant cells and by functional analysis of selected genes in vitro. We conclude that the network of transcriptional interactions in HPV16-infected basal-type cervical epithelium is regulated in a concentration-dependent manner by E6/E7, via a limited number of central master-regulators. These effects are likely to be significant in cervical carcinogenesis, where there is competitive selection of cells with elevated expression of virus oncoproteins. PMID:27457222

  20. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    PubMed Central

    Li, Xia; Rao, Shaoqi; Jiang, Wei; Li, Chuanxing; Xiao, Yun; Guo, Zheng; Zhang, Qingpu; Wang, Lihong; Du, Lei; Li, Jing; Li, Li; Zhang, Tianwen; Wang, Qing K

    2006-01-01

    Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex gene regulations related to the development, aging and progressive pathogenesis of a complex disease where potential dependences between different experiment units might occur. PMID:16420705
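
    The idea of a delayed correlation between two genes, as described in the abstract above, can be illustrated with a minimal sketch. This is not the TdGRN toolbox: the use of Pearson correlation, the function names (pearson, delayed_correlation, best_lag) and the toy time courses are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def delayed_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] over the overlapping time points."""
    if len(regulator) != len(target):
        raise ValueError("time courses must have the same length")
    if lag < 0 or len(regulator) - lag < 2:
        raise ValueError("lag leaves fewer than two overlapping time points")
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Return the lag with the largest absolute correlation, plus all scores."""
    scores = {lag: delayed_correlation(regulator, target, lag)
              for lag in range(max_lag + 1)}
    return max(scores, key=lambda lag: abs(scores[lag])), scores


# Hypothetical toy data: the target repeats the regulator two time points later.
regulator = [1, 3, 2, 5, 4, 7, 6, 9]
target = [0, 0, 1, 3, 2, 5, 4, 7]
lag, scores = best_lag(regulator, target, max_lag=3)
print(lag, scores)
```

    Ranking by absolute value is a design choice of this sketch so that a delayed negative correlation (possible repression) is not ignored; the sign of the score at the chosen lag still distinguishes the two cases.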

  1. BIG: a large-scale data integration tool for renal physiology.

    PubMed

    Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A

    2016-10-01

    Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.

  2. Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing.

    PubMed

    Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping

    2015-08-01

    Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.

  3. Estimating mutual information using B-spline functions – an improved similarity measure for analysing gene expression data

    PubMed Central

    Daub, Carsten O; Steuer, Ralf; Selbig, Joachim; Kloska, Sebastian

    2004-01-01

    Background The information theoretic concept of mutual information provides a general framework to evaluate dependencies between variables. In the context of the clustering of genes with similar patterns of expression it has been suggested as a general quantity of similarity to extend commonly used linear measures. Since mutual information is defined in terms of discrete variables, its application to continuous data requires the use of binning procedures, which can lead to significant numerical errors for datasets of small or moderate size. Results In this work, we propose a method for the numerical estimation of mutual information from continuous data. We investigate the characteristic properties arising from the application of our algorithm and show that our approach outperforms commonly used algorithms: The significance, as a measure of the power of distinction from random correlation, is significantly increased. This concept is subsequently illustrated on two large-scale gene expression datasets and the results are compared to those obtained using other similarity measures. A C++ source code of our algorithm is available for non-commercial use from kloska@scienion.de upon request. Conclusion The utilisation of mutual information as similarity measure enables the detection of non-linear correlations in gene expression datasets. Frequently applied linear correlation measures, which are often used on an ad-hoc basis without further justification, are thereby extended. PMID:15339346
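
    As a point of reference for the abstract above, the plain binning estimate of mutual information that it describes as error-prone on small datasets can be sketched as follows. This is only the simple equal-width histogram baseline, not the B-spline estimator proposed in the paper; the function names and the choice of four bins are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def bin_index(value, lo, hi, n_bins):
    """Map a value in [lo, hi] to one of n_bins equal-width bins."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls into the last bin


def binned_mutual_information(x, y, n_bins=4):
    """Estimate mutual information (in bits) from two continuous samples by binning."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    bx = [bin_index(v, min(x), max(x), n_bins) for v in x]
    by = [bin_index(v, min(y), max(y), n_bins) for v in y]
    count_x = Counter(bx)
    count_y = Counter(by)
    count_xy = Counter(zip(bx, by))
    mi = 0.0
    for (i, j), c in count_xy.items():
        # p(i, j) * log2( p(i, j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[i] * count_y[j]))
    return mi


# Hypothetical expression profiles for two genes across eight samples.
gene_a = [0, 1, 2, 3, 4, 5, 6, 7]
gene_b = [0, 1, 2, 3, 4, 5, 6, 7]
print(binned_mutual_information(gene_a, gene_b, n_bins=4))
```

    With so few samples per bin, the estimate depends strongly on the number of bins and on where the bin edges fall, which is the kind of numerical error the abstract refers to.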

  4. Shaping skeletal growth by modular regulatory elements in the Bmp5 gene.

    PubMed

    Guenther, Catherine; Pantalena-Filho, Luiz; Kingsley, David M

    2008-12-01

    Cartilage and bone are formed into a remarkable range of shapes and sizes that underlie many anatomical adaptations to different lifestyles in vertebrates. Although the morphological blueprints for individual cartilage and bony structures must somehow be encoded in the genome, we currently know little about the detailed genomic mechanisms that direct precise growth patterns for particular bones. We have carried out large-scale enhancer surveys to identify the regulatory architecture controlling developmental expression of the mouse Bmp5 gene, which encodes a secreted signaling molecule required for normal morphology of specific skeletal features. Although Bmp5 is expressed in many skeletal precursors, different enhancers control expression in individual bones. Remarkably, we show here that different enhancers also exist for highly restricted spatial subdomains along the surface of individual skeletal structures, including ribs and nasal cartilages. Transgenic, null, and regulatory mutations confirm that these anatomy-specific sequences are sufficient to trigger local changes in skeletal morphology and are required for establishing normal growth rates on separate bone surfaces. Our findings suggest that individual bones are composite structures whose detailed growth patterns are built from many smaller lineage and gene expression domains. Individual enhancers in BMP genes provide a genomic mechanism for controlling precise growth domains in particular cartilages and bones, making it possible to separately regulate skeletal anatomy at highly specific locations in the body.

  5. Large-scale image-based profiling of single-cell phenotypes in arrayed CRISPR-Cas9 gene perturbation screens.

    PubMed

    de Groot, Reinoud; Lüthi, Joel; Lindsay, Helen; Holtackers, René; Pelkmans, Lucas

    2018-01-23

    High-content imaging using automated microscopy and computer vision allows multivariate profiling of single-cell phenotypes. Here, we present methods for the application of the CRISPR-Cas9 system in large-scale, image-based, gene perturbation experiments. We show that CRISPR-Cas9-mediated gene perturbation can be achieved in human tissue culture cells in a timeframe that is compatible with image-based phenotyping. We developed a pipeline to construct a large-scale arrayed library of 2,281 sequence-verified CRISPR-Cas9 targeting plasmids and profiled this library for genes affecting cellular morphology and the subcellular localization of components of the nuclear pore complex (NPC). We conceived a machine-learning method that harnesses genetic heterogeneity to score gene perturbations and identify phenotypically perturbed cells for in-depth characterization of gene perturbation effects. This approach enables genome-scale image-based multivariate gene perturbation profiling using CRISPR-Cas9. © 2018 The Authors. Published under the terms of the CC BY 4.0 license.

  6. Transcriptome Profiling of Lotus japonicus Roots During Arbuscular Mycorrhiza Development and Comparison with that of Nodulation

    PubMed Central

    Deguchi, Yuichi; Banba, Mari; Shimoda, Yoshikazu; Chechetka, Svetlana A.; Suzuri, Ryota; Okusako, Yasuhiro; Ooki, Yasuhiro; Toyokura, Koichi; Suzuki, Akihiro; Uchiumi, Toshiki; Higashi, Shiro; Abe, Mikiko; Kouchi, Hiroshi; Izui, Katsura; Hata, Shingo

    2007-01-01

    To better understand the molecular responses of plants to arbuscular mycorrhizal (AM) fungi, we analyzed the differential gene expression patterns of Lotus japonicus, a model legume, with the aid of a large-scale cDNA macroarray. Experiments were carried out considering the effects of contaminating microorganisms in the soil inoculants. When the colonization by AM fungi, i.e. Glomus mosseae and Gigaspora margarita, was well established, four cysteine protease genes were induced. In situ hybridization revealed that these cysteine protease genes were specifically expressed in arbuscule-containing inner cortical cells of AM roots. On the other hand, phenylpropanoid biosynthesis-related genes for phenylalanine ammonia-lyase (PAL), chalcone synthase, etc. were repressed in the later stage, although they were moderately up-regulated on the initial association with the AM fungus. Real-time RT–PCR experiments supported the array experiments. To further confirm the characteristic expression, a PAL promoter was fused with a reporter gene and introduced into L. japonicus, and then the transformants were grown with a commercial inoculum of G. mosseae. The reporter activity was augmented throughout the roots due to the presence of contaminating microorganisms in the inoculum. Interestingly, G. mosseae only colonized where the reporter activity was low. Comparison of the transcriptome profiles of AM roots and nitrogen-fixing root nodules formed with Mesorhizobium loti indicated that the PAL genes and other phenylpropanoid biosynthesis-related genes were similarly repressed in the two organs. PMID:17634281

  7. Gene expression profile of the plant pathogen Xylella fastidiosa during biofilm formation in vitro.

    PubMed

    de Souza, Alessandra A; Takita, Marco A; Coletta-Filho, Helvécio D; Caldana, Camila; Yanai, Giane M; Muto, Nair H; de Oliveira, Regina C; Nunes, Luiz R; Machado, Marcos A

    2004-08-15

    A biofilm is a community of microorganisms attached to a solid surface. Cells within biofilms differ from planktonic cells, showing higher resistance to biocides, detergent, antibiotic treatments and host defense responses. Even though there are a number of gene expression studies in bacterial biofilm formation, limited information is available concerning plant pathogens. It was previously demonstrated that the plant pathogen Xylella fastidiosa could grow as a biofilm, a possibly important factor for its pathogenicity. In this study we utilized analysis of microarrays to specifically identify genes expressed in X. fastidiosa cells growing in a biofilm, when compared to planktonic cells. About half of the differentially expressed genes encode hypothetical proteins, reflecting the large number of ORFs with unknown functions in bacterial genomes. However, under the biofilm condition we observed an increase in the expression of some housekeeping genes responsible for metabolic functions. We also found a large number of genes from the pXF51 plasmid being differentially expressed. Some of the overexpressed genes in the biofilm condition encode proteins involved in attachment to surfaces. Other genes possibly confer advantages to the bacterium in the environment that it colonizes. This study demonstrates that the gene expression in the biofilm growth condition of the plant pathogen X. fastidiosa is quite similar to other characterized systems.

  8. Comprehensive analysis of area-specific and time-dependent changes in gene expression in the motor cortex of macaque monkeys during recovery from spinal cord injury.

    PubMed

    Higo, Noriyuki; Sato, Akira; Yamamoto, Tatsuya; Oishi, Takao; Nishimura, Yukio; Murata, Yumi; Onoe, Hirotaka; Isa, Tadashi; Kojima, Toshio

    2018-05-01

    The present study aimed to assess the molecular bases of cortical compensatory mechanisms following spinal cord injury in primates. To accomplish this, comprehensive changes in gene expression were investigated in the bilateral primary motor cortex (M1), dorsal premotor cortex (PMd), and ventral premotor cortex (PMv) after a unilateral lesion of the lateral corticospinal tract (l-CST). At 2 weeks after the lesion, a large number of genes exhibited altered expression levels in the contralesional M1, which is directly linked to the lesioned l-CST. Gene ontology and network analyses indicated that these changes in gene expression are involved in the atrophy and plasticity changes observed in neurons. Orchestrated gene expression changes were present when behavioral recovery was attained 3 months after the lesion, particularly among the bilateral premotor areas, and a large number of these genes are involved in plasticity. Moreover, several genes abundantly expressed in M1 of intact monkeys were upregulated in both the PMd and PMv after the l-CST lesion. These area-specific and time-dependent changes in gene expression may underlie the molecular mechanisms of functional recovery following a lesion of the l-CST. © 2018 Wiley Periodicals, Inc.

  9. Explaining human uniqueness: genome interactions with environment, behaviour and culture.

    PubMed

    Varki, Ajit; Geschwind, Daniel H; Eichler, Evan E

    2008-10-01

    What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, 'anthropogeny' (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any 'genes versus environment' dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture - perhaps relaxing allowable thresholds for large-scale genomic diversity.

  10. Explaining human uniqueness: genome interactions with environment, behaviour and culture

    PubMed Central

    Varki, Ajit; Geschwind, Daniel H.; Eichler, Evan E.

    2009-01-01

    What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, ‘anthropogeny’ (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any ‘genes versus environment’ dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture — perhaps relaxing allowable thresholds for large-scale genomic diversity. PMID:18802414

  11. Fermentative production of l-galactonate by using recombinant Saccharomyces cerevisiae containing the endogenous galacturonate reductase gene from Cryptococcus diffluens.

    PubMed

    Matsubara, Takeo; Hamada, Shohei; Wakabayashi, Ayaka; Kishida, Masao

    2016-11-01

    The GAR1 gene, encoding d-galacturonate reductase in Cryptococcus diffluens, was isolated, and the GAR1-expression plasmid was constructed by insertion of GAR1 downstream of the yeast constitutive promoter in the yeast-integrating vector. Recombinant Saccharomyces cerevisiae expressing C. diffluens d-galacturonate reductase from a genome-integrated copy of the gene was cultured for use in the conversion of d-galacturonic acid to l-galactonic acid. The optimum conditions for l-galactonic acid production were determined in terms of the initial concentration of d-galacturonic acid, fermentation pH, and mixed sugars. The following conditions yielded high efficiency in the conversion of d-galacturonic acid to l-galactonic acid in large-scale cultures: 0.1% initial d-galacturonic acid concentration, pH 3.5, and glucose as additional sugar. Aerobic conditions were necessary for the conversion of d-galacturonic acid. Subculturing the recombinant strain did not decrease the d-galacturonic acid conversion rate, even after ten generations. In scaled-up cultures, the conversion rate of d-galacturonic acid to l-galactonic acid increased. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  12. Normalization of RNA-seq data using factor analysis of control genes or samples

    PubMed Central

    Risso, Davide; Ngai, John; Speed, Terence P.; Dudoit, Sandrine

    2015-01-01

    Normalization of RNA-seq data has proven essential to ensure accurate inference of expression levels. Here we show that usual normalization approaches mostly account for sequencing depth and fail to correct for library preparation and other more-complex unwanted effects. We evaluate the performance of the External RNA Control Consortium (ERCC) spike-in controls and investigate the possibility of using them directly for normalization. We show that the spike-ins are not reliable enough to be used in standard global-scaling or regression-based normalization procedures. We propose a normalization strategy, remove unwanted variation (RUV), that adjusts for nuisance technical effects by performing factor analysis on suitable sets of control genes (e.g., ERCC spike-ins) or samples (e.g., replicate libraries). Our approach leads to more-accurate estimates of expression fold-changes and tests of differential expression compared to state-of-the-art normalization methods. In particular, RUV promises to be valuable for large collaborative projects involving multiple labs, technicians, and/or platforms. PMID:25150836

  13. MPIGeneNet: Parallel Calculation of Gene Co-Expression Networks on Multicore Clusters.

    PubMed

    Gonzalez-Dominguez, Jorge; Martin, Maria J

    2017-10-10

    In this work we present MPIGeneNet, a parallel tool that applies Pearson's correlation and Random Matrix Theory to construct gene co-expression networks. It is based on the state-of-the-art sequential tool RMTGeneNet, which provides networks with high robustness and sensitivity at the expense of relatively long runtimes for large-scale input datasets. MPIGeneNet returns the same results as RMTGeneNet but improves the memory management, reduces the I/O cost, and accelerates the two most computationally demanding steps of co-expression network construction by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on two different systems using three typical input datasets shows that MPIGeneNet is significantly faster than RMTGeneNet. As an example, our tool is up to 175.41 times faster on a cluster with eight nodes, each one containing two 12-core Intel Haswell processors. Source code of MPIGeneNet, as well as a reference manual, are available at https://sourceforge.net/projects/mpigenenet/.
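
    The Pearson-correlation step described above can be illustrated with a small sequential sketch. This is not MPIGeneNet's or RMTGeneNet's code: the function names, the toy data, and the fixed cutoff of 0.9 are assumptions made for illustration, and the fixed cutoff stands in for the Random Matrix Theory threshold selection that the tools actually perform.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_x * var_y)

def coexpression_edges(expression, threshold=0.9):
    """Return (gene_a, gene_b, r) for every gene pair with |r| >= threshold.

    `threshold` is a hypothetical fixed cutoff, not an RMT-derived one.
    """
    genes = sorted(expression)
    edges = []
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            r = pearson(expression[a], expression[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges

# Toy data: three hypothetical genes measured in five samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],  # proportional to geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to geneA
}
edges = coexpression_edges(toy, threshold=0.9)
```

    The all-pairs loop grows quadratically with the number of genes, which is why this correlation step is a natural candidate for distribution across cluster nodes.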

  14. Gene length as a biological timer to establish temporal transcriptional regulation

    PubMed Central

    Kirkconnell, Killeen S.; Magnuson, Brian; Paulsen, Michelle T.; Lu, Brian; Bedi, Karan; Ljungman, Mats

    2017-01-01

    Transcriptional timing is inherently influenced by gene length, thus providing a mechanism for temporal regulation of gene expression. While gene size has been shown to be important for the expression timing of specific genes during early development, whether it plays a role in the timing of other global gene expression programs has not been extensively explored. Here, we investigate the role of gene length during the early transcriptional response of human fibroblasts to serum stimulation. Using the nascent sequencing techniques Bru-seq and BruUV-seq, we identified immediate genome-wide transcriptional changes following serum stimulation that were linked to rapid activation of enhancer elements. We identified 873 significantly induced and 209 significantly repressed genes. Variations in gene size allowed for a large group of genes to be simultaneously activated but produce full-length RNAs at different times. The median length of the group of serum-induced genes was significantly larger than the median length of all expressed genes, housekeeping genes, and serum-repressed genes. These gene length relationships were also observed in corresponding mouse orthologs, suggesting that relative gene size is evolutionarily conserved. The sizes of transcription factor and microRNA genes immediately induced after serum stimulation varied dramatically, setting up a cascade mechanism for temporal expression arising from a single activation event. The retention and expansion of large intronic sequences during evolution have likely played important roles in fine-tuning the temporal expression of target genes in various cellular response programs. PMID:28055303
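
    The timing argument can be made concrete with simple arithmetic: if genes are activated together and transcribed at a constant rate, the time to the first full-length RNA is gene length divided by elongation rate. The sketch below assumes a constant rate of 2,000 bp per minute and uses invented gene names and lengths; none of these values come from the abstract.

```python
def minutes_to_full_length(gene_length_bp, elongation_rate_bp_per_min=2000.0):
    """Time for a polymerase to traverse a gene at a constant elongation rate.

    The default rate is an illustrative assumption, not a value from the study.
    """
    if elongation_rate_bp_per_min <= 0:
        raise ValueError("elongation rate must be positive")
    return gene_length_bp / elongation_rate_bp_per_min

# Hypothetical genes activated by the same stimulus at time zero.
gene_lengths_bp = {
    "short_gene": 10_000,
    "medium_gene": 100_000,
    "long_gene": 1_000_000,
}

completion_min = {
    name: minutes_to_full_length(length)
    for name, length in gene_lengths_bp.items()
}

# Genes ordered by when their first full-length transcript appears.
order = sorted(completion_min, key=completion_min.get)
```

    Under these assumptions a single activation event yields full-length transcripts minutes to hours apart, which is the cascade behaviour the abstract describes.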

  15. Genome-Wide Transcriptome Analyses of Silicon Metabolism in Phaeodactylum tricornutum Reveal the Multilevel Regulation of Silicic Acid Transporters

    PubMed Central

    Sapriel, Guillaume; Quinet, Michelle; Heijde, Marc; Jourdren, Laurent; Tanty, Véronique; Luo, Guangzuo; Le Crom, Stéphane; Lopez, Pascal Jean

    2009-01-01

    Background: Diatoms are largely responsible for production of biogenic silica in the global ocean. However, in surface seawater, Si(OH)4 can be a major limiting factor for diatom productivity. Analyzing at the global scale the gene networks involved in Si transport and metabolism is critical in order to elucidate Si biomineralization, and to understand diatoms' contribution to biogeochemical cycles. Methodology/Principal Findings: Using whole genome expression analyses we evaluated the transcriptional response to Si availability for the model species Phaeodactylum tricornutum. Among the differentially regulated genes we found genes involved in glutamine-nitrogen pathways, encoding putative extracellular matrix components, or involved in iron regulation. Some of these compounds may be good candidates for intracellular intermediates involved in silicic acid storage and/or intracellular transport, which are very important processes that remain mysterious in diatoms. Expression analyses and localization studies gave the first picture of the spatial distribution of a silicic acid transporter in a diatom model species, and support the existence of transcriptional and post-transcriptional regulations. Conclusions/Significance: Our global analyses revealed that about one fourth of the differentially expressed genes are organized in clusters, underlining a possible evolution of the P. tricornutum genome, and perhaps those of other pennate diatoms, toward a better optimization of its response to variable environmental stimuli. High fitness and adaptation of diatoms to various Si levels in marine environments might arise in part by global regulations from gene (expression level) to genomic (organization in clusters, dosage compensation by gene duplication), and by post-transcriptional regulation and spatial distribution of SIT proteins. PMID:19829693

  16. Macrogenomic engineering via modulation of the scaling of chromatin packing density.

    PubMed

    Almassalha, Luay M; Bauer, Greta M; Wu, Wenli; Cherkezyan, Lusik; Zhang, Di; Kendra, Alexis; Gladstein, Scott; Chandler, John E; VanDerway, David; Seagle, Brandon-Luke L; Ugolkov, Andrey; Billadeau, Daniel D; O'Halloran, Thomas V; Mazar, Andrew P; Roy, Hemant K; Szleifer, Igal; Shahabi, Shohreh; Backman, Vadim

    2017-11-01

    Many human diseases result from the dysregulation of the complex interactions between tens to thousands of genes. However, approaches for the transcriptional modulation of many genes simultaneously in a predictive manner are lacking. Here, through the combination of simulations, systems modelling and in vitro experiments, we provide a physical regulatory framework based on chromatin packing-density heterogeneity for modulating the genomic information space. Because transcriptional interactions are essentially chemical reactions, they depend largely on the local physical nanoenvironment. We show that the regulation of the chromatin nanoenvironment allows for the predictable modulation of global patterns in gene expression. In particular, we show that the rational modulation of chromatin density fluctuations can lead to a decrease in global transcriptional activity and intercellular transcriptional heterogeneity in cancer cells during chemotherapeutic responses to achieve near-complete cancer cell killing in vitro. Our findings represent a 'macrogenomic engineering' approach to modulating the physical structure of chromatin for whole-scale transcriptional modulation.

  17. Rapid and efficient cDNA library screening by self-ligation of inverse PCR products (SLIP).

    PubMed

    Hoskins, Roger A; Stapleton, Mark; George, Reed A; Yu, Charles; Wan, Kenneth H; Carlson, Joseph W; Celniker, Susan E

    2005-12-02

    cDNA cloning is a central technology in molecular biology. cDNA sequences are used to determine mRNA transcript structures, including splice junctions, open reading frames (ORFs) and 5'- and 3'-untranslated regions (UTRs). cDNA clones are valuable reagents for functional studies of genes and proteins. Expressed Sequence Tag (EST) sequencing is the method of choice for recovering cDNAs representing many of the transcripts encoded in a eukaryotic genome. However, EST sequencing samples a cDNA library at random, and it recovers transcripts with low expression levels inefficiently. We describe a PCR-based method for directed screening of plasmid cDNA libraries. We demonstrate its utility in a screen of libraries used in our Drosophila EST projects for 153 transcription factor genes that were not represented by full-length cDNA clones in our Drosophila Gene Collection. We recovered high-quality, full-length cDNAs for 72 genes and variously compromised clones for an additional 32 genes. The method can be used at any scale, from the isolation of cDNA clones for a particular gene of interest, to the improvement of large gene collections in model organisms and the human. Finally, we discuss the relative merits of directed cDNA library screening and RT-PCR approaches.

  18. Structural covariance networks are coupled to expression of genes enriched in supragranular layers of the human cortex.

    PubMed

    Romero-Garcia, Rafael; Whitaker, Kirstie J; Váša, František; Seidlitz, Jakob; Shinn, Maxwell; Fonagy, Peter; Dolan, Raymond J; Jones, Peter B; Goodyer, Ian M; Bullmore, Edward T; Vértes, Petra E

    2018-05-01

    Complex network topology is characteristic of many biological systems, including anatomical and functional brain networks (connectomes). Here, we first constructed a structural covariance network from MRI measures of cortical thickness on 296 healthy volunteers, aged 14-24 years. Next, we designed a new algorithm for matching sample locations from the Allen Brain Atlas to the nodes of the SCN. Subsequently, we used this to define transcriptomic brain networks by estimating gene co-expression between pairs of cortical regions. Finally, we explored the hypothesis that transcriptional networks and structural MRI connectomes are coupled. A transcriptional brain network (TBN) and a structural covariance network (SCN) were correlated across connection weights and showed qualitatively similar complex topological properties: assortativity, small-worldness, modularity, and a rich-club. In both networks, the weight of an edge was inversely related to the anatomical (Euclidean) distance between regions. There were differences between networks in degree and distance distributions: the transcriptional network had a less fat-tailed degree distribution and a less positively skewed distance distribution than the SCN. However, cortical areas connected to each other within modules of the SCN had significantly higher levels of whole genome co-expression than expected by chance. Nodes connected in the SCN had especially high levels of expression and co-expression of a human supragranular enriched (HSE) gene set that has been specifically located to supragranular layers of human cerebral cortex and is known to be important for large-scale, long-distance cortico-cortical connectivity. This coupling of brain transcriptome and connectome topologies was largely but not entirely accounted for by the common constraint of physical distance on both networks. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  19. MAPK Signaling Pathway Alters Expression of Midgut ALP and ABCC Genes and Causes Resistance to Bacillus thuringiensis Cry1Ac Toxin in Diamondback Moth

    PubMed Central

    Wu, Qingjun; Wang, Shaoli; Xie, Wen; Zhu, Xun; Baxter, Simon W.; Zhou, Xuguo; Jurat-Fuentes, Juan Luis; Zhang, Youjun

    2015-01-01

    Insecticidal crystal toxins derived from the soil bacterium Bacillus thuringiensis (Bt) are widely used as biopesticide sprays or expressed in transgenic crops to control insect pests. However, large-scale use of Bt has led to field-evolved resistance in several lepidopteran pests. Resistance to Bt Cry1Ac toxin in the diamondback moth, Plutella xylostella (L.), was previously mapped to a multigenic resistance locus (BtR-1). Here, we assembled the 3.15 Mb BtR-1 locus and found that high-level resistance to Cry1Ac and Bt biopesticide in four independent P. xylostella strains was in every case associated with differential expression of a midgut membrane-bound alkaline phosphatase (ALP) outside this locus and a suite of ATP-binding cassette transporter subfamily C (ABCC) genes inside this locus. The interplay between these resistance genes is controlled by a previously uncharacterized trans-regulatory mechanism via the mitogen-activated protein kinase (MAPK) signaling pathway. Molecular, biochemical, and functional analyses have established ALP as a functional Cry1Ac receptor. Phenotypic association experiments revealed that the recessive Cry1Ac resistance was tightly linked to down-regulation of ALP, ABCC2 and ABCC3, whereas it was not linked to up-regulation of ABCC1. Silencing of ABCC2 and ABCC3 in susceptible larvae reduced their susceptibility to Cry1Ac but did not affect the expression of ALP, whereas suppression of MAP4K4, a constitutively transcriptionally-activated MAPK upstream gene within the BtR-1 locus, led to a transient recovery of gene expression, thereby restoring the susceptibility in resistant larvae. These results highlight a crucial role for ALP and ABCC genes in field-evolved resistance to Cry1Ac and reveal a novel trans-regulatory signaling mechanism responsible for modulating the expression of these pivotal genes in P. xylostella. PMID:25875245

  20. Evaluation of Two Outlier-Detection-Based Methods for Detecting Tissue-Selective Genes from Microarray Data

    PubMed Central

    Kadota, Koji; Konishi, Tomokazu; Shimizu, Kentaro

    2007-01-01

    Large-scale expression profiling using DNA microarrays enables identification of tissue-selective genes for which expression is considerably higher and/or lower in some tissues than in others. Among numerous possible methods, only two outlier-detection-based methods (an AIC-based method and Sprent’s non-parametric method) can treat equally various types of selective patterns, but they produce substantially different results. We investigated the performance of these two methods for different parameter settings and for a reduced number of samples. We focused on their ability to detect selective expression patterns robustly. We applied them to public microarray data collected from 36 normal human tissue samples and analyzed the effects of both changing the parameter settings and reducing the number of samples. The AIC-based method was more robust in both cases. The findings confirm that the use of the AIC-based method in the recently proposed ROKU method for detecting tissue-selective expression patterns is correct and that Sprent’s method is not suitable for ROKU. PMID:19936074
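
    To show what an outlier-detection view of tissue selectivity looks like, the sketch below flags tissues whose expression lies far from the median, measured in units of the median absolute deviation (MAD). This is one common statement of Sprent's non-parametric rule and is an assumption here: the abstract gives neither the formula nor the cutoff, so the default threshold of 5, the function name, and the toy values are illustrative only. The AIC-based method is not shown.

```python
from statistics import median

def sprent_outliers(values, threshold=5.0):
    """Return indices of values more than `threshold` MADs from the median.

    Assumed rule: |x - median| / MAD > threshold, with a hypothetical
    default threshold of 5.
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)
    if mad == 0:
        # More than half the values equal the median, so any departure
        # from it is treated as an outlier.
        return [i for i, d in enumerate(deviations) if d > 0]
    return [i for i, d in enumerate(deviations) if d / mad > threshold]

# Toy profile: one hypothetical gene measured across seven tissues.
profile = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 95.0]
selective_tissues = sprent_outliers(profile)
```

    Because the rule is symmetric around the median, it flags tissues with unusually low expression as well as unusually high expression, matching the "higher and/or lower" patterns mentioned in the abstract.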
